Difference between revisions of "JAR v.IS Project Findings"
Line 117: | Line 117: | ||
<p>We transform the variables to make them more suitable for regression analysis. We apply both a square root transformation and a natural logarithm transformation to all response and explanatory variables whose distributions are not normal, to reduce skewness and yield a more normal distribution.</p><br> | <p>We transform the variables to make them more suitable for regression analysis. We apply both a square root transformation and a natural logarithm transformation to all response and explanatory variables whose distributions are not normal, to reduce skewness and yield a more normal distribution.</p><br> | ||
− | [[File:Article_transformation.png|center| | + | [[File:Article_transformation.png|700px|center]] |
+ | {|style="width:100%;vertical-align:top;margin-top:20px;" | ||
+ | |- | ||
+ | |style="vertical-align:top;width:30%;" | <div style="background: #ffffff; text-align:center; line-height: wrap_content; text-align: center;font-size:12px">Transforming the Response Variables and removing the outliers</div> | ||
+ | |||
<br> | <br> | ||
<p>The outliers for the explanatory variables are judged by the independent variable distributions as well as the scatterplots of the response variable against the explanatory variables. We remove the following data points (as circled in the figure) as outliers. </p><br> | <p>The outliers for the explanatory variables are judged by the independent variable distributions as well as the scatterplots of the response variable against the explanatory variables. We remove the following data points (as circled in the figure) as outliers. </p><br> | ||
− | [[File:outlier.png|center| | + | [[File:outlier.png|700px|center]] |
+ | {|style="width:100%;vertical-align:top;margin-top:20px;" | ||
+ | |- | ||
+ | |style="vertical-align:top;width:30%;" | <div style="background: #ffffff; text-align:center; line-height: wrap_content; text-align: center;font-size:12px">Transforming the Explanatory Variables and removing the outliers</div> | ||
+ | |||
<br> | <br> | ||
Line 131: | Line 139: | ||
− | [[File:Bivfit.png|center| | + | [[File:Bivfit.png|700px|center]] |
+ | {|style="width:100%;vertical-align:top;margin-top:20px;" | ||
+ | |- | ||
+ | |style="vertical-align:top;width:30%;" | <div style="background: #ffffff; text-align:center; line-height: wrap_content; text-align: center;font-size:12px">Bivariate fit of difficult words count. We select the SQRT transformation instead of the Ln transformation.</div> | ||
<p> | <p> | ||
Line 144: | Line 155: | ||
<p>We also ran bivariate fit against all the 18 numerical explanatory variables to test for multicollinearity. The figure below shows the bivariate correlation scatterplot.</p> | <p>We also ran bivariate fit against all the 18 numerical explanatory variables to test for multicollinearity. The figure below shows the bivariate correlation scatterplot.</p> | ||
− | [[File:Bivfitscattermatrix.png|center| | + | [[File:Bivfitscattermatrix.png|700px|center]] |
+ | {|style="width:100%;vertical-align:top;margin-top:20px;" | ||
+ | |- | ||
+ | |style="vertical-align:top;width:30%;" | <div style="background: #ffffff; text-align:center; line-height: wrap_content; text-align: center;font-size:12px">Bivariate correlation scatterplot matrix for all 18 numerical variables for the article model</div> | ||
<p>Using this scatterplot together with the bivariate correlation matrix, we eliminated 8 variables that are highly correlated. We ran Standard Least Squares regression on continuous numerical variables to verify the absence of multicollinearity in our remaining variables.</p> | <p>Using this scatterplot together with the bivariate correlation matrix, we eliminated 8 variables that are highly correlated. We ran Standard Least Squares regression on continuous numerical variables to verify the absence of multicollinearity in our remaining variables.</p> | ||
− | [[File:vifparamest.png|center| | + | [[File:vifparamest.png|700px|center]] |
+ | {|style="width:100%;vertical-align:top;margin-top:20px;" | ||
+ | |- | ||
+ | |style="vertical-align:top;width:30%;" | <div style="background: #ffffff; text-align:center; line-height: wrap_content; text-align: center;font-size:12px">Final numerical explanatory variables estimate with VIF statistics</div> | ||
<p>As a result, we have narrowed down our final list of continuous numerical explanatory variables to explain the variation of our response variables for the article regression model, in preparation for the next step, the stepwise regression.</p> | <p>As a result, we have narrowed down our final list of continuous numerical explanatory variables to explain the variation of our response variables for the article regression model, in preparation for the next step, the stepwise regression.</p> | ||
Line 159: | Line 176: | ||
<br> | <br> | ||
− | [[File:artregeqn.png|center| | + | [[File:artregeqn.png|700px|center]] |
+ | {|style="width:100%;vertical-align:top;margin-top:20px;" | ||
+ | |- | ||
+ | |style="vertical-align:top;width:30%;" | <div style="background: #ffffff; text-align:center; line-height: wrap_content; text-align: center;font-size:12px">Article Regression equation for Ln(Total engagement)</div> | ||
{| style="width:100%; vertical-align:top; margin-top:5px;" | {| style="width:100%; vertical-align:top; margin-top:5px;" |
Revision as of 19:52, 23 April 2017
Multiple Linear Regression Model
What makes a good Facebook post?
This section outlines the explanatory model built on the article dataset from Facebook Insights, supplemented with our crawled variables to form a complete article dataset.
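The two preparation steps described in the revision above (choosing between the square root and natural logarithm transformations, and checking the remaining variables for multicollinearity with VIF statistics) can be sketched in Python. This is only an illustration under stated assumptions, not the analysis the authors ran: the source does not name their tool, the data here is synthetic, and the helper names `skewness`, `pick_transform` and `vif` are hypothetical. The sketch compares transformations by skewness alone, whereas the text also relies on bivariate fits and visual inspection.

```python
import numpy as np


def skewness(x):
    """Sample skewness (biased Fisher-Pearson coefficient)."""
    x = np.asarray(x, dtype=float)
    centered = x - x.mean()
    return (centered ** 3).mean() / (centered ** 2).mean() ** 1.5


def pick_transform(x):
    """Return the candidate whose skewness is closest to zero, plus all skews.

    Assumes strictly positive values so that both sqrt and ln are defined.
    """
    x = np.asarray(x, dtype=float)
    candidates = {
        "raw": x,
        "sqrt": np.sqrt(x),
        "ln": np.log(x),
    }
    skews = {name: skewness(values) for name, values in candidates.items()}
    best = min(skews, key=lambda name: abs(skews[name]))
    return best, skews


def vif(X):
    """Variance inflation factor of each column of X.

    Uses the identity VIF_j = [R^-1]_jj, where R is the correlation
    matrix of the explanatory variables.
    """
    corr = np.corrcoef(np.asarray(X, dtype=float), rowvar=False)
    return np.diag(np.linalg.inv(corr))


# Synthetic stand-in for a right-skewed response such as total engagement.
rng = np.random.default_rng(0)
engagement = rng.lognormal(mean=3.0, sigma=1.0, size=500)
best, skews = pick_transform(engagement)
print("chosen transformation:", best)

# Synthetic explanatory variables: c is almost a copy of a, b is independent.
a = rng.normal(size=500)
b = rng.normal(size=500)
c = a + 0.05 * rng.normal(size=500)
vifs = vif(np.column_stack([a, b, c]))
print("VIF per column:", vifs)
```

A common rule of thumb treats a VIF above roughly 5 to 10 as a sign of problematic collinearity; the source does not state which cutoff was used, so any threshold applied to these values is an assumption.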