Difference between revisions of "JAR v.IS Project Findings"
Albertb.2013 (talk | contribs) |
Albertb.2013 (talk | contribs) |
||
Line 122: | Line 122: | ||
<br> | <br> | ||
<p>The outliers for the explanatory variables are judged by the independent variable distributions as well as the scatterplots of the response variable against the explanatory variables. We remove the following data points (as circled in the figure) as outliers. </p><br> | <p>The outliers for the explanatory variables are judged by the independent variable distributions as well as the scatterplots of the response variable against the explanatory variables. We remove the following data points (as circled in the figure) as outliers. </p><br> | ||
− | [[File:outlier.png|700px|center]] | + | |
+ | [[File:outlier.png|700px|center|frame|Transforming the Explanatory Variables and removing the outliers]] | ||
{|style="width:100%;vertical-align:top;margin-top:20px;" | {|style="width:100%;vertical-align:top;margin-top:20px;" | ||
− | |||
− | |||
<br> | <br> | ||
Line 136: | Line 135: | ||
</p> | </p> | ||
− | + | [[File:Bivfit.png|700px|center|frame|Bivariate fit of difficult words count. we select the SQRT transformation instead of the Ln transformation]] | |
− | [[File:Bivfit.png|700px|center]] | ||
{|style="width:100%;vertical-align:top;margin-top:20px;" | {|style="width:100%;vertical-align:top;margin-top:20px;" | ||
− | |||
− | |||
<p> | <p> | ||
Line 153: | Line 149: | ||
<p>We also ran bivariate fit against all the 18 numerical explanatory variables to test for multicollinearity. The figure below shows the bivariate correlation scatterplot.</p> | <p>We also ran bivariate fit against all the 18 numerical explanatory variables to test for multicollinearity. The figure below shows the bivariate correlation scatterplot.</p> | ||
− | [[File:Bivfitscattermatrix.png|700px|center]] | + | [[File:Bivfitscattermatrix.png|700px|center|frame|Bivariate correlation scatterplot matrix for all 18 numerical variables for the article model]] |
{|style="width:100%;vertical-align:top;margin-top:20px;" | {|style="width:100%;vertical-align:top;margin-top:20px;" | ||
− | |||
− | |||
<p>Using this scatterplot together with the bivariate correlation matrix, we eliminated 8 variables that are highly correlated. We ran Standard Least Squares regression on continuous numerical variables to verify the absence of multicollinearity in our remaining variables.</p> | <p>Using this scatterplot together with the bivariate correlation matrix, we eliminated 8 variables that are highly correlated. We ran Standard Least Squares regression on continuous numerical variables to verify the absence of multicollinearity in our remaining variables.</p> |
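The correlation-based elimination described in the last paragraph of the diff can be illustrated with a minimal sketch. It is not the authors' procedure: the page does not state the cutoff or the tool settings, so the 0.8 threshold, the greedy keep-first rule, and the column names and values below are all assumptions made for illustration.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient between two equal-length numeric lists."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def drop_correlated(columns, threshold=0.8):
    """Greedy filter: keep a variable only if its absolute correlation with
    every variable already kept is at or below the threshold.

    columns   -- dict mapping variable name to a list of values
    threshold -- assumed cutoff; the source does not give one
    """
    kept = []
    for name, values in columns.items():
        if all(abs(pearson(values, columns[k])) <= threshold for k in kept):
            kept.append(name)
    return kept


# Hypothetical data, not taken from the project dataset.
example = {
    "word_count": [100, 200, 300, 400, 500],
    "sentence_count": [11, 19, 32, 41, 48],  # nearly collinear with word_count
    "image_count": [3, 1, 4, 1, 5],
}

kept_variables = drop_correlated(example)
print(kept_variables)
```

With this made-up data, `sentence_count` is dropped because it is almost perfectly correlated with `word_count`, while `image_count` is kept. A pairwise filter like this only catches two-variable relationships, which is why a follow-up regression check on the remaining variables, as the text describes, is still useful.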
Revision as of 21:21, 23 April 2017
Multiple Linear Regression Model
What makes a good Facebook post?
This section outlines the explanatory model built on the article dataset from Facebook Insights, supplemented with our crawled variables to form a complete article dataset.