Difference between revisions of "AY1617 T5 Team AP Findings"

From Analytics Practicum
Revision as of 22:14, 21 February 2017


EXPLORATORY DATA ANALYSIS

Character

Fig 1


Fig 2 Fig 3
With reference to Fig 2 and Fig 3, every character except Character D garnered higher median views in the videos they appear in than in the videos they do not appear in. However, Fig 1 shows that Character D actually has the most video appearances among the 4 characters. We intend to investigate this unexpected pattern further in our analysis. If Character D's appearance is shown to have a statistically significant negative effect on a video's performance, SGAG should consider reducing the number of appearances he makes.
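The comparison behind Fig 2 and Fig 3 can be sketched as follows. This is a minimal illustration only: the `videos` records and the function name `median_views_by_appearance` are hypothetical and are not taken from the SGAG data set.

```python
from statistics import median

# Hypothetical records: (unique views, set of characters appearing in the video).
videos = [
    (12000, {"A", "D"}),
    (30000, {"B"}),
    (8000, {"D"}),
    (25000, {"A", "C"}),
    (15000, {"C", "D"}),
    (40000, {"A", "B"}),
]

def median_views_by_appearance(videos, character):
    """Return (median views with the character, median views without)."""
    with_char = [views for views, cast in videos if character in cast]
    without_char = [views for views, cast in videos if character not in cast]
    # statistics.median raises StatisticsError if either group is empty.
    return median(with_char), median(without_char)

with_d, without_d = median_views_by_appearance(videos, "D")
print("Character D:", with_d, "with vs", without_d, "without")
```

Comparing medians rather than means keeps a few viral videos from dominating the comparison, which matters for a small data set.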


Tier
Fig 4 Fig 5
Referring to Fig 4, the tier with the highest viewership is Tier C. Tiers B, C and D show similar view time attrition rates of around 50%, derived from the ratio of lifetime unique views to lifetime unique 30-second views. Tier A shows the highest view time attrition rate. This could be due to the type of videos categorised in this tier.
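One way to compute such an attrition rate is sketched below. The exact formula is an assumption: here attrition is taken as the share of unique viewers who did not reach the 30-second mark, and the function name `view_time_attrition` is hypothetical.

```python
def view_time_attrition(unique_views, unique_30s_views):
    """Share of unique viewers who dropped off before 30 seconds.

    Assumed definition: 1 - (unique 30-second views / unique views).
    """
    if unique_views <= 0:
        raise ValueError("unique_views must be positive")
    return 1 - unique_30s_views / unique_views

# Hypothetical figures: half of the viewers stayed for 30 seconds.
print(view_time_attrition(1000, 500))
```

Under this definition, a tier whose videos retain half of their unique viewers past 30 seconds has an attrition rate of 50%.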

Genre

EDA GENRE.PNG


EDA GENRE 1.PNG


EDA GENRE 2.PNG


EDA GENRE 3.PNG


Quality

EDA QUALITY.PNG


EDA QUALITY 1.PNG EDA QUALITY 2.PNG

Sponsored?
EDA SPONSORED.PNG EDA SPONSORED 1.PNG

KPI

Our client has identified 4 key performance indicators (KPIs) for the videos: the number of unique views, likes, shares and comments. Because these KPIs fall in very different ranges, the data has to be transformed to normalize them. We applied the Johnson Su transformation to all 4 variables so that they follow an approximately normal distribution.
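For reference, the Johnson Su transformation maps a value x to z = gamma + delta * asinh((x - xi) / lambda), where the four parameters are fitted to each variable. The sketch below applies the transformation for given parameters; the parameter values are hypothetical placeholders, not the values fitted in our analysis, and the fitting step itself is not shown.

```python
import math

def johnson_su_to_normal(x, gamma, delta, xi, lam):
    """Map x to an approximately standard normal score.

    gamma and delta are shape parameters, xi is the location and
    lam is the scale; delta and lam must be positive.
    """
    if delta <= 0 or lam <= 0:
        raise ValueError("delta and lam must be positive")
    return gamma + delta * math.asinh((x - xi) / lam)

# Hypothetical parameters for one KPI (for illustration only).
params = dict(gamma=0.0, delta=1.0, xi=5.0, lam=2.0)
scores = [johnson_su_to_normal(v, **params) for v in (1.0, 5.0, 50.0)]
print(scores)
```

Because asinh grows logarithmically for large arguments, very large KPI values are pulled in towards the rest of the distribution while the ordering of the videos is preserved.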

MULTIVARIATE ANALYSIS

EDA MULTIV.PNG

Referring to the results of the multivariate analysis above, all of the variables are highly correlated with one another. Therefore, we have decided to adopt Principal Component Analysis (PCA), which uses an orthogonal transformation to convert our 4 correlated variables into a set of linearly uncorrelated variables, which are our principal components. PCA allows us to extract patterns that were not obvious before the analysis.

EDA EIGEN.PNG

The eigenvalues show how much of the aggregate performance variation each principal component accounts for. Since PRIN-1 accounts for close to 93% of the variation, we will use it for the rest of our analysis.
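The link between the eigenvalues and the percentage of variation can be sketched in a few lines. The example below is an illustration under stated assumptions, not our actual workflow: the four `kpis` columns are made-up numbers, the function names are hypothetical, and power iteration is used here only as a simple way to obtain the largest eigenvalue of the correlation matrix.

```python
import math

def correlation_matrix(columns):
    """Pearson correlation matrix for a list of equal-length columns."""
    n = len(columns[0])
    standardized = []
    for col in columns:
        mean = sum(col) / n
        sd = math.sqrt(sum((v - mean) ** 2 for v in col) / (n - 1))
        standardized.append([(v - mean) / sd for v in col])
    p = len(columns)
    return [
        [sum(a * b for a, b in zip(standardized[i], standardized[j])) / (n - 1)
         for j in range(p)]
        for i in range(p)
    ]

def leading_component(matrix, iterations=500):
    """Power iteration: return (largest eigenvalue, unit eigenvector)."""
    p = len(matrix)
    vec = [1.0 / math.sqrt(p)] * p
    eigenvalue = 0.0
    for _ in range(iterations):
        product = [sum(matrix[i][j] * vec[j] for j in range(p)) for i in range(p)]
        norm = math.sqrt(sum(v * v for v in product))
        vec = [v / norm for v in product]
        eigenvalue = norm
    return eigenvalue, vec

# Hypothetical transformed KPIs: views, likes, shares, comments.
kpis = [
    [1.0, 2.0, 3.0, 4.0, 5.0, 6.0],
    [1.1, 1.9, 3.2, 3.8, 5.1, 6.0],
    [0.9, 2.2, 2.8, 4.1, 4.9, 6.2],
    [1.2, 2.0, 2.9, 4.2, 5.0, 5.8],
]
eigenvalue, loadings = leading_component(correlation_matrix(kpis))
# The eigenvalues of a correlation matrix sum to the number of variables,
# so the first component's share of the variation is eigenvalue / 4.
share = eigenvalue / len(kpis)
print(round(share, 3), [round(w, 3) for w in loadings])
```

When all variables are strongly and positively correlated, as in this made-up example, the first component carries almost all of the variation and its loadings have the same sign, so it can be read as a single aggregate performance score.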

PRINCIPAL COMPONENT ANALYSIS (PCA)

Further Analysis
From the PCA results for each of our existing factors (Character, Genre, etc.), we were able to identify which values of each factor (Character A, B, C, etc.) contributed to the videos' performance, and whether the impact was positive or negative. Because our data set is small, there were instances where the values of a factor could not be distinguished as statistically different. To address this, we plan either to carry out nonparametric analysis to distinguish them, or to simulate more data points from our current data using the profiler function.
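As an illustration of what a nonparametric comparison could look like, the sketch below computes the Mann-Whitney U statistic for two groups of PRIN-1 scores. The choice of this particular test is an assumption, since the nonparametric method has not yet been decided, and the sample values and function name are hypothetical.

```python
def mann_whitney_u(sample_a, sample_b):
    """Return (U_a, U_b) for two samples, counting ties as half.

    U_a counts the pairs in which the value from sample_a exceeds
    the value from sample_b; U_a + U_b equals len(a) * len(b).
    """
    u_a = 0.0
    for a in sample_a:
        for b in sample_b:
            if a > b:
                u_a += 1.0
            elif a == b:
                u_a += 0.5
    u_b = len(sample_a) * len(sample_b) - u_a
    return u_a, u_b

# Hypothetical PRIN-1 scores for videos with and without one character.
with_character = [3, 4, 5]
without_character = [1, 2, 3]
print(mann_whitney_u(with_character, without_character))
```

The statistic relies only on the ordering of the scores, so it makes no normality assumption. The smaller of the two U values would then be compared against a critical value, or converted to a p-value by a statistics package, to judge whether the two groups differ.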

Character

PCA CHARACTER.PNG


Tier

PCA TIER.PNG


Genre

PCA GENRE.PNG


Quality

PCA QUALITY.PNG


Sponsored?

PCA SPONSORED.PNG