Difference between revisions of "AY1516 T2 Team AP Methodology"

From Analytics Practicum
 
(18 intermediate revisions by 3 users not shown)
Line 2: Line 2:
 
{|style="background-color:#ffffff; color:#ffff; padding: 5 0 5 0;" width="100%" cellspacing="0" cellpadding="0" valign="top" border-left= "1" solid #ffffff; border-right:1px solid #ffffff; |
| style="padding:0.1em; font-size:100%; background-color:#1AEB9E; text-align:center; color:#F5F5F5" width="10%" |  
[[Image:Team_ap_home_white.png|16px]]
[[AY1516 T2 Team AP|<font color="#F5F5F5" size=2.5 face="Century Gothic"><b>HOME</b></font>]]

| style="padding:0.1em; font-size:100%; background-color:#000000; text-align:center; color:#F5F5F5" width="10%" |  
[[Image:Team_ap_overview_white.png|16px]]
[[AY1516 T2 Team AP_Overview|<font color="#F5F5F5" size=2.5 face="Century Gothic"><b>OVERVIEW</b></font>]]

| style="padding:0.1em; font-size:100%; background-color:#1AEB9E; text-align:center; color:#F5F5F5" width="10%" |  
[[Image:Team_ap_analysis_white.png|16px]]
[[AY1516 T2 Team AP_Analysis|<font color="#F5F5F5" size=2.5 face="Century Gothic"><b>ANALYSIS</b></font>]]

| style="padding:0.1em; font-size:100%; background-color:#1AEB9E; text-align:center; color:#F5F5F5" width="10%" |  
[[Image:Team_ap_project_management_white.png|16px]]
[[AY1516 T2 Team AP_Project_Management|<font color="#F5F5F5" size=2.5 face="Century Gothic"><b>PROJECT MANAGEMENT</b></font>]]

| style="padding:0.1em; font-size:100%; background-color:#1AEB9E; text-align:center; color:#F5F5F5" width="10%" |  
[[Image:Team_ap_documentation_white.png|16px]]
[[AY1516 T2 Team AP_Documentation| <font color="#F5F5F5" size=2.5 face="Century Gothic"><b>DOCUMENTATION</b></font>]]
|}
Line 36: Line 41:
 
==<div style="background: #232AE8; line-height: 0.3em; font-family:helvetica;  border-left: #6C7A89 solid 15px;"><div style="border-left: #FFFFFF solid 5px; padding:15px;font-size:15px;"><font color= "#ffffff"><strong>Overview</strong></font></div></div>==
 
==<div style="background: #232AE8; line-height: 0.3em; font-family:helvetica;  border-left: #6C7A89 solid 15px;"><div style="border-left: #FFFFFF solid 5px; padding:15px;font-size:15px;"><font color= "#ffffff"><strong>Overview</strong></font></div></div>==
  
<p>In the table below we outline the algorithms/techniques that we intend to execute.</p>
+
<p>In the table below we outline the algorithms/techniques that we intend to execute for a particular objective.</p>
 
{| class="wikitable" width="50%"
 
{| class="wikitable" width="50%"
 
|-
 
|-
! width="60%" | Objective !! Analytical Method(s)
+
! width="60%" | Objective !! Analytical Approach
 
|-
 
|-
 
| Network analysis via Degree centrality, Betweenness centrality  ||  
 
| Network analysis via Degree centrality, Betweenness centrality  ||  
Line 45: Line 50:
 
* Cluster Analysis
 
* Cluster Analysis
 
|-
 
|-
| Facilitate the content planning process by way of an interactive dashboard
+
| Plan what to publish based on characterisation of audience
 
||  
 
||  
* Data Visualization
+
*Categorisation of posts
* Multiple Linear Regression on Article Characteristics
+
*Analysis of follower interactions with SGAG posts
* Content Themes Analysis
 
 
|}
 
|}
  
==<div style="background: #232AE8; line-height: 0.3em; font-family:helvetica;  border-left: #6C7A89 solid 15px;"><div style="border-left: #FFFFFF solid 5px; padding:15px;font-size:15px;"><font color= "#ffffff"><strong>Multiple Linear Regression on Article Characteristics</strong></font></div></div>==
+
==<div style="background: #232AE8; line-height: 0.3em; font-family:helvetica;  border-left: #6C7A89 solid 15px;"><div style="border-left: #FFFFFF solid 5px; padding:15px;font-size:15px;"><font color= "#ffffff"><strong>Targeted Content</strong></font></div></div>==
 
 
<p>Based on the merged dataset comprising attributes from Google Analytics and article attributes scraped directly from the new articles, we will perform multiple linear regression (MLR) to determine the key attributes affecting the number of unique page views.</p>
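To illustrate the idea behind MLR, here is a minimal sketch in plain Python that fits an ordinary least squares model through the normal equations. It is not the tool used in the project. The function names `fit_mlr` and `predict`, the two predictors (word count and image count), and all the numbers are hypothetical; the page-view figures are generated from an exact linear rule so that the fit recovers it.

```python
def fit_mlr(rows, targets):
    """Ordinary least squares via the normal equations (X'X) b = X'y.

    rows    -- one sequence of predictor values per article
    targets -- the response (e.g. unique page views) per article
    Returns [intercept, coefficient_1, coefficient_2, ...].
    """
    # Prepend a constant 1.0 so the first coefficient is the intercept.
    design = [[1.0] + [float(v) for v in row] for row in rows]
    k = len(design[0])

    # Build the augmented matrix [X'X | X'y].
    aug = []
    for i in range(k):
        row = [sum(r[i] * r[j] for r in design) for j in range(k)]
        row.append(sum(r[i] * y for r, y in zip(design, targets)))
        aug.append(row)

    # Solve with Gauss-Jordan elimination and partial pivoting.
    for col in range(k):
        pivot = max(range(col, k), key=lambda r: abs(aug[r][col]))
        if abs(aug[pivot][col]) < 1e-12:
            raise ValueError("predictors are collinear; cannot fit")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        pivot_value = aug[col][col]
        aug[col] = [v / pivot_value for v in aug[col]]
        for r in range(k):
            if r != col:
                factor = aug[r][col]
                aug[r] = [a - factor * b for a, b in zip(aug[r], aug[col])]
    return [aug[i][k] for i in range(k)]


def predict(coefficients, row):
    """Predicted response for one article's predictor values."""
    return coefficients[0] + sum(c * v for c, v in zip(coefficients[1:], row))


# Hypothetical articles: (word count, number of images).
articles = [(300, 1), (500, 2), (800, 0), (1200, 4), (650, 3)]
# Made-up page views, generated as 100 + 0.5 * words + 40 * images.
unique_page_views = [290, 430, 500, 860, 545]

coefficients = fit_mlr(articles, unique_page_views)
print([round(c, 3) for c in coefficients])
```

With real data the fit would not be exact, and the size and sign of each coefficient would be read alongside its significance before drawing conclusions about which attributes matter.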
 
<p>We will explore the following independent variables as predictors of the number of unique page views:
 
</p>
 
  
 +
Content created at SGAG is tailored for Singaporeans and revolves around the milestones commonly encountered at different ages. For example, the typical 18-year-old male Singaporean faces the prospect of enlistment into Basic Military Training (BMT) and would experience a mixture of emotions. SGAG takes milestone events like these and makes humorous content about them. Below are the targeted age groups for SGAG, with some of the commonly met milestones associated with each:
 +
 
 +
<!--------------- Body End ---------------------->
 
{| class="wikitable" width="70%"
 
{| class="wikitable" width="70%"
 
|-
 
|-
! Independent Variable  !! Intuition for Selection
+
! Age Group !! Milestone Content Topics
|-
 
|
 
No. of words (stopwords removed)
 
|| This measure serves as an indicator of the length of the article. Recognising that readers have a limited attention span, it would be interesting to explore the effect of a lengthy article on its popularity.
 
 
|-
 
|-
 
|  
 
|  
No. of outbound links references
+
18 - 21
 
||  
 
||  
Outbound links typically direct readers to more in-depth content. An article with more links might be indicative of more meaningful content, which might translate to greater popularity and better reception amongst its readers. 
+
* Male: National Service (Basic Military Training), Relationship issues
 +
* Female: Entry to University, Student Exchange Programme, Relationship issues, Social Night
 
|-
 
|-
 
|  
 
|  
No. of images <br>
+
22 - 25
No. of videos
 
 
||  
 
||  
Images and videos make for a more interactive experience for the reader. They might be important determinants of an article’s reception.
+
* Male: ORD (End of National Service), Entry to University, Relationship issues, Social Night
 +
* Female: Graduation from University, First Job, Colleagues
 
|-
 
|-
 
|  
 
|  
No. of article shares
+
26 - 34
 
||  
 
||  
The intuition is that people share articles that are useful and impactful. The number of article shares is expected to correlate positively with the number of unique page views. Assessing its importance would also indicate how important social media is as a publicity platform in comparison to other platforms.
+
* Male: Graduation from University, First Job, Colleagues
|-
+
* Female: Family, Having Kids
|
 
Bounce rate <br>
 
''(Percentage of sessions starting with the page (out of all the other tracked Skyscanner pages) in which the reader leaves after visiting only that page, i.e. single-page sessions)
 
''
 
<br><br>
 
Exit % <br>
 
''(Percentage of sessions involving the page where the reader leaves after reading the page)
 
''
 
||
 
  
Readers arriving at Skyscanner’s news pages are expected to be browsing for information related to a particular destination or related travel content. Since Skyscanner articles are light (bite-sized) reads, we would expect readers to continue browsing other relevant articles via the recommendation engine or the outbound links within the articles themselves. Nevertheless, there is bound to be a point where readers finally exit the site. Hence, we expect to see an average bounce rate and exit % across the articles. Articles with particularly high rates would serve as good negative examples for future reference.
+
|}
  
 +
Content creation is also based on events that happen in Singapore. These are categorized into 2 types, expected and unexpected. Expected events include mainstream events like the National Day Parade, while unexpected events include train breakdowns. A more comprehensive list is given below: 
  
 +
{| class="wikitable" width="70%"
 +
|-
 +
! Event type !! Event Content Topics
 +
|-
 +
|
 +
Expected
 +
||
 +
National Day Parade, SG50, SEA Games, Elections
 
|-
 
|-
 
|  
 
|  
Average time on page
+
Unexpected
 
||  
 
||  
Time spent on a page is expected to be indicative of interest in an article and possibly of the number of unique page views. It would be interesting to validate whether time spent is a predictor of unique page views. If so, we could also consider studying articles with long average times to identify good articles.
+
Train breakdowns, different takes on Minister comments, Traffic accidents
 +
 
 
|}
 
|}
  
<p>Understanding the key independent variables that influence the number of unique page views will help in creating content with a greater tendency to receive higher page views.</p>
+
By understanding the content consumption habits of SGAG's social media audiences through further analysis, SGAG will be able to better craft content publishing strategies to grow its consumer base.
 
 
 
 
==<div style="background: #232AE8; line-height: 0.3em; font-family:helvetica;  border-left: #6C7A89 solid 15px;"><div style="border-left: #FFFFFF solid 5px; padding:15px;font-size:15px;"><font color= "#ffffff"><strong>Google Trends Analysis</strong></font></div></div>==
 
  
<p>In planning the content for the upcoming quarter, the content management team typically uses Google Trends to understand consumer trends in both similar past quarters and the present. They would also consider the present context of festivities and events.
+
For additional in-depth information, do peruse our wiki tabs at
</p>
 
  
==<div style="background: #232AE8; line-height: 0.3em; font-family:helvetica;  border-left: #6C7A89 solid 15px;"><div style="border-left: #FFFFFF solid 5px; padding:15px;font-size:15px;"><font color= "#ffffff"><strong>Content Themes Analysis</strong></font></div></div>==
+
* [https://wiki.smu.edu.sg/ANLY482/AY1516_T2_Team_AP_Analysis_PostInterimTwitterFindings Post-Interim Twitter Findings]
 
+
* [https://wiki.smu.edu.sg/ANLY482/AY1516_T2_Team_AP_Analysis_PostInterimFindings Post-Interim Facebook Findings]
<p>Skyscanner has identified 7 content themes that articles typically belong to. As the team operates on a lean workforce, it would be helpful to identify which of the 7 content themes reaps the greatest yield. Here, we define yield by the metrics Google Analytics tracks: the number of unique page views, bounce rate and exit %, and the average time spent on page. This will be done via Text Miner by SAS.</p>
 
 
 
<p>Text Miner can generate a number of topics. Each topic is associated with a set of representative keywords derived from the corpus of articles input to the algorithm. Each article has a probability of belonging to each topic, and we would tag the article with the topic that has the highest probability. We would then manually examine the keywords representative of each topic and classify the topics according to the 7 content themes. Having classified the articles into the 7 content themes, we can analyse them against the Google Analytics metrics, thereby identifying popular content themes as areas of focus.</p>
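The tagging and aggregation steps described above can be sketched in plain Python. This is only an illustration of the logic, not of Text Miner itself. The topic names, the theme labels (`theme_A`, `theme_B`), the topic-to-theme mapping and the page-view figures are all hypothetical, and the mapping stands in for the manual classification step.

```python
from collections import defaultdict


def assign_dominant_topic(topic_probabilities):
    """Return the topic with the highest probability for one article."""
    return max(topic_probabilities, key=topic_probabilities.get)


def mean_metric_by_theme(articles, topic_to_theme, metric):
    """Average a metric over articles, grouped by content theme."""
    totals = defaultdict(float)
    counts = defaultdict(int)
    for article in articles:
        topic = assign_dominant_topic(article["topic_probabilities"])
        theme = topic_to_theme[topic]  # result of the manual classification
        totals[theme] += article[metric]
        counts[theme] += 1
    return {theme: totals[theme] / counts[theme] for theme in totals}


# Hypothetical topic model output and metrics for three articles.
articles = [
    {"topic_probabilities": {"topic_1": 0.7, "topic_2": 0.2, "topic_3": 0.1},
     "unique_page_views": 1200},
    {"topic_probabilities": {"topic_1": 0.1, "topic_2": 0.6, "topic_3": 0.3},
     "unique_page_views": 800},
    {"topic_probabilities": {"topic_1": 0.2, "topic_2": 0.3, "topic_3": 0.5},
     "unique_page_views": 400},
]

# Hypothetical manual mapping of generated topics to content themes.
topic_to_theme = {"topic_1": "theme_A", "topic_2": "theme_B", "topic_3": "theme_A"}

theme_means = mean_metric_by_theme(articles, topic_to_theme, "unique_page_views")
print(theme_means)
```

The same grouping could be repeated for bounce rate, exit % and average time on page by passing a different `metric` key.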
 
 
 
==<div style="background: #232AE8; line-height: 0.3em; font-family:helvetica;  border-left: #6C7A89 solid 15px;"><div style="border-left: #FFFFFF solid 5px; padding:15px;font-size:15px;"><font color= "#ffffff"><strong>Data Visualization</strong></font></div></div>==
 
 
 
=== Unique Page Views Exploration ===
 
 
 
=== Heat Map of Traffic Source (Country Specific New Page) ===
 
 
 
<!--------------- Body End ---------------------->
 

Latest revision as of 22:24, 17 April 2016



Overview

In the table below we outline the algorithms/techniques that we intend to execute for a particular objective.

{| class="wikitable" width="50%"
|-
! width="60%" | Objective !! Analytical Approach
|-
| Network analysis via Degree centrality, Betweenness centrality ||
* Social Network Analysis
* Cluster Analysis
|-
| Plan what to publish based on characterisation of audience ||
* Categorisation of posts
* Analysis of follower interactions with SGAG posts
|}
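As a rough illustration of the two network measures named in the table, the sketch below computes degree centrality and betweenness centrality for a small undirected, unweighted graph in plain Python, using Brandes' algorithm for betweenness. It does not reproduce the tooling used in the project. The node labels and edges are hypothetical stand-ins for accounts and their interactions.

```python
from collections import deque


def build_graph(edges):
    """Build an undirected adjacency map from (node, node) pairs."""
    graph = {}
    for a, b in edges:
        graph.setdefault(a, set()).add(b)
        graph.setdefault(b, set()).add(a)
    return graph


def degree_centrality(graph):
    """Fraction of the other nodes that each node is directly linked to."""
    n = len(graph)
    if n <= 1:
        return {node: 0.0 for node in graph}
    return {node: len(nbrs) / (n - 1) for node, nbrs in graph.items()}


def betweenness_centrality(graph):
    """Unnormalised betweenness (Brandes' algorithm, unweighted graph)."""
    betweenness = {node: 0.0 for node in graph}
    for source in graph:
        # Breadth-first search from source, counting shortest paths.
        order = []
        predecessors = {node: [] for node in graph}
        path_count = {node: 0 for node in graph}
        distance = {node: -1 for node in graph}
        path_count[source] = 1
        distance[source] = 0
        queue = deque([source])
        while queue:
            current = queue.popleft()
            order.append(current)
            for neighbour in graph[current]:
                if distance[neighbour] < 0:
                    distance[neighbour] = distance[current] + 1
                    queue.append(neighbour)
                if distance[neighbour] == distance[current] + 1:
                    path_count[neighbour] += path_count[current]
                    predecessors[neighbour].append(current)
        # Accumulate pair dependencies in reverse breadth-first order.
        dependency = {node: 0.0 for node in graph}
        for node in reversed(order):
            for pred in predecessors[node]:
                share = path_count[pred] / path_count[node]
                dependency[pred] += share * (1 + dependency[node])
            if node != source:
                betweenness[node] += dependency[node]
    # Each undirected pair is visited from both endpoints, so halve.
    return {node: value / 2 for node, value in betweenness.items()}


# Hypothetical interactions between five accounts.
interactions = [("A", "B"), ("B", "C"), ("C", "D"), ("B", "E")]
graph = build_graph(interactions)
degree = degree_centrality(graph)
betweenness = betweenness_centrality(graph)
for node in sorted(graph):
    print(node, round(degree[node], 2), betweenness[node])
```

Degree centrality highlights accounts with many direct connections, while betweenness highlights accounts that sit on the shortest paths between others, so the two measures can rank the same node differently.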

Targeted Content

Content created at SGAG is tailored for Singaporeans and revolves around the milestones commonly encountered at different ages. For example, the typical 18-year-old male Singaporean faces the prospect of enlistment into Basic Military Training (BMT) and would experience a mixture of emotions. SGAG takes milestone events like these and makes humorous content about them. Below are the targeted age groups for SGAG, with some of the commonly met milestones associated with each:

{| class="wikitable" width="70%"
|-
! Age Group !! Milestone Content Topics
|-
| 18 - 21 ||
* Male: National Service (Basic Military Training), Relationship issues
* Female: Entry to University, Student Exchange Programme, Relationship issues, Social Night
|-
| 22 - 25 ||
* Male: ORD (End of National Service), Entry to University, Relationship issues, Social Night
* Female: Graduation from University, First Job, Colleagues
|-
| 26 - 34 ||
* Male: Graduation from University, First Job, Colleagues
* Female: Family, Having Kids
|}

Content creation is also based on events that happen in Singapore. These are categorized into 2 types, expected and unexpected. Expected events include mainstream events like the National Day Parade, while unexpected events include train breakdowns. A more comprehensive list is given below:

{| class="wikitable" width="70%"
|-
! Event type !! Event Content Topics
|-
| Expected || National Day Parade, SG50, SEA Games, Elections
|-
| Unexpected || Train breakdowns, different takes on Minister comments, Traffic accidents
|}

By understanding the content consumption habits of SGAG's social media audiences through further analysis, SGAG will be able to better craft content publishing strategies to grow its consumer base.

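For additional in-depth information, do peruse our wiki tabs at:
* [https://wiki.smu.edu.sg/ANLY482/AY1516_T2_Team_AP_Analysis_PostInterimTwitterFindings Post-Interim Twitter Findings]
* [https://wiki.smu.edu.sg/ANLY482/AY1516_T2_Team_AP_Analysis_PostInterimFindings Post-Interim Facebook Findings]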