Difference between revisions of "AY1516 T2 Team AP Methodology"


Revision as of 03:52, 17 January 2016



Overview

The table below outlines the algorithms and techniques that we intend to apply for each objective.

Objective Analytical Method(s)
Network analysis via Degree centrality, Betweenness centrality
  • Social Network Analysis
  • Cluster Analysis
Plan what to publish based on characterisation of audience
  • Multivariable regression
  • Cross-referencing of Google Trends data with tweet content
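
As a rough illustration of the two centrality measures named in the first objective, the sketch below computes degree centrality and (unnormalised) betweenness centrality on a toy undirected network. The network, the node names and the function names are invented for illustration; the betweenness routine uses Brandes' algorithm for unweighted graphs and is not necessarily what the team's tooling would use.

```python
from collections import deque


def degree_centrality(graph):
    """Fraction of the other nodes that each node is directly connected to."""
    n = len(graph)
    return {node: len(neighbours) / (n - 1) for node, neighbours in graph.items()}


def betweenness_centrality(graph):
    """Unnormalised betweenness for an undirected, unweighted graph."""
    score = {node: 0.0 for node in graph}
    for source in graph:
        # Breadth-first search from source, counting shortest paths.
        order = []
        preds = {node: [] for node in graph}
        paths = {node: 0 for node in graph}
        dist = {node: -1 for node in graph}
        paths[source] = 1
        dist[source] = 0
        queue = deque([source])
        while queue:
            v = queue.popleft()
            order.append(v)
            for w in graph[v]:
                if dist[w] < 0:
                    dist[w] = dist[v] + 1
                    queue.append(w)
                if dist[w] == dist[v] + 1:
                    paths[w] += paths[v]
                    preds[w].append(v)
        # Accumulate dependencies, starting from the farthest nodes.
        delta = {node: 0.0 for node in graph}
        for w in reversed(order):
            for v in preds[w]:
                delta[v] += paths[v] / paths[w] * (1 + delta[w])
            if w != source:
                score[w] += delta[w]
    # Each undirected pair was counted once from each endpoint.
    return {node: value / 2 for node, value in score.items()}


# Hypothetical network: "hub" is connected to every other account.
network = {
    "hub": {"a", "b", "c"},
    "a": {"hub"},
    "b": {"hub"},
    "c": {"hub"},
}
degree = degree_centrality(network)
betweenness = betweenness_centrality(network)
```

In this toy network every shortest path between two outer accounts passes through "hub", so it scores highest on both measures.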

Multivariable Regression on tweet content vs Google Trends

Using the topics that are trending on the day a tweet is posted, multivariable regression will be performed to relate those trending topics to the popularity of the tweet (retweets, likes, etc.).
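
A minimal sketch of such a regression is given here, assuming two hypothetical predictors (whether a tweet's hashtag was trending that day, and whether the tweet contains a link) and retweets as the response. The data values are invented purely to show the mechanics, and the fit uses ordinary least squares via the normal equations rather than any particular statistics package.

```python
def fit_linear_regression(rows, targets):
    """Ordinary least squares via the normal equations (X'X) b = X'y.

    Returns [intercept, coefficient_1, coefficient_2, ...].
    """
    design = [[1.0] + list(row) for row in rows]  # prepend intercept column
    k = len(design[0])
    # Augmented matrix [X'X | X'y].
    aug = [
        [sum(r[i] * r[j] for r in design) for j in range(k)]
        + [sum(r[i] * y for r, y in zip(design, targets))]
        for i in range(k)
    ]
    # Gauss-Jordan elimination with partial pivoting.
    for col in range(k):
        pivot = max(range(col, k), key=lambda r: abs(aug[r][col]))
        if abs(aug[pivot][col]) < 1e-12:
            raise ValueError("predictors are collinear; cannot fit")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(k):
            if r != col:
                factor = aug[r][col] / aug[col][col]
                aug[r] = [a - factor * b for a, b in zip(aug[r], aug[col])]
    return [aug[i][k] / aug[i][i] for i in range(k)]


# Invented example data: (hashtag_is_trending, has_link) -> retweets.
predictors = [(0, 0), (1, 0), (0, 1), (1, 1), (1, 0)]
retweets = [5, 15, 8, 18, 15]
intercept, trending_effect, link_effect = fit_linear_regression(predictors, retweets)
```

With real tweet data the coefficients would be estimates with uncertainty, so a statistics package that also reports standard errors and p-values would be the more practical choice.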

The key variables that we intend to explore are elaborated in the table below:

Variable Importance

Retweets

This measure shows how many times a particular tweet has been shared by followers. It is of interest because it reflects an individual's willingness to share the tweet, which suggests that the tweet was interesting.

URL clicks

This measure shows how many times users click on the shortened link shared within a tweet. Given the succinct nature of a tweet, users who click on outgoing links are likely to find the tweet more interesting than other tweets, since clicking the link interrupts the flow of reading the Twitter feed.

Likes

Compared to URL clicks and retweets, this is the mildest measure: it indicates that the user probably found the tweet interesting, but not compelling enough to share.

Engagement Rate

A consolidated figure showing what proportion of the people who see a particular tweet go on to interact with it, in any of the following ways:

  • Link clicks
  • Favourites
  • Retweets
  • Replies
  • Embedded media clicks
  • Detail expands
  • Shared via email
  • Permalink clicks
  • User profile clicks
  • Follows
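
The calculation can be sketched as total engagements divided by impressions. The field names below are hypothetical and would need to be mapped to the columns of the actual Twitter analytics export; note that this sketch counts interactions rather than distinct people.

```python
# Hypothetical field names, one per engagement type listed above.
ENGAGEMENT_TYPES = (
    "link_clicks", "favourites", "retweets", "replies", "media_clicks",
    "detail_expands", "email_shares", "permalink_clicks",
    "profile_clicks", "follows",
)


def engagement_rate(tweet):
    """Total engagements divided by impressions; 0.0 if the tweet was never seen."""
    impressions = tweet.get("impressions", 0)
    if impressions <= 0:
        return 0.0
    engagements = sum(tweet.get(kind, 0) for kind in ENGAGEMENT_TYPES)
    return engagements / impressions


# Invented example: 50 engagements out of 2000 impressions.
example_tweet = {"impressions": 2000, "link_clicks": 30, "favourites": 12, "retweets": 8}
rate = engagement_rate(example_tweet)
```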

Tweet Text

Although the effectiveness of jokes can be tough to evaluate from a linguistics perspective, our initial approach will be to cross-reference the hashtags used in the tweet with Google Trends data (Searches & Events).

Identifying the key variables that affect the popularity of a tweet will help in formulating content that is more likely to become popular.
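
One possible way to do this cross-referencing is sketched here, assuming the trending terms for the tweet's date have already been obtained as a plain list of strings. The tweet text, the trending terms and the function name are all invented for illustration; matching is a simple case-insensitive comparison after removing spaces.

```python
import re

HASHTAG_PATTERN = re.compile(r"#(\w+)")


def trending_hashtags(tweet_text, trending_terms):
    """Return the tweet's hashtags that match a trending term (case-insensitive)."""
    trending = {term.lower().replace(" ", "") for term in trending_terms}
    tags = [tag.lower() for tag in HASHTAG_PATTERN.findall(tweet_text)]
    return [tag for tag in tags if tag in trending]


# Invented example tweet and trending terms for the same day.
tweet = "Where would you go for #ChineseNewYear? #travel"
trends_for_day = ["Chinese New Year", "Super Bowl"]
matches = trending_hashtags(tweet, trends_for_day)
```

A flag such as "has at least one trending hashtag" could then serve as a predictor in the regression described earlier.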

Google Trends Analysis

In planning the content for the upcoming quarter, the content management team typically uses Google Trends to understand consumer trends in both comparable past quarters and the present. They also consider the current context of festivities and events.

Content Themes Analysis

Skyscanner has identified 7 content themes that articles typically belong to. Since Skyscanner operates with a lean workforce, it would be helpful to identify which of the 7 content themes reaps the greatest yield. Here, we define yield by the metrics that Google Analytics tracks: the number of unique page views, bounce rate and exit %, and the average time spent on page. This will be done using SAS Text Miner.

Text Miner can generate a number of topics. Each topic is associated with a set of representative keywords derived from the corpus of articles input to the algorithm, and each article has a probability rating of belonging to each topic. We will tag each article with the topic that has the highest probability rating. We will then manually examine the keywords representative of each topic and classify the topics according to the 7 content themes. Having classified the articles into the 7 content themes, we can analyse them against the Google Analytics metrics, thereby identifying popular content themes as areas of focus.
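
The tagging and aggregation steps described above can be sketched as follows, assuming the per-article topic probabilities have been exported from Text Miner into a simple structure. The topic labels, theme names and page-view figures are invented for illustration; the topic-to-theme mapping stands in for the manual classification step.

```python
from collections import defaultdict


def assign_topic(topic_probabilities):
    """Pick the topic with the highest probability rating for one article."""
    return max(topic_probabilities, key=topic_probabilities.get)


def mean_page_views_by_theme(articles, topic_to_theme):
    """Average unique page views per content theme."""
    views = defaultdict(list)
    for article in articles:
        theme = topic_to_theme[assign_topic(article["topic_probabilities"])]
        views[theme].append(article["unique_page_views"])
    return {theme: sum(v) / len(v) for theme, v in views.items()}


# Hypothetical manual mapping from generated topics to content themes.
topic_to_theme = {"topic_1": "Destination guides", "topic_2": "Travel tips"}

# Invented example articles.
articles = [
    {"topic_probabilities": {"topic_1": 0.7, "topic_2": 0.3}, "unique_page_views": 1200},
    {"topic_probabilities": {"topic_1": 0.2, "topic_2": 0.8}, "unique_page_views": 400},
    {"topic_probabilities": {"topic_1": 0.6, "topic_2": 0.4}, "unique_page_views": 800},
]
theme_views = mean_page_views_by_theme(articles, topic_to_theme)
```

The same grouping could be repeated for bounce rate, exit % and average time on page to compare themes across all of the yield metrics.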

Data Visualization

Unique Page Views Exploration

Heat Map of Traffic Source (Country Specific New Page)