Difference between revisions of "Group04 Final"

From Analytics Practicum

Revision as of 14:05, 8 April 2018

GROUP4  



Overview

In this section, we use nonparametric statistical tests and text analysis to understand the factors that affect content performance. A clear understanding of these factors will enable the company to shape its future strategy and keep improving performance.

We will explore posting time and content as factors of performance and seek an appropriate methodology to analyze their effects. To capture a wide range of audiences, the company is currently active on Facebook and YouTube, so we will be looking at data scraped from both platforms.

For the Facebook Post dataset, we will compare the performance of posts across posting times to determine whether specific posting times affect performance. For the Facebook Comment dataset and the YouTube dataset, we will perform text analysis on consumers’ comments to identify whether the topics that surface are associated with differing sentiments. Following our literature review, we chose Topic Modeling and Sentiment Analysis as the methodologies for text analysis, and the Median test to compare performance across different posting times.

Facebook Posts

The company is concerned that publishing content on Facebook on different days and at different times affects its content’s engagement performance. However, it has yet to establish a methodology to study the impact of publishing day and time on performance.

Studies have shown that identifying the optimal time to reach an audience drives social media engagement and traffic. Because Facebook's feed is algorithm-based, having a large audience does not necessarily translate to high viewership. Instead of viewership, Facebook uses reach as a metric to measure the number of people who have seen a particular piece of content.

The main objective is to identify the optimal time to publish a post, that is, the time that results in the highest reach, as reach would drive engagement. However, we were unable to scrape this metric as it is not publicly available. We considered combining the three performance metrics that were scraped (number of reactions, shares, and comments) into a single metric as a proxy for reach. However, this approach is infeasible, as each metric accumulates over a different length of time. While we have determined that comments are no longer made on posts more than six days old, we do not have time-series data on reactions and shares to perform a similar analysis and determine when the last reaction or share occurs after a post is published.
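The six-day observation about comments can be illustrated with a minimal sketch. It assumes each scraped post carries a publication timestamp and each comment a creation timestamp; the function names, the `published` key, and the sample dates are hypothetical and are not taken from the project's datasets.

```python
from datetime import datetime

SECONDS_PER_DAY = 86400


def max_comment_lag_days(post_time, comment_times):
    """Days between a post's publication and its latest comment (None if no comments)."""
    if not comment_times:
        return None
    latest_lag = max((t - post_time).total_seconds() for t in comment_times)
    return latest_lag / SECONDS_PER_DAY


def posts_with_settled_comments(posts, as_of, window_days=6):
    """Keep posts older than the window, whose comment counts should no longer change.

    `posts` is a list of dicts with a hypothetical `published` datetime key.
    """
    return [
        post for post in posts
        if (as_of - post["published"]).total_seconds() / SECONDS_PER_DAY > window_days
    ]


# Hypothetical data for illustration only.
post_time = datetime(2018, 3, 1, 12, 0)
comment_times = [datetime(2018, 3, 2, 12, 0), datetime(2018, 3, 4, 0, 0)]
lag = max_comment_lag_days(post_time, comment_times)

posts = [
    {"id": "a", "published": datetime(2018, 3, 1)},
    {"id": "b", "published": datetime(2018, 4, 5)},
]
settled = posts_with_settled_comments(posts, as_of=datetime(2018, 4, 8))
```

Taking the maximum of this lag over all posts is one way to arrive at a cut-off such as six days. The same calculation is not possible for reactions and shares without their timestamps.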

Due to the limited scope of the scraped data, we use the number of comments as a proxy for reach and as our performance metric. We acknowledge this as a limitation, as comments are not a true representation of viewership.
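As a rough illustration of how the Median test could compare comment counts across posting times, the sketch below computes the Mood's median test statistic in plain Python. The grouping labels and the counts are hypothetical, values tied with the grand median are counted as "not above", and no continuity correction is applied; these are assumptions of the sketch, not decisions documented by the project. SciPy offers `scipy.stats.median_test` as a library alternative that also returns a p-value.

```python
from statistics import median


def moods_median_test(groups):
    """Mood's median test statistic for k independent groups.

    `groups` maps a posting-time label to a list of comment counts.
    Returns (chi-square statistic, degrees of freedom, grand median, table),
    where table[label] = (count above grand median, count at or below it).
    """
    if any(len(values) == 0 for values in groups.values()):
        raise ValueError("every group needs at least one observation")

    pooled = [x for values in groups.values() for x in values]
    grand_median = median(pooled)

    # 2 x k contingency table: ties with the grand median go to the lower row.
    table = {
        label: (
            sum(x > grand_median for x in values),
            sum(x <= grand_median for x in values),
        )
        for label, values in groups.items()
    }

    n = len(pooled)
    total_above = sum(above for above, _ in table.values())
    total_below = n - total_above
    if total_above == 0 or total_below == 0:
        raise ValueError("all values fall on one side of the grand median")

    # Pearson chi-square statistic over the 2 x k table.
    statistic = 0.0
    for above, below in table.values():
        size = above + below
        expected_above = size * total_above / n
        expected_below = size * total_below / n
        statistic += (above - expected_above) ** 2 / expected_above
        statistic += (below - expected_below) ** 2 / expected_below

    return statistic, len(groups) - 1, grand_median, table


# Hypothetical comment counts per posting slot, for illustration only.
example = {"morning": [1, 2, 3, 4], "evening": [5, 6, 7, 8]}
stat, dof, grand, table = moods_median_test(example)
```

The statistic is compared against a chi-square distribution with k - 1 degrees of freedom, where k is the number of posting-time groups. A large value suggests that at least one group's median differs from the others.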

Facebook Comments

For the Facebook comments, we seek to understand how consumers perceive the respective Facebook posts. We will use Latent Dirichlet Allocation (LDA) to identify popular topics within the comments, and Sentiment Analysis to score the sentiment of those topics. According to our literature review, sentiment analysis allows us to identify positive and negative opinions and emotions. It will be performed using the TextBlob Python package, which was chosen because prior research has used it to perform sentiment analysis on social media. The objective is to identify possible insights and define actionable plans based on the Facebook Comment dataset.
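A minimal sketch of the topic-modeling step follows. It assumes scikit-learn as the LDA implementation, which the text above does not specify; the number of topics, the stop-word setting, and the function names are likewise illustrative assumptions. The scikit-learn imports sit inside `fit_lda` so that the helper `top_words` can be used on its own.

```python
def fit_lda(comments, n_topics=5, random_state=0):
    """Fit LDA on raw comment strings (assumes scikit-learn is installed).

    Returns (topic-word weights, vocabulary, per-comment topic distribution).
    """
    from sklearn.decomposition import LatentDirichletAllocation
    from sklearn.feature_extraction.text import CountVectorizer

    # Bag-of-words counts; English stop words are dropped as an assumption.
    vectorizer = CountVectorizer(stop_words="english")
    counts = vectorizer.fit_transform(comments)

    lda = LatentDirichletAllocation(n_components=n_topics, random_state=random_state)
    doc_topics = lda.fit_transform(counts)
    return lda.components_, vectorizer.get_feature_names_out(), doc_topics


def top_words(topic_word_weights, vocabulary, n=10):
    """For each topic (one row of weights), return its n highest-weighted words."""
    result = []
    for weights in topic_word_weights:
        ranked = sorted(range(len(weights)), key=lambda i: weights[i], reverse=True)
        result.append([vocabulary[i] for i in ranked[:n]])
    return result


# Hypothetical weights and vocabulary, for illustration only.
demo_topics = top_words(
    [[0.1, 0.7, 0.2], [0.5, 0.1, 0.4]],
    ["food", "video", "funny"],
    n=2,
)
```

With real data, `top_words(*fit_lda(comments)[:2])` would list the leading words per topic. Each comment can then be assigned to its most probable topic before sentiment scores are compared across topics.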

YouTube

We will also analyze comments made on the YouTube videos through sentiment analysis, as other researchers have used this technique for “analysis of user comments” from YouTube videos. Performing sentiment analysis on YouTube comments is ideal as “mining the YouTube data makes more sense than any other social media websites as the contents here are closely related to the concerned topic [the video]”. We will perform sentiment analysis, using TextBlob, on scraped comments to understand the consumers’ sentiments, i.e. level of positivity, vis-a-vis the content of the published videos. Only comments from videos published from 2017 onwards will be used, so that the analysis remains relevant given the fast-paced, dynamic nature of YouTube channels.
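The filtering and scoring described above might look like the sketch below. The dictionary keys (`video_id`, `video_published`, `text`) and the function names are hypothetical, and the scorer is passed in as a parameter so that a different analyzer can be substituted. The default scorer uses TextBlob's `sentiment.polarity`, which ranges from -1.0 (negative) to 1.0 (positive).

```python
from datetime import date


def textblob_polarity(text):
    """Polarity in [-1.0, 1.0] from TextBlob's default analyzer (assumes textblob is installed)."""
    from textblob import TextBlob

    return TextBlob(text).sentiment.polarity


def mean_polarity_per_video(comments, since=date(2017, 1, 1), scorer=textblob_polarity):
    """Average comment polarity per video, keeping only videos published on or after `since`.

    `comments` is an iterable of dicts with hypothetical keys
    `video_id`, `video_published` (a date), and `text`.
    """
    scores = {}
    for comment in comments:
        if comment["video_published"] < since:
            continue  # video predates the analysis window
        scores.setdefault(comment["video_id"], []).append(scorer(comment["text"]))
    return {video_id: sum(values) / len(values) for video_id, values in scores.items()}


# Hypothetical comments and a stand-in scorer, for illustration only.
sample_comments = [
    {"video_id": "v0", "video_published": date(2016, 12, 31), "text": "love this"},
    {"video_id": "v1", "video_published": date(2017, 5, 1), "text": "love it"},
    {"video_id": "v1", "video_published": date(2017, 5, 1), "text": "boring"},
]


def stub_scorer(text):
    return 1.0 if "love" in text else -1.0


averages = mean_polarity_per_video(sample_comments, scorer=stub_scorer)
```

Calling `mean_polarity_per_video(sample_comments)` without the `scorer` argument would use TextBlob instead of the stand-in scorer.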

Methods and Analysis

Facebook Posts

aaaa

Methodology

aaaa

Results

aaaa

Business Insights

aaaa

Facebook Comments

aaaa

Methodology

aaaa

Results

aaaa

Business Insights

aaaa

YouTube

aaaa

Methodology

aaaa

Results

aaaa

Business Insights

aaaa