Difference between revisions of "ANLY482 AY2017-18T2 Group30 Data Analysis"

From Analytics Practicum
(Edited data source for Facebook post)
Line 52: Line 52:
  
 
<div align="left">
 
<div align="left">
<div style=" width: 85%; padding:30px; font-family: Arimo; font-size: 14px; line-height: 1em; text-indent: 30px;">
+
<div style=" width: 90%; padding:10px; font-family: Arimo; font-size: 14px; line-height: 1em; text-indent: 30px;">
 
<p>
 
<p>
*For data files from <i>Facebook Insights Data Export (Post Level)</i>, the sponsor provided exported data from different periods of the year, with different metric tabs in Excel format. The tabs included are:  
+
::*For data files from <i>Facebook Insights Data Export (Post Level)</i>, the sponsor provided exported data from different periods of the year, with different metric tabs in Excel format. The tabs included are:  
 
#Key Metrics  
 
#Key Metrics  
 
#Lifetime: Number of unique people who have created a story about your Page post by interacting with it (unique users)
 
#Lifetime: Number of unique people who have created a story about your Page post by interacting with it (unique users)
Line 60: Line 60:
 
#Lifetime: Number of people who have given negative feedback on your post, by type (unique users)
 
#Lifetime: Number of people who have given negative feedback on your post, by type (unique users)
  
*Facebook Post data comprises of 4 main types (a total of 1381 rows):  
+
::*Facebook Post data comprises of 4 main types (a total of 1381 rows):  
 
#Link (955 rows)  
 
#Link (955 rows)  
 
#Photo (56 rows)  
 
#Photo (56 rows)  
Line 74: Line 74:
  
 
<div align="left">
 
<div align="left">
<div style=" width: 85%; padding:75px; font-family: Arimo; font-size: 14px; line-height: 1em; text-indent: 15px;">
+
<div style="width: 90%;padding:10px; font-family: Arimo; font-size: 14px; line-height: 1em; text-indent: 30px;">
<font>
+
<p>
To help us have an overview of the data throughout the year, we consolidated the various tabs, whilst concatenating the various periods of data for the same columns, into one combined file. This was carried out using the software, IBM JMP Pro, in the following steps:  
+
::To help us have an overview of the data throughout the year, we consolidated the various tabs, whilst concatenating the various periods of data for the same columns, into one combined file. This was carried out using the software, IBM JMP Pro, in the following steps:
* With Post ID, Permalink (permanent link of the campaign content), Post Message, Type, Countries and Posted columns as key identifiers among the different tabs for the excel files, we appended desired columns from the other tabs to the end of the Key Metrics. They included the Share, Like, Comment columns from Tab 2; Other Clicks, Link Clicks, Photo View, Video Play columns from Tab 3; Hide_Clicks , Hide_all_clicks, Unlike_page_clicks, report_spam_clicks columns from Tab 4. <br>This was conducted using the <i>Tables > Join </i>function, with “Matching Specification” as the key identifiers and “Output Columns” of the appended desired columns.  
+
::* With Post ID, Permalink (permanent link of the campaign content), Post Message, Type, Countries and Posted columns as key identifiers among the different tabs for the excel files, we appended desired columns from the other tabs to the end of the Key Metrics. They included the Share, Like, Comment columns from Tab 2; Other Clicks, Link Clicks, Photo View, Video Play columns from Tab 3; Hide_Clicks , Hide_all_clicks, Unlike_page_clicks, report_spam_clicks columns from Tab 4. <br>This was conducted using the <i>Tables > Join </i>function, with “Matching Specification” as the key identifiers and “Output Columns” of the appended desired columns.  
 
+
::* Next, for each period of data files (appended with new columns) from multiple tabs, we concatenate the data across different time periods to have a full year collection of data.<br>This was conducted using the <i>Tables > Concatenate </i> function, while adding multiple data tables into “Data Tables to be Concatenated”.  
* Next, for each period of data files (appended with new columns) from multiple tabs, we concatenate the data across different time periods to have a full year collection of data.<br>This was conducted using the <i>Tables > Concatenate </i> function, while adding multiple data tables into “Data Tables to be Concatenated”.  
+
::* Finally, we check for <b>missing data</b> in the different columns. For example, under the column Type, we have five different types, namely: Link, Photo, Shared Video, Status and Video. However, in the instances of missing data, we will cross check with the permalink of the campaign post, and check the Type of medium was posted and fill it in accordingly.
 
+
::*Using JMP Pro, we can see that there is a particular post that has garnered higher than usual number of shares versus the lifetime post total reach. We will classify it as an outlier in our analysis. The outlier has higher than normal values with 400,000 total reach.  
* Finally, we check for missing data in the different columns. For example, under the column Type, we have five different types, namely: Link, Photo, Shared Video, Status and Video. However, in the instances of missing data, we will cross check with the permalink of the campaign post, and check the Type of medium was posted and fill it in accordingly.  
+
</p>
</font>
+
<center>
 +
[[Image:Share_vs_Lifetime_Post_Total_Reach.PNG|650px|Identification of outlier from all types of Facebook posts]]
 +
</center>
 
</div>
 
</div>
 
</div>
 
</div>

Revision as of 16:05, 9 February 2018



Data Source

  • The sponsor provided Facebook Insights Data Export (Post Level) files in Excel format. Each file covers a different period of the year and contains the following metric tabs:
  1. Key Metrics
  2. Lifetime: Number of unique people who have created a story about your Page post by interacting with it (unique users)
  3. Lifetime: Number of people who have clicked anywhere in your post, by type (unique users)
  4. Lifetime: Number of people who have given negative feedback on your post, by type (unique users)
  • The Facebook Post data comprises 4 main types (1,381 rows in total):
  1. Link (955 rows)
  2. Photo (56 rows)
  3. Shared Video (103 rows)
  4. Video (267 rows)
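
As a quick sanity check, the per-type row counts listed above can be summed and compared with the reported total. This is only a minimal sketch: the variable names are ours, and the percentages are derived from the listed counts rather than taken from the sponsor's data.

```python
# Row counts per post type, copied from the Data Source list above.
reported_counts = {"Link": 955, "Photo": 56, "Shared Video": 103, "Video": 267}

# The four main types should add up to the reported total of 1381 rows.
total_rows = sum(reported_counts.values())

# Share of each type in percent, rounded to one decimal place.
type_share = {t: round(100 * n / total_rows, 1) for t, n in reported_counts.items()}

print(total_rows)
for post_type, pct in type_share.items():
    print(f"{post_type}: {pct}%")
```

Links make up by far the largest share of the rows, which is worth keeping in mind when comparing metrics across types.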

Data Preparation

To get an overview of the data across the whole year, we merged the metric tabs of each export and then concatenated the exports from the different periods into one combined file. This was carried out in JMP Pro (a SAS product) in the following steps:
  • Using the Post ID, Permalink (permanent link of the campaign content), Post Message, Type, Countries and Posted columns as key identifiers across the tabs of each Excel file, we appended the desired columns from the other tabs to the end of the Key Metrics tab. These were the Share, Like and Comment columns from Tab 2; the Other Clicks, Link Clicks, Photo View and Video Play columns from Tab 3; and the Hide_Clicks, Hide_all_clicks, Unlike_page_clicks and report_spam_clicks columns from Tab 4.
    This was done with the Tables > Join function, using the key identifiers as the “Matching Specification” and the appended columns as the “Output Columns”.
  • Next, we concatenated the joined files from the different time periods to obtain a full-year collection of data.
    This was done with the Tables > Concatenate function, adding the data tables to “Data Tables to be Concatenated”.
  • Finally, we checked for missing data in each column. For example, the Type column has five possible values: Link, Photo, Shared Video, Status and Video. Where the Type was missing, we opened the permalink of the campaign post, checked which type of medium was posted, and filled in the value accordingly.
  • Using JMP Pro, we identified one post with an unusually high number of shares relative to its lifetime post total reach. We classify it as an outlier in our analysis; its values are well above the norm, with a total reach of 400,000.
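
The join, concatenate and missing-data steps above were performed through the JMP Pro menus. For readers who prefer code, a rough equivalent can be sketched in Python with pandas. This is not what we ran: the toy data, the example.com permalinks, the `manual_types` lookup and the `join_tabs` helper are all hypothetical, and the sketch joins on only a subset of the key identifiers to keep the tables small.

```python
import pandas as pd

# Subset of the key identifiers; the toy tabs below carry only these keys.
KEYS = ["Post ID", "Permalink", "Posted"]


def join_tabs(key_metrics: pd.DataFrame, *other_tabs: pd.DataFrame) -> pd.DataFrame:
    """Append the columns of each other tab to Key Metrics, matching on KEYS."""
    combined = key_metrics
    for tab in other_tabs:
        combined = combined.merge(tab, on=KEYS, how="left")
    return combined


# Hypothetical exports for two periods (all values are made up).
period_1_key = pd.DataFrame({
    "Post ID": ["p1", "p2"],
    "Permalink": ["https://example.com/p1", "https://example.com/p2"],
    "Posted": ["2017-01-05", "2017-02-10"],
    "Type": ["Link", None],  # one missing Type, filled in below
})
period_1_tab2 = pd.DataFrame({
    "Post ID": ["p1", "p2"],
    "Permalink": ["https://example.com/p1", "https://example.com/p2"],
    "Posted": ["2017-01-05", "2017-02-10"],
    "Share": [3, 7], "Like": [10, 20], "Comment": [1, 2],
})
period_2_key = pd.DataFrame({
    "Post ID": ["p3"],
    "Permalink": ["https://example.com/p3"],
    "Posted": ["2017-07-01"],
    "Type": ["Photo"],
})
period_2_tab2 = pd.DataFrame({
    "Post ID": ["p3"],
    "Permalink": ["https://example.com/p3"],
    "Posted": ["2017-07-01"],
    "Share": [5], "Like": [8], "Comment": [0],
})

# Step 1: join the tabs within each period. Step 2: stack the periods.
periods = [
    join_tabs(period_1_key, period_1_tab2),
    join_tabs(period_2_key, period_2_tab2),
]
combined = pd.concat(periods, ignore_index=True)

# Step 3: list rows with a missing Type, then fill them from a lookup that
# was built by opening each permalink by hand (hypothetical values).
missing_type = combined[combined["Type"].isna()]
manual_types = {"https://example.com/p2": "Video"}
combined["Type"] = combined["Type"].fillna(combined["Permalink"].map(manual_types))

print(missing_type[["Post ID", "Permalink"]])
print(combined)
```

A left join keeps every row of the Key Metrics tab even when another tab lacks a matching post, which mirrors appending columns to the end of Key Metrics.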

[Figure: Identification of outlier from all types of Facebook posts]
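
We identified the outlier visually from the plot of shares against lifetime post total reach. A rule-based alternative, shown here only as a sketch, is the common 1.5 × IQR upper fence, which is not the method we used in JMP Pro. The post values, the column names and the `iqr_upper_outliers` helper are assumptions for illustration; only the 400,000 total reach comes from the text above.

```python
import pandas as pd

# Hypothetical posts; only the 400,000 reach mirrors the outlier described above.
posts = pd.DataFrame({
    "Post ID": ["p1", "p2", "p3", "p4", "p5", "p6"],
    "Lifetime Post Total Reach": [12000, 8000, 15000, 9500, 11000, 400000],
    "Share": [20, 12, 31, 15, 18, 2500],
})


def iqr_upper_outliers(series: pd.Series, k: float = 1.5) -> pd.Series:
    """Return a boolean mask of values above Q3 + k * IQR."""
    q1 = series.quantile(0.25)
    q3 = series.quantile(0.75)
    upper_fence = q3 + k * (q3 - q1)
    return series > upper_fence


# Flag a post if either its reach or its share count exceeds the upper fence.
posts["is_outlier"] = (
    iqr_upper_outliers(posts["Lifetime Post Total Reach"])
    | iqr_upper_outliers(posts["Share"])
)

print(posts[posts["is_outlier"]])
```

Flagged rows can then be excluded or analysed separately, depending on whether the post is of interest in its own right.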

Exploratory Data Analysis

Final Application: Learning Dashboard