Difference between revisions of "ANLY482 AY2017-18T2 Group30 Instagram"

From Analytics Practicum

Revision as of 16:59, 7 February 2018



Data Source

To retrieve data from the company's Instagram account, we used a web-scraping script from GitHub. We modified the script to also capture the timestamp and caption of each post. The scraped data includes:

  • Caption
  • Timestamp
  • Image URL
  • Tags
  • No. of Likes
  • No. of Comments
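
As a rough illustration of the fields listed above, one scraped post could be represented as a small record type. This is only a sketch: the class name `InstagramPost`, the field names, and the sample values are hypothetical and are not taken from the scraping script we used.

```python
from dataclasses import dataclass, field, asdict
from datetime import datetime


@dataclass
class InstagramPost:
    """One scraped post; field names are illustrative assumptions."""
    caption: str
    timestamp: datetime
    img_url: str
    tags: list = field(default_factory=list)
    likes: int = 0
    comments: int = 0


# Made-up sample values, used only to show the shape of a record.
post = InstagramPost(
    caption="Example caption",
    timestamp=datetime(2018, 1, 1, 12, 0),
    img_url="https://example.com/image.jpg",
    tags=["#example"],
    likes=10,
    comments=2,
)

# Convert to a plain dict, e.g. before writing the record to a CSV row.
record = asdict(post)
```

Keeping each post as a flat record like this makes it straightforward to export the data to a table with one column per field.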

Data Preparation

After scraping the data, we found that it needed cleaning. The column values were misaligned with their indexes, as seen here: (image) We also concatenated the "tags" into a single column.
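
The tag concatenation step could look roughly like the sketch below. It assumes the scraped tags were spread over several columns; the column names `tag_1` to `tag_3`, the helper `concat_tags`, and the sample rows are hypothetical, and the actual layout of our scraped file may differ.

```python
def concat_tags(rows, tag_keys, sep=" "):
    """Merge several tag columns into one "tags" column, skipping empty cells."""
    cleaned = []
    for row in rows:
        # Collect non-empty tag values in column order.
        tags = [str(row[k]).strip() for k in tag_keys if row.get(k)]
        tags = [t for t in tags if t]
        # Keep every non-tag column as-is and add the combined column.
        new_row = {k: v for k, v in row.items() if k not in tag_keys}
        new_row["tags"] = sep.join(tags)
        cleaned.append(new_row)
    return cleaned


# Made-up rows, used only to illustrate the transformation.
rows = [
    {"caption": "Launch day", "tag_1": "#launch", "tag_2": "#new", "tag_3": ""},
    {"caption": "Behind the scenes", "tag_1": "#team", "tag_2": None, "tag_3": None},
]
result = concat_tags(rows, ["tag_1", "tag_2", "tag_3"])
```

Each output row keeps its other columns and gains a single `tags` string, which is easier to filter and count in later analysis.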

Exploratory Data Analysis

Final Application: Learning Dashboard