AY1516 T2 Team AP Analysis PostInterimPlan

[[File:Screen Shot 2016-04-09 at 5.22.10 pm.png|thumbnail|''Code snippet of likers & commenters retrieval'']]
[[File:Screen Shot 2016-04-09 at 5.34.50 pm.png|thumbnail|''Code snippet of conversion of CSV into GraphML format'']]
Revision as of 17:41, 9 April 2016
Facebook Graph API (Post Interim Plan)
Apart from analysing Twitter, one of SGAG's popular social networks, we plan to leverage the Facebook Graph API. Drawing on our experience with the Twitter API, we are looking to crawl Facebook data in a similar fashion: crawling, retrieving and aggregating post-level Facebook data. Hopefully, this process can yield conclusive results about SGAG's social network activity (likes, shares, etc.) on Facebook.
Approach (Post Interim Plan)
Step | Expected Result | Notes |
---|---|---|
1 | Collect all post data | |
2 | All user objects for each like, for every post | |
3 | "Comment-Level" per post and number of shares on a "user-level" | |
Data Retrieval
Constructing the graph from scratch involved using Python code to retrieve posts from SGAG's Facebook account dating back 10 months. This meant connecting to the Facebook Graph API programmatically to build a CSV file that resembles this structure:
Each user ID in the List of Likers and List of Commenters is separated by a semicolon and tagged to its post.
Post ID | List of Likers | List of Commenters |
---|---|---|
378167172198277_1187053787976274 | 10206930900524483;1042647259126948;10204920589409318; ... | 10153979571077290;955321504523847;1701864973403904; ... |
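The crawl described above can be sketched roughly as follows. This is a minimal illustration, not the team's actual script: the Graph API version, the helper names, and the sample IDs are assumptions, and the network-calling function is shown but not executed here.

```python
import csv
import io
import json
import urllib.request

# Assumed API version; the team's code may target a different one.
GRAPH = "https://graph.facebook.com/v2.5"

def fetch_liker_ids(post_id, access_token):
    """Page through /{post-id}/likes and collect user IDs (network call,
    shown for illustration only -- requires a valid access token)."""
    ids, url = [], f"{GRAPH}/{post_id}/likes?access_token={access_token}"
    while url:
        data = json.loads(urllib.request.urlopen(url).read())
        ids += [user["id"] for user in data.get("data", [])]
        url = data.get("paging", {}).get("next")  # follow pagination
    return ids

def write_posts_csv(rows, fileobj):
    """Write one row per post, with likers and commenters joined by
    semicolons, matching the CSV structure shown above."""
    writer = csv.writer(fileobj)
    writer.writerow(["Post ID", "List of Likers", "List of Commenters"])
    for post_id, likers, commenters in rows:
        writer.writerow([post_id, ";".join(likers), ";".join(commenters)])

# Hypothetical usage with one already-crawled post:
buf = io.StringIO()
write_posts_csv([("p1_post", ["123", "456"], ["789"])], buf)
```

Joining the user IDs into a single semicolon-delimited cell keeps the file to one row per post, which is what makes the later per-row conversion to a graph straightforward.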
After crawling the Facebook API for ~4.5 hours, the result is over 1,600 posts dating back 10 months, with a CSV file size of ~38 MB.
Subsequently, we wanted to visualize the data using the Gephi tool. Hence, additional Python code was used to read the CSV file row by row, attaching the likers and commenters to their respective post IDs. This was done to construct a .graphml graph file, which Gephi is able to read.
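The CSV-to-GraphML step could look something like the sketch below, using only the standard library. The graph layout here (one node per post and per user, an undirected edge from each liker or commenter to the post) is our assumption about the intended structure; the team's actual code, shown in the screenshot, may differ in details.

```python
import csv
import io
import xml.etree.ElementTree as ET

def csv_to_graphml(csv_text):
    """Convert the crawled CSV into a GraphML document that Gephi can open:
    a node for every post and every user, and an edge linking each
    liker/commenter to the post they interacted with."""
    root = ET.Element("graphml", xmlns="http://graphml.graphdrawing.org/xmlns")
    graph = ET.SubElement(root, "graph", id="SGAG", edgedefault="undirected")
    seen = set()

    def add_node(node_id):
        # Emit each node only once, even if a user touched many posts.
        if node_id not in seen:
            seen.add(node_id)
            ET.SubElement(graph, "node", id=node_id)

    for row in csv.DictReader(io.StringIO(csv_text)):
        post_id = row["Post ID"]
        add_node(post_id)
        for column in ("List of Likers", "List of Commenters"):
            # Split the semicolon-delimited cell back into user IDs.
            for user_id in filter(None, row[column].split(";")):
                add_node(user_id)
                ET.SubElement(graph, "edge", source=user_id, target=post_id)
    return ET.tostring(root, encoding="unicode")

# Hypothetical one-row input mirroring the CSV structure above:
sample = (
    "Post ID,List of Likers,List of Commenters\n"
    "post1,u1;u2,u3\n"
)
graphml = csv_to_graphml(sample)
```

A graph library such as networkx (whose `write_graphml` does the same serialisation) would be a common alternative to writing the XML by hand.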
The resultant file is uploaded [https://drive.google.com/a/smu.edu.sg/file/d/0B4ESKidr4zkIbnhPTWlnSU5sUms/view?usp=sharing here] for reference.