AY1516 T2 Team AP Analysis PostInterimTwitterFindings
High Level Twitter Data
Top 10 retweets of SGAG by date, with themes identified
Retweet and Favorite count by Category: Top 5 Categories
A simple chart of the most popular tweets of all SGAG tweets.
Retweet Behaviour
Chart depicting users' retweet behaviour throughout the day. Users are more likely to retweet from late morning to early afternoon, which suggests more activity around lunch and tea time.
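An hourly breakdown like the one charted above can be produced by bucketing retweet timestamps by hour of day. The following is a minimal sketch only: the timestamps are made-up sample values, `retweets_per_hour` is a hypothetical helper name, and the times are assumed to be already expressed in local time.

```python
from collections import Counter
from datetime import datetime

# Hypothetical sample timestamps (assumed to be local time already);
# the real data would come from the Twitter export.
retweet_times = [
    "2016-02-01T11:05:00",
    "2016-02-01T12:40:00",
    "2016-02-01T12:55:00",
    "2016-02-01T21:10:00",
]

def retweets_per_hour(timestamps):
    """Return a 24-element list with the number of retweets in each hour."""
    counts = Counter(datetime.fromisoformat(ts).hour for ts in timestamps)
    return [counts.get(hour, 0) for hour in range(24)]

hourly = retweets_per_hour(retweet_times)
busiest_hour = max(range(24), key=lambda hour: hourly[hour])
print(f"Busiest hour: {busiest_hour}:00 with {hourly[busiest_hour]} retweets")
```

The resulting list can be passed straight to any charting tool to plot activity across the day.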
Impressions Behaviour
Replies Behaviour
Twitter Social Network – Post to User Engagement
After conducting descriptive analysis on the high-level data extracted from SGAG’s Twitter account, we decided to create Gephi visualisations of SGAG’s Twitter social network. Using posts and followers as nodes, and each follower’s engagement (Retweet) as edges, this was the output generated.
Initial Gephi output: a sample of 52 Twitter posts with more than 500 retweets each. Due to Twitter API limitations, we could obtain at most 60 followers per post.
Nodes: 5339 (52 post nodes, 5287 users)
Edges: 7290
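To illustrate how such a network can be assembled, the sketch below builds post and user node sets plus an undirected edge set from (user, post) retweet records, then writes an edge table with `Source` and `Target` columns, which Gephi's spreadsheet import accepts. The records and the helper names `build_network` and `to_edge_csv` are assumptions for illustration, not the team's actual pipeline.

```python
import csv
import io

# Hypothetical (user, post) retweet records standing in for the API output.
retweets = [
    ("user_a", "post_food"),
    ("user_b", "post_food"),
    ("user_b", "post_holiday"),
    ("user_c", "post_holiday"),
    ("user_a", "post_food"),  # duplicate record: must not create a second edge
]

def build_network(records):
    """Return (users, posts, edges); each edge links one user to one post."""
    users, posts, edges = set(), set(), set()
    for user, post in records:
        users.add(user)
        posts.add(post)
        edges.add((user, post))
    return users, posts, edges

def to_edge_csv(edges):
    """Serialise edges as a CSV edge table with Source and Target columns."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(["Source", "Target"])
    for user, post in sorted(edges):
        writer.writerow([user, post])
    return buffer.getvalue()

users, posts, edges = build_network(retweets)
edge_csv = to_edge_csv(edges)
print(f"Nodes: {len(users) + len(posts)} ({len(posts)} post nodes, {len(users)} users)")
print(f"Edges: {len(edges)}")
```

Storing edges in a set removes duplicate retweet records, so each user-post pair contributes at most one edge.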
From the Gephi layout algorithms that emphasise complementarities, our team looked into 3 layouts so that our visualisation would better resemble a social network: Force Atlas 2, Fruchterman-Reingold and YiFan Hu. These layouts bring linked nodes together and push non-linked nodes apart to obtain a readable representation.
Force Atlas 2:
Fruchterman-Reingold:
YiFan Hu Layout:
For Fruchterman-Reingold's output, the algorithm places the most central nodes at the centre of the graph and spreads the less central nodes around them. This did not let us clearly distinguish shared Follower nodes, so the visualisation is less insightful than the output produced by the other 2 algorithms.
We found similarities between the Force Atlas 2 and YiFan Hu layouts: in both, high-degree nodes were clearly distinguishable and were surrounded by low-degree nodes. However, Force Atlas 2's output was less intuitive for understanding SGAG's Twitter network, and the edges between posts were harder to distinguish than in YiFan Hu's output.
YiFan Hu's force-directed layout algorithm places the most central nodes at the centre while keeping smaller nodes at the edges. It produces a neat output that groups together the Follower nodes that have a degree of 1, which makes the Follower nodes shared between posts easier to observe. We therefore chose YiFan Hu's algorithm as our Gephi layout.
Findings
Follower maximum degree: 24
Follower minimum degree: 1
Post maximum degree: 583 (Food)
Post minimum degree: 50 (Holiday)
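Degree figures of this kind can be read off an edge list by counting how often each node appears. The sketch below assumes a de-duplicated list of (user, post) edges; the sample edges and the `degree_summary` helper are hypothetical.

```python
from collections import Counter

# Hypothetical de-duplicated (user, post) edges.
edges = [
    ("user_a", "post_food"),
    ("user_b", "post_food"),
    ("user_c", "post_food"),
    ("user_b", "post_holiday"),
]

def degree_summary(edges):
    """Return the maximum and minimum degree for followers and for posts."""
    follower_degree = Counter(user for user, _ in edges)
    post_degree = Counter(post for _, post in edges)
    return {
        "follower_max": max(follower_degree.values()),
        "follower_min": min(follower_degree.values()),
        "post_max": max(post_degree.values()),
        "post_min": min(post_degree.values()),
    }

summary = degree_summary(edges)
print(summary)
```

A follower's degree is the number of sampled posts that follower retweeted, and a post's degree is the number of sampled followers who retweeted it.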
Modularity
Networks with high modularity have dense connections between the nodes within modules but sparse connections between nodes in different modules.
Conducted using the Louvain method:
- Look for communities by optimizing modularity locally
- Aggregate nodes of the same community and build a new network where nodes are the communities.
- Repeat iteratively until a maximum modularity is attained
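The quantity these steps optimise can be computed directly for any given partition. The sketch below is not the Louvain method itself, only the modularity score it tries to maximise, using the standard definition for an undirected, unweighted graph: for each community, the fraction of edges inside it minus the squared fraction of edge endpoints attached to it. The toy graph and the `modularity` helper name are assumptions for illustration.

```python
from collections import defaultdict

def modularity(edges, community):
    """Modularity of a partition of an undirected, unweighted graph.

    edges: list of (u, v) pairs, each undirected edge listed once.
    community: dict mapping every node to a community label.
    """
    m = len(edges)
    internal = defaultdict(int)    # edges with both endpoints in the community
    degree_sum = defaultdict(int)  # total degree of the community's nodes
    for u, v in edges:
        degree_sum[community[u]] += 1
        degree_sum[community[v]] += 1
        if community[u] == community[v]:
            internal[community[u]] += 1
    return sum(
        internal[c] / m - (degree_sum[c] / (2 * m)) ** 2
        for c in list(degree_sum)
    )

# Toy graph: two triangles joined by a single bridge edge (2, 3).
toy_edges = [(0, 1), (1, 2), (0, 2), (3, 4), (4, 5), (3, 5), (2, 3)]
two_groups = {0: "A", 1: "A", 2: "A", 3: "B", 4: "B", 5: "B"}
one_group = {node: "A" for node in range(6)}

q_split = modularity(toy_edges, two_groups)
q_single = modularity(toy_edges, one_group)
print(round(q_split, 3), round(q_single, 3))
```

Splitting the toy graph into its two triangles scores higher than keeping everything in one community, which is the kind of improvement each local move in the first step looks for.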
Modularity: 0.704 (scale of -0.5 to 1)
Number of Communities: 33
Number of Topics: 15
Finding: For this sample of highly retweeted posts (>500 retweets), SGAG's follower segments are not completely homogeneous and do not share many common topics among themselves. This suggests that there is no one-size-fits-all approach to content.
Diameter: 6
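The diameter is the longest shortest path between any two nodes in the network. Because every edge links a follower to a post, a path of 6 edges between two followers passes through 3 posts and 2 other followers. The furthest a follower has to go to reach another follower is therefore 2 users.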
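For reference, a diameter can be computed by running a breadth-first search from every node and keeping the largest distance found. The sketch below assumes a connected, undirected graph. The toy chain of followers and posts is hypothetical and was chosen so that its diameter also comes out at 6.

```python
from collections import defaultdict, deque

def build_adjacency(edges):
    """Build an undirected adjacency map from (u, v) pairs."""
    adjacency = defaultdict(set)
    for u, v in edges:
        adjacency[u].add(v)
        adjacency[v].add(u)
    return adjacency

def eccentricity(adjacency, source):
    """Largest shortest-path distance from source, found by BFS."""
    distance = {source: 0}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        for neighbour in adjacency[node]:
            if neighbour not in distance:
                distance[neighbour] = distance[node] + 1
                queue.append(neighbour)
    return max(distance.values())

def diameter(edges):
    """Longest shortest path in the graph (assumes the graph is connected)."""
    adjacency = build_adjacency(edges)
    return max(eccentricity(adjacency, node) for node in list(adjacency))

# Hypothetical chain: user_1 - post_1 - user_2 - post_2 - user_3 - post_3 - user_4
chain = [
    ("user_1", "post_1"), ("user_2", "post_1"),
    ("user_2", "post_2"), ("user_3", "post_2"),
    ("user_3", "post_3"), ("user_4", "post_3"),
]
chain_diameter = diameter(chain)
print(chain_diameter)
```

In this chain the two end followers are 6 edges apart, and the path between them crosses 3 posts and 2 intermediate followers.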