AY1516 T2 Team AP Analysis PostInterimTwitterFindings


High Level Twitter Data

Top 10 retweets of SGAG by date, with themes identified

Twitter high level insights retweets.png

Retweet and Favorite count by Category: Top 5 Categories

TotalpostRatio1.png

A simple chart of the most popular tweets of all SGAG tweets.


Retweet Behaviour

Twitterhighlvl2.png

Chart depicting users' retweet behaviour throughout the day. Users are more likely to retweet from late morning to early afternoon, indicating higher activity around lunch and tea time.
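A chart like this can be produced by counting retweets per hour of the day. The following is a minimal sketch, assuming each retweet carries a local-time ISO 8601 timestamp; the timestamps and the function name `retweets_by_hour` are hypothetical and are not taken from SGAG's data.

```python
from collections import Counter
from datetime import datetime

# Hypothetical retweet timestamps (local time); NOT real SGAG data.
retweet_times = [
    "2016-02-01T11:15:00",
    "2016-02-01T12:40:00",
    "2016-02-01T12:55:00",
    "2016-02-01T15:05:00",
    "2016-02-02T12:10:00",
    "2016-02-02T22:30:00",
]

def retweets_by_hour(timestamps):
    """Count retweets falling into each hour of the day (0-23)."""
    counts = Counter(datetime.fromisoformat(ts).hour for ts in timestamps)
    # Fill in hours with no activity so the chart has a full 24-hour axis.
    return {hour: counts.get(hour, 0) for hour in range(24)}

hourly = retweets_by_hour(retweet_times)
peak_hour = max(hourly, key=hourly.get)
print(peak_hour, hourly[peak_hour])
```

With real data, the peak hours would be read off the resulting counts rather than assumed in advance.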

Impressions Behaviour

Twitterhighlvl3.png

Replies Behaviour

Twitterhighlvl4.png

Twitter Social Network – Post to User Engagement

After conducting descriptive analysis on the high-level data extracted from SGAG’s Twitter account, we created Gephi visualisations of SGAG’s Twitter social network. Posts and followers are the nodes, and each follower’s engagement with a post (a Retweet) is an edge.

Initial Gephi output: a sample of 52 Twitter posts with more than 500 Retweets each. Due to Twitter API limitations, we could only obtain a maximum of 60 followers per post.

Nodes: 5339 (52 post nodes, 5287 users)

Edges: 7290
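To illustrate how such a post-to-follower network can be assembled, here is a minimal sketch that builds the node and edge sets from retweet records and writes an edge table. The records, the function names, and the `Source`/`Target` column headers are assumptions for illustration (the headers follow what Gephi's spreadsheet import expects, to our understanding); this is not the team's actual extraction script.

```python
import csv
import io

# Hypothetical retweet records: (post_id, retweeting_user). NOT real SGAG data.
retweets = [
    ("post_food", "user_a"),
    ("post_food", "user_b"),
    ("post_food", "user_c"),
    ("post_holiday", "user_a"),
    ("post_holiday", "user_d"),
    ("post_food", "user_a"),  # duplicate record, collapsed below
]

def build_network(records):
    """Return (post_nodes, user_nodes, edges) with duplicate edges removed."""
    edges = set(records)
    posts = {post for post, _ in edges}
    users = {user for _, user in edges}
    return posts, users, edges

def edges_to_csv(edges):
    """Serialise the edge set as a two-column CSV edge table."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(["Source", "Target"])
    for post, user in sorted(edges):
        writer.writerow([post, user])
    return buffer.getvalue()

posts, users, edges = build_network(retweets)
print(f"Nodes: {len(posts) + len(users)} ({len(posts)} post nodes, {len(users)} users)")
print(f"Edges: {len(edges)}")
print(edges_to_csv(edges))
```

A user who retweets several posts (here `user_a`) appears once as a node but contributes one edge per post, which is why the edge count can exceed the user count.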

Twitternetwork1.png

From the Gephi layout algorithms that emphasize complementarities, our team examined 3 layouts so that the visualisation better resembles a social network: Force Atlas 2, Fruchterman-Reingold and YiFan Hu. These layouts bring linked nodes together and push non-linked nodes apart to obtain a readable representation.
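The idea of bringing linked nodes together and pushing non-linked nodes apart can be shown with a toy force-directed layout. This is a minimal sketch using Fruchterman-Reingold-style forces with a fixed step cap; it is not Gephi's implementation of any of the layouts, and the node names, starting positions and parameters (`k`, `step`, `iterations`) are assumptions chosen for illustration.

```python
import math

def force_layout(positions, edges, iterations=200, k=1.0, step=0.05):
    """Toy spring layout: all pairs repel (k^2/d); linked pairs also attract (d^2/k)."""
    pos = {node: list(xy) for node, xy in positions.items()}
    linked = {frozenset(edge) for edge in edges}
    nodes = sorted(pos)
    for _ in range(iterations):
        disp = {node: [0.0, 0.0] for node in nodes}
        for i, u in enumerate(nodes):
            for v in nodes[i + 1:]:
                dx = pos[u][0] - pos[v][0]
                dy = pos[u][1] - pos[v][1]
                dist = math.hypot(dx, dy) or 1e-9
                force = k * k / dist              # repulsion, pushes u away from v
                if frozenset((u, v)) in linked:
                    force -= dist * dist / k      # attraction, pulls u toward v
                fx, fy = dx / dist * force, dy / dist * force
                disp[u][0] += fx
                disp[u][1] += fy
                disp[v][0] -= fx
                disp[v][1] -= fy
        for node in nodes:
            length = math.hypot(disp[node][0], disp[node][1])
            if length > 0:
                scale = min(length, step) / length  # cap the move per iteration
                pos[node][0] += disp[node][0] * scale
                pos[node][1] += disp[node][1] * scale
    return pos

def distance(pos, a, b):
    """Euclidean distance between two laid-out nodes."""
    return math.hypot(pos[a][0] - pos[b][0], pos[a][1] - pos[b][1])

# Hypothetical three-node example: one linked pair and one unlinked node.
start = {"post": (0.0, 0.0), "follower": (3.0, 0.0), "stranger": (1.5, 1.0)}
layout = force_layout(start, edges=[("post", "follower")])
print(distance(layout, "post", "follower") < distance(layout, "post", "stranger"))
```

The linked pair settles near the distance where attraction and repulsion balance, while the unlinked node is only ever repelled and drifts away.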

Force Atlas 2:

Twitternetwork2.png Legend1.png

Fruchterman-Reingold:

Twitternetwork3.png Legend2.png

YiFan Hu Layout:

Twitternetwork4.png Legend3.png

For Fruchterman-Reingold’s output, the algorithm places the highly central nodes at the centre of the graph and spreads the less central nodes around them. This did not allow us to clearly distinguish shared Follower nodes, so the visualisation is less insightful than the output produced by the other 2 algorithms.

We found similarities between the Force Atlas 2 and YiFan Hu layouts: in both, high-degree nodes were clearly distinguishable and low-degree nodes surrounded them. However, Force Atlas 2’s output was less intuitive for understanding SGAG’s Twitter network, and the edges between posts were less distinguishable than in YiFan Hu’s output.

YiFan Hu’s layout is a force-directed algorithm that places highly central nodes at the centre while keeping smaller nodes at the edges. It gives a neat output that groups together the Follower nodes with a degree of 1, which makes the Follower nodes shared between Posts easier to observe. Our final decision was to use YiFan Hu’s algorithm as our Gephi layout.

Findings

Follower maximum degree: 24

Follower minimum degree: 1

Post maximum degree: 583 (Food)

Post minimum degree: 50 (Holiday)
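Degree figures of this kind come from counting the edges attached to each node. Below is a minimal sketch on a hypothetical edge set; the post and user names and the helper `degree_summary` are illustrative only and do not reproduce the figures above.

```python
from collections import Counter

# Hypothetical (post, follower) retweet edges. NOT real SGAG data.
edges = {
    ("post_food", "user_a"),
    ("post_food", "user_b"),
    ("post_food", "user_c"),
    ("post_holiday", "user_a"),
    ("post_holiday", "user_d"),
}

def degree_summary(edges):
    """Return the max/min degree for followers and the posts with max/min degree."""
    post_degree = Counter(post for post, _ in edges)
    follower_degree = Counter(user for _, user in edges)
    return {
        "follower_max": max(follower_degree.values()),
        "follower_min": min(follower_degree.values()),
        "post_max": max(post_degree.items(), key=lambda item: item[1]),
        "post_min": min(post_degree.items(), key=lambda item: item[1]),
    }

summary = degree_summary(edges)
for name, value in summary.items():
    print(name, value)
```

A follower's degree is the number of sampled posts they retweeted, and a post's degree is the number of sampled followers who retweeted it.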

Modularity

Networks with high modularity have dense connections between the nodes within modules but sparse connections between nodes in different modules.

Community detection was conducted using the Louvain method:

  1. Look for communities by optimizing modularity locally.
  2. Aggregate nodes of the same community and build a new network where nodes are the communities.
  3. Repeat iteratively until a maximum modularity is attained.
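The first step of the method can be sketched as follows. This is a deliberately simplified illustration, assuming an undirected, unweighted graph: it recomputes modularity from scratch for every candidate move and omits the aggregation and repetition of steps 2 and 3, so it is not the full Louvain method and not Gephi's implementation. The example graph and function names are hypothetical.

```python
from collections import defaultdict

def modularity(edges, community):
    """Modularity Q of an undirected, unweighted graph for a given partition."""
    m = len(edges)
    internal = defaultdict(int)    # edges with both endpoints in the community
    degree_sum = defaultdict(int)  # total degree of the community's nodes
    for u, v in edges:
        degree_sum[community[u]] += 1
        degree_sum[community[v]] += 1
        if community[u] == community[v]:
            internal[community[u]] += 1
    return sum(internal[c] / m - (degree_sum[c] / (2 * m)) ** 2 for c in degree_sum)

def local_moving(edges):
    """Step 1 only: move each node to the neighbouring community that raises Q most."""
    neighbours = defaultdict(set)
    for u, v in edges:
        neighbours[u].add(v)
        neighbours[v].add(u)
    community = {node: node for node in neighbours}  # every node starts alone
    improved = True
    while improved:
        improved = False
        for node in sorted(neighbours):
            best_c, best_q = community[node], modularity(edges, community)
            for c in sorted({community[n] for n in neighbours[node]}):
                trial = dict(community)
                trial[node] = c
                q = modularity(edges, trial)
                if q > best_q + 1e-12:  # only accept strict improvements
                    best_c, best_q = c, q
            if best_c != community[node]:
                community[node] = best_c
                improved = True
    return community

# Hypothetical graph: two triangles joined by a single bridge edge (c-d).
edges = [("a", "b"), ("b", "c"), ("a", "c"), ("c", "d"),
         ("d", "e"), ("e", "f"), ("d", "f")]
community = local_moving(edges)
print(community)
print(round(modularity(edges, community), 3))
```

On this small graph the local moves separate the two triangles into two communities; on a real network, steps 2 and 3 would then merge and refine communities further.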

Modularity: 0.704 (possible values range from -0.5 to 1)

Number of Communities: 33

Number of Topics: 15

Finding: For this sample of highly retweeted posts (>500 retweets), SGAG’s follower segments are not completely homogeneous and do not share many common topics among themselves. This suggests that there is no one-size-fits-all approach to content.

Twitternetwork5.png

Diameter: 6

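The diameter is the longest of the shortest paths between any two nodes in the network. Because edges only connect followers to posts, a path of 6 edges between two followers alternates between followers and posts and passes through 2 other users. In other words, the furthest a follower has to go to reach another follower is 2 users away.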
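The diameter can be computed with a breadth-first search from every node. The sketch below is a minimal illustration on a hypothetical chain of 4 users and 3 posts, assuming an undirected, connected graph; the node names and function names are assumptions, and this is not Gephi's implementation.

```python
from collections import defaultdict, deque

def shortest_path_lengths(adjacency, source):
    """Breadth-first search: number of edges from source to every reachable node."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        for neighbour in adjacency[node]:
            if neighbour not in dist:
                dist[neighbour] = dist[node] + 1
                queue.append(neighbour)
    return dist

def diameter(edges):
    """Longest shortest path, in edges, over all pairs of connected nodes."""
    adjacency = defaultdict(set)
    for u, v in edges:
        adjacency[u].add(v)
        adjacency[v].add(u)
    return max(
        max(shortest_path_lengths(adjacency, node).values())
        for node in list(adjacency)
    )

# Hypothetical chain: user_1 - post_1 - user_2 - post_2 - user_3 - post_3 - user_4
edges = [
    ("post_1", "user_1"), ("post_1", "user_2"),
    ("post_2", "user_2"), ("post_2", "user_3"),
    ("post_3", "user_3"), ("post_3", "user_4"),
]
print(diameter(edges))
```

In this chain the two end users are 6 edges apart, and the path between them passes through 2 other users (`user_2` and `user_3`) and 3 posts.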

Twitternetwork6.png