AY1516 T2 Team AP Analysis Findings

From Analytics Practicum


Data Exploration at Follower Level

Understanding Retweeters' preferences
Retweet frequency by category chart.PNG

We looked at the combined data we had gathered: for each follower of SGAG, a sampled dataset of the number of retweets they made of SGAG posts, grouped by category. Certain categories stand out and are much more likely to be retweeted than others. In particular, the categories most likely to be retweeted, from highest to lowest preference, are School, National Events and Politics, with 16.9%, 15.5% and 14.3% of the total number of retweets respectively. Our initial intuition is that these are the kinds of content that Twitter users most like to read and therefore want to share with others through retweets, so SGAG could steer towards creating more of this content.
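The percentages above are each category's share of all sampled retweets. A minimal sketch of that calculation follows, assuming the retweets are available as a flat list of category labels; the function name `category_shares` and the toy sample are hypothetical and are not the SGAG data.

```python
from collections import Counter


def category_shares(retweet_categories):
    """Return {category: percentage of all retweets}, ordered from highest to lowest."""
    counts = Counter(retweet_categories)
    total = sum(counts.values())
    if total == 0:
        return {}
    # most_common() yields categories in descending order of count
    return {category: 100.0 * n / total for category, n in counts.most_common()}


# Hypothetical toy sample: one label per retweet (not the SGAG data)
sample = ["School", "School", "Politics", "National Events", "School", "Food"]
shares = category_shares(sample)
```

Because the result is ordered by share, the first few keys give the top categories directly.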

We further evaluated the 3 top categories and believe their high retweet counts are probably due to one of 2 reasons: firstly, general uptake, i.e. viral tweets that are retweeted by the general population of SGAG followers; or secondly, certain groups of retweeters that are more active in retweeting. The latter are the influencers of the network and, if accurately identified, will be critical to SGAG's growth and should be the primary group of Twitter users for SGAG to target for maximal effect. Even within this group of influencers, each Twitter user probably has a preference for particular categories. We hope to find a possible profile of users that tend to select and retweet specific categories of content.
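One simple way to start such a profile is to look up each active retweeter's most-retweeted category. The sketch below assumes the data is a list of `(user_id, category)` pairs; the function name `preferred_category`, the `min_retweets` cut-off and the toy records are all illustrative assumptions rather than the team's actual method.

```python
from collections import Counter, defaultdict


def preferred_category(retweets, min_retweets=3):
    """Map each sufficiently active user to (top category, share of their retweets).

    retweets: iterable of (user_id, category) pairs, one per retweet.
    min_retweets: assumed activity cut-off; users below it are skipped.
    """
    per_user = defaultdict(Counter)
    for user, category in retweets:
        per_user[user][category] += 1

    profile = {}
    for user, counts in per_user.items():
        total = sum(counts.values())
        if total >= min_retweets:
            category, n = counts.most_common(1)[0]
            profile[user] = (category, n / total)
    return profile


# Hypothetical toy records (not the SGAG data)
toy_retweets = [
    ("u1", "School"), ("u1", "School"), ("u1", "Politics"),
    ("u2", "Politics"),
    ("u3", "National Events"), ("u3", "National Events"), ("u3", "National Events"),
]
profile = preferred_category(toy_retweets)
```

A share close to 1 suggests a user who retweets almost exclusively within one category, while a low share suggests a generalist.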

Distribution of retweets.PNG

Following this, we observed that the majority of SGAG followers fall within the range of 0-10 retweets each, with a small number of followers doing most of the retweeting. Interestingly, the modal number of retweets among SGAG's follower community is 0. Most users can therefore be described as passive, either consuming content without sharing it through retweets, or inactive altogether.
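The three observations above (the mode, the share of followers in the 0-10 range, and the concentration of retweets among a few users) can be checked with a short summary like the following sketch. The function name `summarize_retweets`, the `top_fraction` parameter and the toy counts are assumptions made for illustration only.

```python
from statistics import mode


def summarize_retweets(retweets_per_follower, top_fraction=0.1):
    """Summarise a list of per-follower retweet counts.

    top_fraction: assumed fraction of followers treated as the "most active" group.
    """
    if not retweets_per_follower:
        raise ValueError("need at least one follower")
    counts = sorted(retweets_per_follower, reverse=True)
    total = sum(counts)
    top_n = max(1, int(len(counts) * top_fraction))
    return {
        # most frequent retweet count among followers
        "mode": mode(counts),
        # fraction of followers with 0 to 10 retweets
        "share_in_0_10": sum(1 for c in counts if 0 <= c <= 10) / len(counts),
        # fraction of all retweets made by the most active followers
        "top_share": sum(counts[:top_n]) / total if total else 0.0,
    }


# Hypothetical toy counts (not the SGAG data)
toy = [0, 0, 0, 0, 0, 1, 2, 3, 40, 54]
summary = summarize_retweets(toy)
```

In this toy sample the single most active follower accounts for more than half of all retweets, which is the kind of skew described above.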

The population of interest in our project is the more active retweeters, together with their choice of categories. Based on the distribution graph above, these influencers form a small community.

Data Exploration at Post Level

Retweeet and fav.png

For each of SGAG's posts, we charted how many times the post was retweeted and favourited. With this dataset, we wanted to find out, at post level, who the specific retweeters of each post were. However, each post had about 750 retweeters on average, so we decided to focus on posts that had more than 500 retweeters. As shown in the bar chart above, posts in the school and politics categories had the greatest number of retweeters, indicating that these posts garnered the most retweets. These happen to be topics that many Singaporeans can relate to, such as GE 2015 and the latest trends in school.

For the number of times each post was favourited, a similar pattern holds: some categories are favourited many more times than others. Politics, school and goodwill were the categories favourited the most, echoing the retweet trend. Users appear to favourite posts that they think are relatable to their friends and to the followers of their own accounts.
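The post-level filtering and aggregation described in this section can be sketched as below. The field names (`category`, `retweeters`, `favourites`), the function name `totals_by_category` and the toy posts are hypothetical; only the threshold of more than 500 retweeters comes from the text.

```python
from collections import defaultdict


def totals_by_category(posts, min_retweeters=500):
    """Sum retweeters and favourites per category for posts above the threshold.

    posts: iterable of dicts with assumed keys "category", "retweeters", "favourites".
    Only posts with strictly more than min_retweeters retweeters are kept.
    """
    totals = defaultdict(lambda: {"retweeters": 0, "favourites": 0})
    for post in posts:
        if post["retweeters"] > min_retweeters:
            totals[post["category"]]["retweeters"] += post["retweeters"]
            totals[post["category"]]["favourites"] += post["favourites"]
    # Order categories from most to fewest retweeters
    return dict(sorted(totals.items(), key=lambda kv: kv[1]["retweeters"], reverse=True))


# Hypothetical toy posts (not the SGAG data)
toy_posts = [
    {"category": "School", "retweeters": 900, "favourites": 700},
    {"category": "Politics", "retweeters": 800, "favourites": 650},
    {"category": "Food", "retweeters": 300, "favourites": 200},
    {"category": "School", "retweeters": 600, "favourites": 400},
]
by_category = totals_by_category(toy_posts)
```

Note that summing retweeters across posts counts a user once per post they retweeted, so the totals measure retweet volume rather than unique users.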