ISSS608 2016 17T1 Group4 Report
Revision as of 19:42, 27 November 2016
Motivation of the application
Social networks now hold a huge amount of data. More communication and expression of views happen within these networks than ever before, making them an important element of society. Twitter is one prominent directed network, where information is posted as tweets: short but influential messages. When key events occur, it is probably the first place to see the buzz, the trends, the status and the direction of opinion. For stakeholders, this information is key.
Tweets are voluminous and text rich. Mining this information, defining the business use case and identifying the insights required are a challenge. The 2016 US elections captured the world's attention during this period, so this project analyses tweets for the top US election #tags. On the day of the 2016 US presidential election, Twitter proved to be the largest source of breaking news, with 40 million tweets.
Objective
Using visual analytics techniques and applications, we want to answer these questions:
- What were the key ideas in the tweets, namely #tags that were most mentioned?
- Which users (@mentions in the Twitter network) did these #tags relate to? In the given tweets, compare and analyse the association between the #tags and the @mentions.
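As a rough illustration of the two questions above, the sketch below counts #tags and the (#tag, @mention) pairs that occur in the same tweet. It is written in Python for brevity and is not the application's own code; the function name `tag_mention_pairs`, the regular expressions and the sample tweets are all hypothetical.

```python
import re
from collections import Counter

# Assumed, simplified patterns: a '#' or '@' followed by word characters.
HASHTAG = re.compile(r"#(\w+)")
MENTION = re.compile(r"@(\w+)")


def tag_mention_pairs(tweets):
    """Count #tags, and (#tag, @mention) pairs that share a tweet."""
    tag_counts = Counter()
    pair_counts = Counter()
    for text in tweets:
        # Sets make each tag/mention count at most once per tweet.
        tags = {t.lower() for t in HASHTAG.findall(text)}
        mentions = {m.lower() for m in MENTION.findall(text)}
        tag_counts.update(tags)
        pair_counts.update((t, m) for t in tags for m in mentions)
    return tag_counts, pair_counts


# Hypothetical sample tweets, not taken from the project data.
sample = [
    "Go vote today! #USElection @alice",
    "#USElection results are coming in @alice @bob",
    "Climate matters #globalwarming @bob",
]
tag_counts, pair_counts = tag_mention_pairs(sample)
print(tag_counts.most_common())
print(pair_counts.most_common())
```

The pair counts can serve as edge weights in a #tag to @mention network graph, which is one way to compare the association between the two.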
Review and critique of past works
A number of free Twitter analytics and visualization tools are available online. For social network analysis in particular, they offer options such as analyzing one's Twitter network, visualizing followers on a map, and reporting statistics on user mentions and on communication between users. However, an interactive custom application for understanding the association between #tags and @mentions is uncommon. It is more important to see the connections between #tags, which represent all elements within the subject of interest. During a key political event, for instance, influences on social networks can have a high impact and change its direction. It is therefore of great value to identify the prominent users or influencers towards whom key issues are directed. An interactive visual application with elements such as network graphs can best help reveal this.
Data
We downloaded 19K tweets from www.followthehashtag.com for popular US elections 2016 hashtags. Apart from the actual tweet content, the data had other attributes such as the associated #tag, @mentions, frequency of retweets, media mentions and location.
After some study, the key attributes for visual analysis were identified:
- #tags
- @mentions
- tweet content
Design framework
Word Cloud
The word cloud is famously described as “the mullets of the Internet”. However, it can be useful for revealing repeated themes or words. Furthermore, it is engaging and its results can be understood quickly.
Demonstration
We use the US Election tweets data as an example for our demonstration.
Word Cloud
How to use the application:
1. Upload a text file using the Text File upload widget in the sidebar. Note: the upload must be a text file no larger than 5MB.
2. Customise the displayed image using the filters, e.g. maximum number of words, minimum frequency and rotation. In this case, we set Minimum Frequency: 50, Maximum Words: 100, Rotation: 0.35.
3. The default setting is to have "Document Stemming" and "Repeatable" checked. To remove, simply uncheck the box.
- "Document stemming" reduces words to their root form, e.g. 'elections' becomes 'election'. The frequencies of 'elections' and 'election' are added together when the word cloud is generated.
- "Repeatable" allows changes made to the word cloud via the filters to be added/removed from the initial plot generated by R.
4. Download the word cloud image and its frequency table by clicking the "Download Image" and "Download Frequency Table" buttons respectively. The downloaded image is in PNG format and the table is in CSV format.
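The filtering and stemming steps above can be sketched as follows. This is a minimal Python illustration, not the application's R code: `naive_stem` is a deliberately crude stand-in for a real stemmer, and the function names, parameters and sample sentence are assumptions made for the example.

```python
import re
from collections import Counter


def naive_stem(word):
    """Crude stand-in for a real stemmer: strip one trailing plural 's'."""
    if len(word) > 3 and word.endswith("s") and not word.endswith("ss"):
        return word[:-1]
    return word


def word_frequencies(text, min_freq=1, max_words=100, stem=True):
    """Return up to max_words (word, count) pairs with count >= min_freq."""
    words = re.findall(r"[a-z']+", text.lower())
    if stem:
        # Stemming merges variants, so their frequencies are added together.
        words = [naive_stem(w) for w in words]
    counts = Counter(words)
    kept = [(w, n) for w, n in counts.most_common() if n >= min_freq]
    return kept[:max_words]


# Hypothetical input text, not taken from the project data.
sample_text = (
    "Elections matter. The election is close. "
    "Elections decide the election."
)
top = word_frequencies(sample_text, min_freq=2, max_words=2)
print(top)
```

With stemming on, 'elections' and 'election' are counted as one word; with `stem=False` they are counted separately. The resulting pairs correspond to the rows of a frequency table such as the one the application exports.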
Insights:
The size of each word reflects its frequency in the file. For example, in the output #USElection appears larger than #globalwarming because it occurs more often.
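One simple way to turn frequencies into sizes is a linear scale between a minimum and a maximum font size. The Python sketch below shows that idea only; the scaling actually used by the application's word cloud may differ, and the function name `font_sizes`, the size limits and the counts are hypothetical.

```python
def font_sizes(freqs, min_size=10.0, max_size=60.0):
    """Map each word's frequency linearly onto [min_size, max_size]."""
    lo, hi = min(freqs.values()), max(freqs.values())
    if hi == lo:
        # All words are equally frequent, so give them the same size.
        return {word: max_size for word in freqs}
    scale = (max_size - min_size) / (hi - lo)
    return {word: min_size + (n - lo) * scale for word, n in freqs.items()}


# Hypothetical counts, not taken from the project data.
sizes = font_sizes({"#USElection": 120, "#globalwarming": 50, "#vote": 85})
for word, size in sizes.items():
    print(f"{word}: {size:.1f}")
```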
Discussion
What has the audience learned from your work? What new insights or practices has your system enabled? A full blown user study is not expected, but informal observations of use that help evaluate your system are encouraged.
Future Work
- Font colours of Word Cloud could be of one single colour to minimise distractions.
- Intensity of the font colour could complement the font size, which represents the frequency of the word.


