ISSS608 2016 17T1 Group4 Report

From Visual Analytics and Applications

Motivation of the application

Social networks today hold a huge amount of data. More communication and expression of views happen within these networks than ever before, making them an important element of society. Twitter is one such prominent directed network, where information is posted as tweets: short but influential messages. When key events occur, it is often the first place to see the buzz, the trends, the status and the direction. For stakeholders, this information is key.

Tweets are voluminous and text-rich. Mining this information, defining the business use case and identifying the insights required are a challenge. The 2016 US elections held the world's attention during this period. This project analyses tweets for the top US election #tags. On the day of the 2016 US presidential election, Twitter proved to be the largest source of breaking news, with 40 million tweets.

Objective

Using visual analytics techniques and applications, we want to answer these questions:

  • What were the key ideas in the tweets, namely the #tags that were mentioned most?
  • To which users (@mentions in the Twitter network) did these #tags relate? In the given tweets, compare and analyse the association between the #tags and the @mentions.


Review and critique of past work

There are a number of free Twitter analytics and visualization tools available online. For social network analysis in particular, they provide options such as analyzing one's Twitter network, visualizing followers on a map, and showing statistics on user mentions and communication between users. However, an interactive custom application for understanding the association between #tags and @mentions is uncommon. It is more important to see the connections between #tags, which represent the elements within the subject of interest. During a key political event, for instance, influence on social networks can have a high impact and change directions. It would be of great value to identify the prominent users or influencers at whom key issues are directed. An interactive visual application with elements such as network graphs can best help reveal this.

Data

We downloaded 19K tweets from www.followthehashtag.com for popular US elections 2016 hashtags. Apart from the actual tweet content, the data had other attributes such as the associated #tag, @mentions, frequency of retweets, media mentions and location.

After some study, the key attributes for visual analysis were identified:

  • #tags
  • @mentions
  • tweet content
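As a rough illustration of how these attributes can be pulled out of raw tweet text and related to each other, the Python sketch below extracts #tags and @mentions with regular expressions and counts how often each #tag and @mention appear in the same tweet. The sample tweets, the user handles, the regular expressions and the helper name `extract` are all assumptions made for this example. They are not taken from the project's dataset or its R code.

```python
import re
from collections import Counter

# Hypothetical sample tweets; the real dataset is not reproduced here.
tweets = [
    "Go vote today! #USElection #Vote @candidate_a",
    "#USElection results are coming in @candidate_b @news_desk",
    "Climate matters #globalwarming #USElection @candidate_a",
]

HASHTAG = re.compile(r"#(\w+)")
MENTION = re.compile(r"@(\w+)")


def extract(tweet):
    """Return (hashtags, mentions) found in one tweet, lowercased."""
    tags = [t.lower() for t in HASHTAG.findall(tweet)]
    users = [u.lower() for u in MENTION.findall(tweet)]
    return tags, users


tag_counts = Counter()   # how often each #tag occurs overall
pair_counts = Counter()  # how often a #tag and an @mention share a tweet

for tweet in tweets:
    tags, users = extract(tweet)
    tag_counts.update(tags)
    # Link every distinct #tag in a tweet to every distinct @mention in it.
    pair_counts.update((tag, user) for tag in set(tags) for user in set(users))

print(tag_counts.most_common(3))
print(pair_counts.most_common(3))
```

The pair counts can serve as edge weights in a network graph that links #tags to @mentions, which is one possible way to view the association described in the objective.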

[Figure: Data prep.png]

Design framework

A detailed description of the design principles used and data visualisation elements built.


Demonstration / Use Case Scenario

We use the US Election tweets data as an example for our demonstration.

Word Cloud

How to use the application:

1. Upload a text file using the Text File upload widget in the sidebar. Note: The file must be a text file, and the maximum file size supported is 5MB.

2. Customise the image to be displayed using filters, e.g. maximum number of words, minimum frequency and rotation. In this case, we set Minimum Frequency: 50, Maximum Words: 100, Rotation: 0.35.

3. By default, "Document Stemming" and "Repeatable" are checked. To disable either option, simply uncheck its box.

  • "Document stemming" reduces words to their root form, e.g. 'elections' becomes 'election'. The frequencies of 'elections' and 'election' are added together for the word cloud generation.
  • "Repeatable" allows changes made to the word cloud via the filters to be added/removed from the initial plot generated by R.

4. Download the word cloud image and its frequency table by clicking on the "Download Image" and "Download Frequency Table" buttons respectively. The downloaded image is in PNG format and the table is in CSV format.
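The steps above can be approximated outside the application. The following Python sketch builds a frequency table from text, optionally merges word forms with a toy stemmer, applies the minimum-frequency and maximum-words filters, and writes the result as CSV. It is an illustration under stated assumptions and not the application's R implementation: `naive_stem` is a deliberately simple stand-in for a real stemming algorithm, and the function names, the tokenising pattern and the sample text are hypothetical.

```python
import csv
import io
import re
from collections import Counter

# Keep a leading # or @ attached to its token so hashtags survive tokenising.
TOKEN = re.compile(r"[#@]?\w+")


def naive_stem(word):
    """Toy stemmer: strip one trailing 's' from longer plain words.

    A stand-in for a real stemmer; it is not the Porter algorithm.
    """
    if word[0] in "#@":
        return word  # leave hashtags and mentions untouched
    if len(word) > 3 and word.endswith("s") and not word.endswith("ss"):
        return word[:-1]
    return word


def frequency_table(text, min_freq=1, max_words=100, stem=True):
    """Count tokens, optionally merge stems, then apply both filters."""
    tokens = [t.lower() for t in TOKEN.findall(text)]
    if stem:
        tokens = [naive_stem(t) for t in tokens]
    counts = Counter(tokens)
    kept = [(w, n) for w, n in counts.most_common() if n >= min_freq]
    return kept[:max_words]


def to_csv(rows):
    """Serialise (word, freq) rows to CSV text with a header line."""
    buf = io.StringIO()
    writer = csv.writer(buf)
    writer.writerow(["word", "freq"])
    writer.writerows(rows)
    return buf.getvalue()


# Hypothetical input text, used only to show the stemming merge.
sample = "Elections matter. The election is close. #USElection election day"
table = frequency_table(sample, min_freq=2, max_words=10)
print(to_csv(table))
```

With stemming enabled, 'elections' and 'election' are counted as one entry. Calling `frequency_table` with `stem=False` keeps them separate, which mirrors unchecking the stemming option.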


Insights:

The size of each word reflects its frequency in the file. For example, in the output, #USElection occurs more often than #globalwarming.
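To make the relationship between frequency and word size concrete, the Python sketch below maps counts linearly onto a range of font sizes. The linear scale, the size range and the counts are assumptions chosen for illustration. The application's actual scaling is not described in this report and may differ.

```python
def font_sizes(freqs, min_size=10.0, max_size=60.0):
    """Map each word's frequency linearly onto [min_size, max_size]."""
    lo, hi = min(freqs.values()), max(freqs.values())
    if hi == lo:
        # All words are equally frequent, so draw them at the same size.
        return {word: max_size for word in freqs}
    span = max_size - min_size
    return {
        word: min_size + (count - lo) / (hi - lo) * span
        for word, count in freqs.items()
    }


# Hypothetical counts, chosen only to illustrate the scaling.
counts = {"#uselection": 500, "#vote": 275, "#globalwarming": 50}
sizes = font_sizes(counts)
for word, size in sorted(sizes.items(), key=lambda item: -item[1]):
    print(f"{word}: {size:.1f}")
```

Under this scheme the most frequent word gets the largest size and the least frequent gets the smallest, so a more frequent #tag is always drawn larger than a rarer one.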

Discussion

What has the audience learned from your work? What new insights or practices has your system enabled? A full blown user study is not expected, but informal observations of use that help evaluate your system are encouraged.


Future Work

A description of how your system could be extended or refined.

References

Twitter Viz tools

Stack Overflow - tm custom removePunctuation except hashtag

TrigonaMinima - Word Cloud