Social Media & Public Opinion - Project Overview

From Analytics Practicum
Revision as of 17:18, 22 January 2015 by Kean.kwok.2011 (talk | contribs)


Project Background & Description


Unstructured data is challenging to analyse, and unstructured textual data in particular calls for a broader analytical toolkit. This project aims to quantify and study trends in the emotions expressed by Twitter users over time. The data set provided comprises tweets published by Singapore-based Twitter users over several months. We will produce a granular analysis of changes in mood trends, periods of significance (for example, weekends or particular weekdays), and other actionable insights that emerge from the analysis. The results are expected to be presented as a web-based visualization that summarizes the trend of the happiness level over time and allows inspection of the factors associated with the happiness level at a given point in time. It should also provide end users with drill-down features to choose from while interacting with the system.


Motivation & Project Scope


The motivation behind this project is to visually represent the data that the Living Analytics Research Centre (LARC) has collected from Twitter. The project aims to create a dashboard that lets users quickly view and understand what the data is telling them without delving into the data itself. The key scope of the project is a dashboard that represents the Twitter data distinctly. The Twitter data will consist of the information provided by Twitter together with additional predicted user attributes. The focus is therefore on a replicable and scalable dashboard that can accommodate the large amount of data collected by the LARC team.


High-Level Requirements


The system will include the following:

  • A timeline based on the tweets provided
  • The timeline will display the level of happiness as well as the volume of tweets
  • Each point on the timeline will provide additional information, such as the overall happiness score and the level of sentiment for each specific category
  • Linked graphical representations based on the timeline
  • Graphs to represent the aggregated user attributes (gender, age groups, etc.)
  • Comparison between two user-defined time periods
  • Optional toggling of the tweet volume on the sentiment timeline graph
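
The timeline and period-comparison requirements above can be illustrated with a minimal sketch. It assumes each tweet has already been reduced to a (date, happiness score) pair; the function names `daily_timeline` and `compare_periods` and the sample values are hypothetical and not part of the LARC data or any agreed design.

```python
from collections import defaultdict
from datetime import date


def daily_timeline(scored_tweets):
    """Aggregate (date, score) pairs into per-day mean happiness and tweet volume."""
    totals = defaultdict(lambda: [0.0, 0])  # day -> [sum of scores, tweet count]
    for day, score in scored_tweets:
        totals[day][0] += score
        totals[day][1] += 1
    return {
        day: {"happiness": total / count, "volume": count}
        for day, (total, count) in sorted(totals.items())
    }


def compare_periods(timeline, period_a, period_b):
    """Return the volume-weighted mean happiness of two inclusive (start, end) date ranges."""
    def period_mean(start, end):
        days = [v for d, v in timeline.items() if start <= d <= end]
        volume = sum(v["volume"] for v in days)
        if volume == 0:
            return None  # no tweets fall inside this period
        return sum(v["happiness"] * v["volume"] for v in days) / volume

    return period_mean(*period_a), period_mean(*period_b)


# Hypothetical sample data: (tweet date, happiness score)
sample = [
    (date(2014, 11, 1), 6.0),
    (date(2014, 11, 1), 5.0),
    (date(2014, 11, 2), 7.0),
    (date(2014, 11, 8), 4.0),
]
timeline = daily_timeline(sample)
weekend_1, weekend_2 = compare_periods(
    timeline,
    (date(2014, 11, 1), date(2014, 11, 2)),
    (date(2014, 11, 8), date(2014, 11, 8)),
)
```

Weighting each day by its tweet volume keeps a quiet day from counting as much as a busy one when two periods are compared; an unweighted mean of daily values would be an equally valid choice if each day should carry the same weight.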


Deliverables


  • Project Proposal
  • Mid-term presentation
  • Mid-term report
  • Final presentation
  • Final report
  • Project poster
  • A web-based platform hosted on OpenShift.


Limitations & Assumptions


Limitations | Assumptions
Insufficient predicted information on the users (location, age, etc.) | Data given by LARC is sufficiently accurate for the users
Fake Twitter users | LARC will determine whether the users are real
Ambiguity of the emotions | Emotions given by the dictionary (as instructed by LARC) are conclusive for the tweets provided
Dictionary words limited to the ones instructed by LARC | A comprehensive study has been done to come up with the dictionary
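
To make the dictionary-related limitations concrete, the sketch below shows one way a word dictionary could be applied to a tweet. It assumes a word-to-score list in the style of the hedonometer work credited in the acknowledgements; the actual LARC dictionary and scoring rule are not described here, so the words, the scores, and the name `score_tweet` are purely illustrative.

```python
import re

# Hypothetical word-to-happiness scores; not the LARC dictionary
HAPPINESS_SCORES = {"happy": 8.0, "love": 8.5, "sad": 2.5, "traffic": 3.5}


def score_tweet(text, scores=HAPPINESS_SCORES):
    """Return the mean score of the dictionary words found in a tweet, or None if none match."""
    words = re.findall(r"[a-z']+", text.lower())
    matched = [scores[w] for w in words if w in scores]
    if not matched:
        return None  # tweet contains no dictionary words, so it cannot be scored
    return sum(matched) / len(matched)
```

Under this sketch, a tweet made up only of words outside the dictionary gets no score at all, and a word is scored the same way regardless of context, which is how a limited dictionary and ambiguous emotions would show up in practice.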


ROI analysis


As part of LARC’s initiative to study the well-being of Singaporeans, this dashboard will be used as a springboard to visually represent Singaporeans in the Twitter space and to identify the general sentiments of Twitter users over a given time period. This will be part of the smart city initiative by the Singapore government to understand the well-being of Singaporeans. This project may be standalone or one of a series of projects done by LARC.


Future extension


  • Scale to larger data sets without degrading response time or performance
  • Accommodate real-time data to provide instantaneous analytics on the go


Acknowledgement & credit


  • Dodds PS, Harris KD, Kloumann IM, Bliss CA, Danforth CM (2011) Temporal Patterns of Happiness and Information in a Global Social Network: Hedonometrics and Twitter. PLoS ONE 6(12)
  • Companion website: http://hedonometer.org/
  • Schwartz HA, Eichstaedt J, Kern M, Dziurzynski L, Agrawal M, Park G, Lakshmikanth S, Jha S, Seligman M, Ungar L. (2013) Characterizing Geographic Variation in Well-Being Using Tweets. ICWSM, 2013
  • Helliwell J, Layard R, Sachs J (2013) World Happiness Report 2013. United Nations Sustainable Development Solutions Network.
  • Bollen J, Mao H, Zeng X (2010) Twitter mood predicts the stock market. Journal of Computational Science 2(1)
  • http://www.happyplanetindex.org/