Difference between revisions of "Social Media & Public Opinion - Project Overview"

From Analytics Practicum
Line 2: Line 2:
 
{|style="background-color:#0084b4; color:#F5F5F5; padding: 5 0 5 0;" width="100%" cellspacing="0" cellpadding="0" valign="top" border="0"  |
 
{|style="background-color:#0084b4; color:#F5F5F5; padding: 5 0 5 0;" width="100%" cellspacing="0" cellpadding="0" valign="top" border="0"  |
 
| style="padding:0.3em; font-family:Segoe UI; font-size:120%; background-color:#0084b4; border-bottom:2px solid #3f3f3f; text-align:center; color:#F5F5F5" width="8%" |  
 
| style="padding:0.3em; font-family:Segoe UI; font-size:120%; background-color:#0084b4; border-bottom:2px solid #3f3f3f; text-align:center; color:#F5F5F5" width="8%" |  
[[Image:SMPO-Home.png|25px|link=Social_Media_%26_Public_Opinion|Home]]  
+
[[Image:SMPO-Home.png|25px|link=Social Media & Public Opinion|Home]]  
[[Social_Media_%26_Public_Opinion|<font color="#F5F5F5" size=2><b>HOME</b></font>]]
+
[[Social Media & Public Opinion|<font color="#F5F5F5" size=2><b>HOME</b></font>]]
  
 
| style="border-bottom:2px solid #3f3f3f; background:none;" width="1%" | &nbsp;  
 
| style="border-bottom:2px solid #3f3f3f; background:none;" width="1%" | &nbsp;  
Line 34: Line 34:
 
<!--Content Start-->
 
<!--Content Start-->
  
Being able to identify what makes people happy is arguably one of the most important parts of socio-economic development. Increasingly, public-opinion polls and government agencies ask citizens questions related to happiness and well-being in their surveys.
+
<div align="left">
Following a recent trend in happiness studies, the goal of this project is to use social media to measure happiness as a less expensive alternative (in terms of time and resources) to traditional surveys. We will focus on quantifying happiness over time and characterizing the factors associated with the happiness level at a specific point in time. For example, are people happier on weekends and holidays than on weekdays? Do younger people tend to be less happy than older people?
+
<div style="background: #c0deed; padding: 15px; font-family:Segoe UI; font-size: 18px; font-weight: bold; line-height: 1em; text-indent: 15px; border-left: #0084b4 solid 32px;"><font color="black">Project Background & Description</font></div>
 +
<div style="border-left: #EAEAEA solid 12px; padding: 0px 30px 0px 18px; ">
  
  
 +
Unstructured data is challenging to analyze, and unstructured text in particular calls for a broader analytical toolkit. This project aims to quantify and study trends in the emotions expressed by Twitter users over time. The data set provided consists of tweets published by Singapore-based Twitter users over several months. We will produce a granular analysis of changes in mood trends, periods of significance (for example, weekends or particular weekdays) and other noteworthy, actionable insights. The results are expected to be presented as a web-based visualization that summarizes the trend in happiness level over time and allows inspection of the factors associated with the happiness level at a given point in time. It should also provide end users with drill-down features to choose from while interacting with the system.
 +
</div>
  
  
 +
<div align="left">
 +
<div style="background: #c0deed; padding: 15px; font-family:Segoe UI; font-size: 18px; font-weight: bold; line-height: 1em; text-indent: 15px; border-left: #0084b4 solid 32px;"><font color="black">Motivation & Project Scope</font></div>
 +
<div style="border-left: #EAEAEA solid 12px; padding: 0px 30px 0px 18px; ">
  
  
 +
The motivation behind this project is to visually represent the data that the Living Analytics Research Centre (LARC) has collected from Twitter. The project aims to create a dashboard that lets users quickly view and understand what the data is telling them without delving into the data itself. The key scope of the project is a dashboard that clearly represents the Twitter data. The Twitter data will consist of information provided by Twitter plus additional predicted user attributes. The focus is therefore on creating a replicable and scalable dashboard that can accommodate the large amount of data collected by the LARC team.
 +
</div>
  
  
 +
<div align="left">
 +
<div style="background: #c0deed; padding: 15px; font-family:Segoe UI; font-size: 18px; font-weight: bold; line-height: 1em; text-indent: 15px; border-left: #0084b4 solid 32px;"><font color="black">High-Level Requirements</font></div>
 +
<div style="border-left: #EAEAEA solid 12px; padding: 0px 30px 0px 18px; ">
  
  
 +
The system will include the following:
 +
* A timeline based on the tweets provided
 +
* The timeline will display the level of happiness as well as the volume of tweets.
 +
* Each point on the timeline will provide additional information, such as the overall happiness score and the sentiment level for each specific category.
 +
* Linked graphical representations based on the timeline
 +
* Graphs to represent the aggregated user attributes (gender, age groups etc.)
 +
* Comparison between two different user-defined time periods
 +
* Optional toggling of the tweet volume on the sentiment timeline graph
 +
</div>
  
  
 +
<div align="left">
 +
<div style="background: #c0deed; padding: 15px; font-family:Segoe UI; font-size: 18px; font-weight: bold; line-height: 1em; text-indent: 15px; border-left: #0084b4 solid 32px;"><font color="black">Deliverables</font></div>
 +
<div style="border-left: #EAEAEA solid 12px; padding: 0px 30px 0px 18px; ">
  
  
 +
* Project Proposal
 +
* Mid-term presentation
 +
* Mid-term report
 +
* Final presentation
 +
* Final report
 +
* Project poster
 +
* A web-based platform hosted on OpenShift.
 +
</div>
  
  
<font color="#606060" size=20><center> In development </center></font>
+
<div align="left">
 +
<div style="background: #c0deed; padding: 15px; font-family:Segoe UI; font-size: 18px; font-weight: bold; line-height: 1em; text-indent: 15px; border-left: #0084b4 solid 32px;"><font color="black">Limitations & Assumptions</font></div>
 +
<div style="border-left: #EAEAEA solid 12px; padding: 0px 30px 0px 18px; ">
 +
 
 +
 
 +
{| class="wikitable" style="margin-left: 10px;"
 +
|-! style="background: #0084b4; color: white; text-align: center;" colspan= "2"
 +
| width="50%" | '''Limitations'''
 +
| width="50%" | '''Assumptions'''
 +
|-
 +
| Insufficient predicted information on the users (location, age etc.)
 +
| Data given by LARC is sufficiently accurate for the user
 +
|-
 +
| Fake Twitter users
 +
| LARC will determine whether the users are real
 +
|-
 +
| Ambiguity of the emotions
 +
| The emotions given by the dictionary (as instructed by LARC) are conclusive for the tweets provided
 +
|-
 +
| Dictionary words limited to the ones instructed by LARC
 +
| A comprehensive study has been done to come up with the dictionary
 +
|}
 +
</div>
 +
 
 +
 
 +
<div align="left">
 +
<div style="background: #c0deed; padding: 15px; font-family:Segoe UI; font-size: 18px; font-weight: bold; line-height: 1em; text-indent: 15px; border-left: #0084b4 solid 32px;"><font color="black">ROI analysis</font></div>
 +
<div style="border-left: #EAEAEA solid 12px; padding: 0px 30px 0px 18px; ">
 +
 
 +
 
 +
As part of LARC’s initiative to study the well-being of Singaporeans, this dashboard will serve as a springboard for visually representing Singaporeans on Twitter and identifying the general sentiment of Twitter users over a given time period. This will be part of the Singapore government's smart city initiative to understand the well-being of Singaporeans. This project may be a standalone project or one of a series of projects done by LARC.
 +
</div>
 +
 
 +
 
 +
<div align="left">
 +
<div style="background: #c0deed; padding: 15px; font-family:Segoe UI; font-size: 18px; font-weight: bold; line-height: 1em; text-indent: 15px; border-left: #0084b4 solid 32px;"><font color="black">Future extension</font></div>
 +
<div style="border-left: #EAEAEA solid 12px; padding: 0px 30px 0px 18px; ">
 +
 
 +
 
 +
* Scale to larger data sets without compromising response time and performance
 +
* Accommodate real-time data to provide instantaneous analytics on the go
 +
</div>
 +
 
 +
 
 +
<div align="left">
 +
<div style="background: #c0deed; padding: 15px; font-size: 18px; font-weight: bold; line-height: 1em; text-indent: 15px; border-left: #0084b4 solid 32px;"><font color="black">Acknowledgement & credit</font></div>
 +
<div style="border-left: #EAEAEA solid 12px; padding: 0px 30px 0px 18px; ">
 +
 
 +
 
 +
* Dodds PS, Harris KD, Kloumann IM, Bliss CA, Danforth CM (2011) Temporal Patterns of Happiness and Information in a Global Social Network: Hedonometrics and Twitter. PLoS ONE 6(12)
 +
* Companion website: http://hedonometer.org/
 +
* Schwartz HA, Eichstaedt J, Kern M, Dziurzynski L, Agrawal M, Park G, Lakshmikanth S, Jha S, Seligman M, Ungar L. (2013) Characterizing Geographic Variation in Well-Being Using Tweets. ICWSM, 2013
 +
* Helliwell J, Layard R, Sachs J (2013) World Happiness Report 2013. United Nations Sustainable Development Solutions Network.
 +
* Bollen J, Mao H, Zeng X (2010) Twitter mood predicts the stock market. Journal of Computational Science 2(1)
 +
* http://www.happyplanetindex.org/
 +
 
 +
</div>

Revision as of 14:11, 22 January 2015


Project Background & Description


Unstructured data is challenging to analyze, and unstructured text in particular calls for a broader analytical toolkit. This project aims to quantify and study trends in the emotions expressed by Twitter users over time. The data set provided consists of tweets published by Singapore-based Twitter users over several months. We will produce a granular analysis of changes in mood trends, periods of significance (for example, weekends or particular weekdays) and other noteworthy, actionable insights. The results are expected to be presented as a web-based visualization that summarizes the trend in happiness level over time and allows inspection of the factors associated with the happiness level at a given point in time. It should also provide end users with drill-down features to choose from while interacting with the system.


Motivation & Project Scope


The motivation behind this project is to visually represent the data that the Living Analytics Research Centre (LARC) has collected from Twitter. The project aims to create a dashboard that lets users quickly view and understand what the data is telling them without delving into the data itself. The key scope of the project is a dashboard that clearly represents the Twitter data. The Twitter data will consist of information provided by Twitter plus additional predicted user attributes. The focus is therefore on creating a replicable and scalable dashboard that can accommodate the large amount of data collected by the LARC team.


High-Level Requirements


The system will include the following:

  • A timeline based on the tweets provided
  • The timeline will display the level of happiness as well as the volume of tweets.
  • Each point on the timeline will provide additional information, such as the overall happiness score and the sentiment level for each specific category.
  • Linked graphical representations based on the timeline
  • Graphs to represent the aggregated user attributes (gender, age groups etc.)
  • Comparison between two different user-defined time periods
  • Optional toggling of the tweet volume on the sentiment timeline graph
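
As a rough illustration of the timeline and period-comparison requirements above, the following Python sketch aggregates per-tweet happiness scores into a daily timeline (mean happiness plus tweet volume) and compares two user-defined periods. The input layout, the sample scores and the function names (build_timeline, period_mean, compare_periods) are assumptions made for this example only; they do not describe the LARC data format or the project's actual implementation.

```python
from collections import defaultdict
from datetime import date
from statistics import mean

# Hypothetical input: (tweet date, happiness score) pairs. The scores are invented.
SAMPLE_TWEETS = [
    (date(2015, 1, 16), 5.0),
    (date(2015, 1, 16), 7.0),
    (date(2015, 1, 17), 8.0),
    (date(2015, 1, 18), 6.0),
    (date(2015, 1, 18), 7.0),
    (date(2015, 1, 19), 4.0),
]


def build_timeline(tweets):
    """Return {day: (mean happiness, tweet volume)}, ordered by day."""
    by_day = defaultdict(list)
    for day, score in tweets:
        by_day[day].append(score)
    return {day: (mean(scores), len(scores)) for day, scores in sorted(by_day.items())}


def period_mean(tweets, start, end):
    """Mean happiness of tweets dated within [start, end]; None if there are none."""
    scores = [score for day, score in tweets if start <= day <= end]
    return mean(scores) if scores else None


def compare_periods(tweets, period_a, period_b):
    """Return (mean of A, mean of B, B minus A) for two (start, end) periods."""
    mean_a = period_mean(tweets, *period_a)
    mean_b = period_mean(tweets, *period_b)
    if mean_a is None or mean_b is None:
        return mean_a, mean_b, None
    return mean_a, mean_b, mean_b - mean_a


if __name__ == "__main__":
    for day, (happiness, volume) in build_timeline(SAMPLE_TWEETS).items():
        print(day.isoformat(), happiness, volume)
    friday = (date(2015, 1, 16), date(2015, 1, 16))
    weekend = (date(2015, 1, 17), date(2015, 1, 18))
    print(compare_periods(SAMPLE_TWEETS, friday, weekend))
```

Keeping the tweet volume next to the mean score for each day is what would let a dashboard toggle the volume series on and off without recomputing anything.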


Deliverables


  • Project Proposal
  • Mid-term presentation
  • Mid-term report
  • Final presentation
  • Final report
  • Project poster
  • A web-based platform hosted on OpenShift.


Limitations & Assumptions


Limitations | Assumptions
Insufficient predicted information on the users (location, age etc.) | Data given by LARC is sufficiently accurate for the user
Fake Twitter users | LARC will determine whether the users are real
Ambiguity of the emotions | The emotions given by the dictionary (as instructed by LARC) are conclusive for the tweets provided
Dictionary words limited to the ones instructed by LARC | A comprehensive study has been done to come up with the dictionary
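
To make the dictionary-related limitations concrete, here is a minimal Python sketch of dictionary-based scoring. It assumes a simple scheme in which a tweet's score is the average of the scores of its words that appear in the dictionary. The dictionary entries, their values and the function name score_tweet are hypothetical; the actual dictionary and scoring rules are the ones specified by LARC and are not reproduced here.

```python
import re

# Hypothetical word -> happiness score dictionary (values invented for illustration).
HAPPINESS_DICTIONARY = {"happy": 8.0, "love": 8.0, "sad": 2.0, "traffic": 3.0}


def score_tweet(text, dictionary=HAPPINESS_DICTIONARY):
    """Average the scores of the dictionary words found in the tweet.

    Words missing from the dictionary are ignored, so a tweet with no
    dictionary words gets no score at all (None).
    """
    words = re.findall(r"[a-z']+", text.lower())
    matched = [dictionary[word] for word in words if word in dictionary]
    if not matched:
        return None
    return sum(matched) / len(matched)


if __name__ == "__main__":
    print(score_tweet("So happy, love this weekend!"))
    print(score_tweet("Sad about the traffic"))
    print(score_tweet("lol ok"))
```

Under this assumed scheme, a tweet made up entirely of out-of-dictionary words cannot be scored, which is one way the limited dictionary could affect the results.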


ROI analysis


As part of LARC’s initiative to study the well-being of Singaporeans, this dashboard will serve as a springboard for visually representing Singaporeans on Twitter and identifying the general sentiment of Twitter users over a given time period. This will be part of the Singapore government's smart city initiative to understand the well-being of Singaporeans. This project may be a standalone project or one of a series of projects done by LARC.


Future extension


  • Scale to larger data sets without compromising response time and performance
  • Accommodate real-time data to provide instantaneous analytics on the go


Acknowledgement & credit


  • Dodds PS, Harris KD, Kloumann IM, Bliss CA, Danforth CM (2011) Temporal Patterns of Happiness and Information in a Global Social Network: Hedonometrics and Twitter. PLoS ONE 6(12)
  • Companion website: http://hedonometer.org/
  • Schwartz HA, Eichstaedt J, Kern M, Dziurzynski L, Agrawal M, Park G, Lakshmikanth S, Jha S, Seligman M, Ungar L. (2013) Characterizing Geographic Variation in Well-Being Using Tweets. ICWSM, 2013
  • Helliwell J, Layard R, Sachs J (2013) World Happiness Report 2013. United Nations Sustainable Development Solutions Network.
  • Bollen J, Mao H, Zeng X (2010) Twitter mood predicts the stock market. Journal of Computational Science 2(1)
  • http://www.happyplanetindex.org/