Difference between revisions of "Group04 Report"

From ISSS608-Visual Analytics and Applications

Revision as of 17:40, 26 April 2020


CHINA HAPPINESS SURVEY



Abstract

Happiness has become an important topic around the world, and especially in China, where continued development has brought growing attention to people’s well-being. Happiness is influenced by many factors, such as health, education and income, and it can also vary between regions. Our objective is to identify the factors that matter most for happiness in China. To do so, we use several types of graphs, including Likert scale charts, bubble plots and choropleth maps, to analyze the results of a 2015 survey on the happiness of Chinese people, broken down by province.

Project Motivation

Our objective is to examine how different factors influence happiness and to identify the most important ones. To do this, we apply several techniques, namely exploratory data analysis, multivariate matrix analysis, Likert scale charts, bubble plots, choropleth mapping and cluster analysis, to the 2015 happiness survey results for China, and we present the results by province. We treat the happiness score reported by the respondents themselves as the main target and select several factors that we expect to have a large influence on happiness in order to assess their importance. When choosing the survey questions that represent each factor, we give priority to subjective questions, because they directly reflect how people feel.

R Packages Used

  • For Interactive Application: R Shiny and Shiny Dashboard

Shiny is an R package from RStudio for building interactive charts, data visualizations and web applications in the R programming language. It lets a developer build an interactive application in which users can explore data or understand a model. In our case, it lets us visualize the patterns underlying the dataset and show clearly how the factors relate to one another. Packages used: ‘shiny’ and ‘shinydashboard’.

Data Cleaning and Preparation

Our data come from the Chinese General Social Survey 2015 (annual survey) conducted by Renmin University of China, which covers many aspects of people’s lives. The survey contains a very large number of questions, so we keep only the ones we need for our analysis. In addition to basic fields such as the respondent ID, gender, age and happiness score, we focus on nine happiness factors: health, depression, equity, class, peer status, income ability, relaxing, socializing and learning. All of them are measured by subjective questions that reflect people’s own assessment of their situation, and all are answered on ordinal scales. Note that for peer status and income ability, a higher score indicates a worse situation. In addition, Xinjiang, Xizang (Tibet) and Hainan have no data, so we exclude them from the graphs.

{| class="wikitable"
! Data field !! Original questionnaire number !! Question !! Remarks
|-
| id || - || ID || -
|-
| province || s41 || Survey location: province / autonomous region / municipality || 1 = Shanghai; 2 = Yunnan Province; 3 = Inner Mongolia Autonomous Region; 4 = Beijing; 5 = Jilin Province; 6 = Sichuan Province; 7 = Tianjin; 8 = Ningxia Hui Autonomous Region; 9 = Anhui Province; 10 = Shandong Province; 11 = Shanxi Province; 12 = Guangdong Province; 13 = Guangxi Zhuang Autonomous Region; 14 = Xinjiang Uygur Autonomous Region; 15 = Jiangsu Province; 16 = Jiangxi Province; 17 = Hebei Province; 18 = Henan Province; 19 = Zhejiang Province; 20 = Hainan Province; 21 = Hubei Province; 22 = Hunan Province; 23 = Gansu Province; 24 = Fujian Province; 25 = Tibet Autonomous Region; 26 = Guizhou Province; 27 = Liaoning Province; 28 = Chongqing City; 29 = Shaanxi Province; 30 = Qinghai Province; 31 = Heilongjiang Province
|-
| gender || a2 || Gender || 1 = Male; 2 = Female
|-
| birth || a301 || Birthday || -
|-
| health || a15 || How do you rate your current physical health? || 1 = Unhealthy; 2 = Less healthy; 3 = Fair; 4 = Healthy; 5 = Very healthy
|-
| depression || a17 || How often have you felt depressed in the past four weeks? || 1 = Always; 2 = Often; 3 = Sometimes; 4 = Rarely; 5 = Never
|-
| socialize || a311 || In the past year, how often did you do the following in your free time: socializing || 1 = Never; 2 = Rarely; 3 = Sometimes; 4 = Often; 5 = Always
|-
| relax || a312 || In the past year, how often did you do the following in your free time: relaxing || 1 = Never; 2 = Rarely; 3 = Sometimes; 4 = Often; 5 = Always
|-
| learn || a313 || In the past year, how often did you do the following in your free time: studying || 1 = Never; 2 = Rarely; 3 = Sometimes; 4 = Often; 5 = Always
|-
| equity || a35 || Generally speaking, do you think society is fair today? || 1 = Completely unfair; 2 = More unfair; 3 = Neither fair nor unfair; 4 = More fair; 5 = Completely fair
|-
| happiness || a36 || Overall, do you think your life is happy? || 1 = Very unhappy; 2 = Relatively unhappy; 3 = Neither happy nor unhappy; 4 = Relatively happy; 5 = Very happy
|-
| class || a431 || At what level do you think you currently are? || 1 = 1 (bottom); 10 = 10 (top)
|-
| status_peer || b1 || What is your socioeconomic status compared to your peers? || 1 = Higher; 2 = About the same; 3 = Lower
|-
| inc_ability || b5 || Considering your ability and working conditions, is your current income reasonable? || 1 = Very reasonable; 2 = Reasonable; 3 = Unreasonable; 4 = Very unreasonable
|}
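The coding rules described above can be illustrated with a short sketch. It is written in Python purely for illustration (the application itself is built in R Shiny) and is not the code used for this report. The helper name `prepare`, the assumption that `birth` holds a year of birth, and the optional step of flipping `status_peer` and `inc_ability` so that higher always means better are all assumptions; the report's own correlations are computed on the raw coding.

```python
SURVEY_YEAR = 2015

# Scales on which a higher raw score means a worse situation,
# mapped to the highest value of each scale (from the table above).
REVERSED_SCALES = {"status_peer": 3, "inc_ability": 4}

# Province codes with no data: 14 = Xinjiang, 20 = Hainan, 25 = Tibet.
EXCLUDED_PROVINCES = {14, 20, 25}


def prepare(row):
    """Return a cleaned copy of one response, or None if it is excluded."""
    if row["province"] in EXCLUDED_PROVINCES:
        return None
    cleaned = dict(row)
    # Assumption: `birth` holds the respondent's year of birth.
    cleaned["age"] = SURVEY_YEAR - row["birth"]
    for field, scale_max in REVERSED_SCALES.items():
        # Optional step: flip the scale so that a higher score is better.
        cleaned[field] = scale_max + 1 - row[field]
    return cleaned


# Hypothetical respondent from Beijing (province code 4).
sample = {"id": 1, "province": 4, "birth": 1980, "status_peer": 1, "inc_ability": 4}
print(prepare(sample))
```

Flipping the two reversed scales is a design choice: it makes every factor point in the same direction, at the cost of no longer matching the questionnaire coding.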

Choice of Visualizations and Critiques

Application Design in Detail

1.Dashboard

As shown in the graph, our data contain 6927 responses, of which 3389 are from males and 3538 are from females. The most common happiness score is 4, and most respondents rate their happiness between 3 and 5, which is an encouraging result. Respondents’ ages range from 17 to 105, and most respondents are between 28 and 94 years old. The data therefore cover nearly all adult age groups and are broadly representative.

Dashboard 1.png
Dashboard 2.png
Dashboard 3.png
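As a quick consistency check on the counts quoted above, the gender split can be expressed as percentages. The snippet below is a Python illustration only and uses nothing beyond the three figures given in the text.

```python
# Respondent counts quoted in the dashboard description.
respondents = {"male": 3389, "female": 3538}

total = sum(respondents.values())
# Share of each gender, in percent, rounded to one decimal place.
shares = {gender: round(100 * n / total, 1) for gender, n in respondents.items()}

print(total, shares)
```

The two counts add up to the reported 6927 responses, and the sample is close to evenly split between males and females.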

2.Exploratory Data Analysis

As shown in the graph, across China the most common score for health, depression and equity is 4: most respondents consider themselves healthy, rarely feel depressed and think that society is more fair than unfair. For class, the most common answer is level 5. For socioeconomic status compared with peers, the most common score is 2, meaning about the same as their peers. For income ability, the most common score is also 2, meaning that respondents consider their current income reasonable. Overall, people in China report a good general quality of life.

EDA 1.png
EDA 2.png

3.Multivariate Matrix Analysis

As shown in the graph, happiness is only weakly correlated with every factor, because all of the scores are ordinal values on short scales. For this reason, we cannot judge from the correlations alone which factors are more important. However, we can see that happiness is positively related to equity, health, depression and class, and negatively related to peer status and income ability. Because a higher depression score means feeling depressed less often, and higher peer status and income ability scores mean a worse situation, these signs all match what we would expect in reality. We examine the relationship between happiness and the other factors further in the next part.

MMA 5.png
MMA 4.png
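Because every variable here is an ordinal score, a rank-based coefficient is one natural way to measure such relationships. The sketch below computes Spearman's rank correlation in plain Python. It is an illustration only: the report does not state which coefficient its correlation matrix uses, and the six responses in the example are made up.

```python
def average_ranks(values):
    """Rank values from 1 to n, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        # Extend the run while the next value ties with the current one.
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        mean_rank = (start + end) / 2 + 1
        for k in range(start, end + 1):
            ranks[order[k]] = mean_rank
        start = end + 1
    return ranks


def pearson(x, y):
    """Pearson correlation; assumes neither variable is constant."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / (var_x * var_y) ** 0.5


def spearman(x, y):
    """Spearman correlation: Pearson correlation of the average ranks."""
    return pearson(average_ranks(x), average_ranks(y))


# Made-up responses: happiness (1-5) and status_peer (1 = higher, 3 = lower).
happiness = [4, 5, 3, 4, 2, 4]
status_peer = [2, 1, 3, 2, 3, 2]
print(spearman(happiness, status_peer))
```

With these made-up values the coefficient is negative, which mirrors the direction reported above for peer status: a higher (worse) peer status score goes with lower happiness.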

4.Likert & Bubble Plot

  1. As shown in the graph, for learning the most common answer is 1 (never), while for relaxing and socializing most answers fall between 2 and 5, meaning that many people relax or socialize rarely, sometimes or often. Compared with learning and socializing, people seem to have more time for relaxing. In addition, the more people learn, relax and socialize, the happier they are. This trend is most pronounced for relaxing.
  2. As shown in the graph, we use a bubble plot to examine the correlation between happiness and the other factors, based on the average score of each factor in each region. We also add three new factors: relaxing, socializing and learning.
LB 1.png
LB 4.png
LB 3.png
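The two views above rest on two simple aggregations: the share of respondents choosing each answer (for the Likert chart) and the average score of each factor per region (for the bubble plot). A minimal Python sketch of both follows. It is an illustration, not the report's R code; the function names and the sample rows are hypothetical.

```python
from collections import Counter, defaultdict


def likert_shares(scores, levels=range(1, 6)):
    """Share of responses at each level of a 1-5 scale."""
    counts = Counter(scores)
    return {level: counts[level] / len(scores) for level in levels}


def mean_by_province(rows, factors):
    """Average each factor over the respondents of every province."""
    sums = defaultdict(lambda: defaultdict(float))
    counts = defaultdict(int)
    for row in rows:
        counts[row["province"]] += 1
        for factor in factors:
            sums[row["province"]][factor] += row[factor]
    return {
        province: {f: sums[province][f] / counts[province] for f in factors}
        for province in counts
    }


# Made-up responses; province codes 4 (Beijing) and 17 (Hebei).
rows = [
    {"province": 4, "happiness": 4, "relax": 3},
    {"province": 4, "happiness": 5, "relax": 4},
    {"province": 17, "happiness": 4, "relax": 2},
]
print(likert_shares([r["relax"] for r in rows]))
print(mean_by_province(rows, ["happiness", "relax"]))
```

Each province then contributes one bubble, positioned by its average factor score and its average happiness score.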

5.Choropleth Mapping

As shown in the graph, people in northern China report higher happiness levels than people in southern China. The average happiness level across China is above 3.6, which is high, so overall happiness in China is good. Hebei has the highest happiness level in China, followed by Nei Mongol (Inner Mongolia), Jilin, Beijing and Shandong.

Map 1.png
Map 2.png

6.Cluster Analysis

As shown in the graph, we choose three clusters. Based on the results, we interpret the first cluster as developed regions, the second as developing regions and the third as less developed regions. In the developed regions, most factors score well, whereas in the developing regions many factors score poorly. As the maps above also show, the regions in cluster 1 have higher happiness levels than the regions in clusters 2 and 3, and the regions in cluster 2 have the lowest happiness levels. It is reasonable that a more developed region should have a higher happiness level.

Heat 1.png
Heat 2.png
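The report does not state which clustering algorithm produces the three clusters. As an illustration only, the Python sketch below assumes agglomerative hierarchical clustering with complete linkage on each region's average factor scores, stopped when three clusters remain. The function names and the six two-factor points are made up.

```python
from math import dist


def complete_linkage(a, b, points):
    """Distance between two clusters: the largest pairwise distance."""
    return max(dist(points[i], points[j]) for i in a for j in b)


def agglomerate(points, n_clusters):
    """Merge the two closest clusters until n_clusters remain."""
    clusters = [[i] for i in range(len(points))]
    while len(clusters) > n_clusters:
        best = None
        for x in range(len(clusters)):
            for y in range(x + 1, len(clusters)):
                d = complete_linkage(clusters[x], clusters[y], points)
                if best is None or d < best[0]:
                    best = (d, x, y)
        _, x, y = best
        # Merge cluster y into cluster x and drop y.
        clusters[x] = clusters[x] + clusters[y]
        del clusters[y]
    return clusters


# Made-up regional averages for two factors, e.g. (health, equity).
region_means = [
    (4.1, 3.9), (4.0, 3.8),  # regions with high scores
    (3.6, 3.2), (3.5, 3.3),  # regions with middle scores
    (3.0, 2.5), (3.1, 2.6),  # regions with low scores
]
print(agglomerate(region_means, 3))
```

Each inner list holds the indices of the regions assigned to one cluster. The labels "developed", "developing" and "less developed" are an interpretation applied afterwards by comparing the factor averages of the clusters.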
