Difference between revisions of "G4 Report"

From ISSS608-Visual Analytics and Applications
 
(4 intermediate revisions by the same user not shown)
Line 28: Line 28:
  
 
==Project Motivation==
 
==Project Motivation==
 +
<b><big>Association Rule Mining is Powerful</big></b><br>
  
 +
<b><big>Room for Improvement of Current Packages</big></b><br>
  
 
==R Packages Used==
 
==R Packages Used==
 +
* For Interactive Application: R Shiny and Shiny Dashboard
 +
Shiny is an R package from RStudio for building interactive charts, data visualizations and web-hosted applications in the R programming language. It lets developers build interactive applications in which users can examine a model or explore data. In this case, we can visualize the rules underlying the given datasets, giving a clear picture of how the items relate to one another. [https://cran.r-project.org/web/packages/shiny/shiny.pdf Package ‘shiny’] [https://cran.r-project.org/web/packages/shinydashboard/shinydashboard.pdf Package ‘shinydashboard’]
  
 +
*For Interactive Plot: ggplot2, plotly and gghighlight [https://cran.r-project.org/web/packages/plotly/plotly.pdf Package ‘plotly’] [https://cran.r-project.org/web/packages/ggplot2/ggplot2.pdf Package ‘ggplot2’] [https://cran.r-project.org/web/packages/gghighlight/vignettes/gghighlight.html Package ‘gghighlight’]
 +
 +
*For Choropleth Mapping: tmap, sf and leaflet [https://cran.r-project.org/web/packages/tmap/vignettes/tmap-getstarted.html Package ‘tmap’] [https://rstudio.github.io/leaflet/ Package ‘leaflet’]
 +
 +
*For HeatMap: heatmaply [https://cran.r-project.org/web/packages/heatmaply/heatmaply.pdf Package ‘heatmaply’]
 +
 +
*For Likert Scale: likert [https://rdrr.io/cran/likert/man/shinyLikert.html Package ‘likert’]
 +
 +
*For Correlation Matrix: corrplot [https://cran.r-project.org/web/packages/corrplot/vignettes/corrplot-intro.html Package ‘corrplot’]
 +
 +
*For data preparation: sqldf (for SQL operations in R), dplyr, stringr [https://cran.r-project.org/web/packages/sqldf/sqldf.pdf Package ‘sqldf’] [https://cran.r-project.org/web/packages/dplyr/dplyr.pdf Package ‘dplyr’] [https://cran.r-project.org/web/packages/stringr/stringr.pdf Package ‘stringr’]
 +
 +
==Data Cleaning and Preparation==
 +
 +
<h4>Datasets</h4>
 +
*  happiness_train_abbr: lite version of survey dataset
 +
*  happiness_train_complete: full version of survey dataset
 +
*  happiness_index: index and description of survey dataset
 +
 +
<h4>Index and Description - happiness_train_abbr</h4>
 +
<div>
 +
 +
<table style="border: 1px solid #dddddd; font-size:10px;">
 +
<tr  style="border: 1px solid #dddddd;background-color: #eeeeee; padding: 2px; font-size:11px;"><th  width="12.5%"> Data field </th><th  width="7.5%"> Original questionnaire number </th><th  width="20%"> Question </th><th  width="60%"> Remarks </th></tr>
 +
<tr  style="border: 1px solid #dddddd;padding: 2px;"><td><b> id </b></td><td> - </td><td> ID </td><td> - </td></tr>
 +
<tr  style="border: 1px solid #dddddd;padding: 2px;"><td><b> province </b></td><td> s41 </td><td> Survey Location - Province/Autonomous Region/Municipality </td><td> 1 = Shanghai; 2 = Yunnan Province; 3 = Inner Mongolia Autonomous Region; 4 = Beijing; 5 = Jilin Province; 6 = Sichuan Province; 7 = Tianjin; 8 = Ningxia Hui Autonomous Region; 9 = Anhui Province; 10 = Shandong Province ; 11 = Shanxi Province; 12 = Guangdong Province; 13 = Guangxi Zhuang Autonomous Region; 14 = Xinjiang Uygur Autonomous Region; 15 = Jiangsu Province; 16 = Jiangxi Province; 17 = Hebei Province; 18 = Henan Province; 19 = Zhejiang Province; 20 = Hainan Province; 21 = Hubei Province; 22 = Hunan Province; 23 = Gansu Province; 24 = Fujian Province; 25 = Tibet Autonomous Region; 26 = Guizhou Province; 27 = Liaoning Province; 28 = Chongqing City; 29 = Shaanxi Province; 30 = Qinghai Province; 31 = Heilongjiang Province; </td></tr>
 +
<tr  style="border: 1px solid #dddddd;padding: 2px;"><td><b> gender </b></td><td> a2 </td><td> Gender </td><td> 1 = Male; 2 = Female </td></tr>
 +
<tr  style="border: 1px solid #dddddd;background-color: #eeeeee;padding: 2px;"><td><b> birth </b></td><td> a301 </td><td> Birthday </td><td> - </td></tr>
 +
<tr  style="border: 1px solid #dddddd;background-color: #eeeeee;padding: 2px;"><td><b> health </b></td><td> a15 </td><td> How do you rate your current physical health </td><td> 1 = Unhealthy; 2 = Less healthy; 3 = Fair; 4 = Healthy; 5 = Very healthy; </td></tr>
 +
<tr  style="border: 1px solid #dddddd;background-color: #eeeeee;padding: 2px;"><td><b> depression </b></td><td> a17 </td><td> How often did you feel depressed in the past four weeks </td><td> 1 = Always; 2 = Often; 3 = Sometimes; 4 = Rarely; 5 = Never; </td></tr>
 +
<tr  style="border: 1px solid #dddddd;padding: 2px;"><td><b> relax </b></td><td> a312 </td><td> In the past year, do you often do the following in your free time-relax </td><td> 1 = Never; 2 = Rarely; 3 = Sometimes; 4 = Often; 5 = Always; </td></tr>
 +
<tr  style="border: 1px solid #dddddd;background-color: #eeeeee;padding: 2px;"><td><b> learn </b></td><td> a313 </td><td> In the past year, do you often do the following in your free time-study </td><td> 1 = Never; 2 = Rarely; 3 = Sometimes; 4 = Often; 5 = Always; </td></tr>
 +
<tr  style="border: 1px solid #dddddd;padding: 2px;"><td><b> equity </b></td><td> a35 </td><td> Generally speaking, do you think society is unfair today? </td><td> 1 = Completely unfair; 2 = More unfair; 3 = Not fair but not unfair; 4 = More fair; 5 = Completely fair; </td></tr>
 +
<tr  style="border: 1px solid #dddddd;background-color: #eeeeee;padding: 2px;"><td><b> happiness </b></td><td> a36 </td><td> Overall, do you think your life is happy </td><td> 1 = Very unhappy; 2 = Relatively unhappy; 3 = Not happy or unhappy; 4 = Relatively happy; 5 = Very happy; </td></tr>
 +
<tr  style="border: 1px solid #dddddd;padding: 2px;"><td><b> class </b></td><td> a431 </td><td> At what level do you think you are currently </td><td> 1 = 1(bottom); 10 = 10(top); </td></tr>
 +
<tr  style="border: 1px solid #dddddd;padding: 2px;"><td><b> status_peer </b></td><td> b1 </td><td> What is your socioeconomic status compared to your peers </td><td> 1 = Higher; 2 = About the same; 3 = Lower; </td></tr>
 +
<tr  style="border: 1px solid #dddddd;background-color: #eeeeee;padding: 2px;"><td><b> inc_ability </b></td><td> b5 </td><td> Considering your ability and working conditions, is your current income reasonable? </td><td> 1 = Very reasonable; 2 = Reasonable; 3 = Unreasonable; 4 = Very unreasonable; </td></tr>
 +
</table>
 +
 +
</div>
  
 
==Choice of Visualizations and Critics==
 
==Choice of Visualizations and Critics==
Line 116: Line 160:
  
 
==References==
 
==References==
*[http://datastorm-open.github.io/visNetwork/  visNetwork, an R package for interactive network visualization]
+
*[https://worldhappiness.report World Happiness Report 2020]
*[http://www.kdnuggets.com/2016/04/association-rules-apriori-algorithm-tutorial.html  Association Rules and the Apriori Algorithm: A Tutorial,Annalyn Ng, Ministry of Defence of Singapore]
 
*[http://www2.rdatamining.com/uploads/5/7/1/3/57136767/rdatamining-slides-association-rules.pdf  Association Rule Mining with R,Yanchang Zhao]
 
*[https://rpubs.com/sbushmanov/180410  Market basket analysis,S. Bushmanov]
 
*[https://www.interworks.com/blog/modonnell/2016/01/15/market-basket-analysis-using-r-and-shiny Market Basket Analysis Using R and Shiny, Maureen O'Donnell]
 
 
*[https://shiny.rstudio.com/articles/layout-guide.html  Shiny Application layout guide,JJ Allaire]
 
*[https://shiny.rstudio.com/articles/layout-guide.html  Shiny Application layout guide,JJ Allaire]
*[https://gallery.shinyapps.io/105-plot-interaction-zoom/  Zoomable plots in Shiny, RStudio, Inc.]
 
*[https://cran.r-project.org/web/packages/visNetwork/vignettes/Introduction-to-visNetwork.html Introduction to visNetwork,B. Thieurmel - DataStorm]
 
*[https://gist.github.com/timelyportfolio/762aa11bb5def57dc27f  Interactive arules with arulesViz and visNetwork,timelyportfolio]
 
*[http://kateto.net/network-visualization  Network visualization with R Workshop,Katya Ognyanova]
 
*[https://shiny.rstudio.com/articles/selecting-rows-of-data.html Selecting rows of data R Shiny]
 
*[http://mhahsler.github.io/arules/reference/write.html Write Transactions or Associations to a File]
 
 
*[https://shiny.rstudio.com/reference/shiny/latest/observeEvent.html Event handler R Shiny]
 
*[https://shiny.rstudio.com/reference/shiny/latest/observeEvent.html Event handler R Shiny]
*[https://shiny.rstudio.com/articles/action-buttons.html Using Action Buttons R Shiny]
+
*[https://www.freecodecamp.org/news/build-your-first-web-app-dashboard-using-shiny-and-r-ec433c9f3f6c/ Build your first web app dashboard using Shiny and R]
*[https://stackoverflow.com/questions/7531868/how-to-rename-a-single-column-in-a-data-frame Create an object for storing reactive values R Shiny]
+
*[https://db.rstudio.com/best-practices/dashboards/ Enterprise-ready dashboards]
 +
*[http://rstudio-pubs-static.s3.amazonaws.com/5312_98fc1aba2d5740dd849a5ab797cc2c8d.html  ggplot2 Reference and Examples (Part 2) - Colours]
 +
*[https://rpubs.com/tskam/likert Hands-On Exercise 8: Diverging Stacked Bar Chart, Dr. Kam Tin Seong]
 +
*[https://rpubs.com/tskam/heatmaps Hands-on Exercise 8: Heatmap Visualisation with R, Dr. Kam Tin Seong]
 +
*[https://rpubs.com/tskam/Corrgram Hands-On Exercise 8: Visualising Correlation Matrix, Dr. Kam Tin Seong]
 +
*[https://rpubs.com/tskam/Choropleth_Mapping Hands-on Exercise 10: Choropleth Mapping with R, Dr. Kam Tin Seong]
 +
*[https://rpubs.com/tskam/Proportional_Symbol_Map Hands-on Exercise 10: Mapping Geospatial Point Data with R, Dr. Kam Tin Seong]

Latest revision as of 16:50, 25 April 2020

CHINA HAPPINESS SURVEY

 



Project Motivation

Association Rule Mining is Powerful

Room for Improvement of Current Packages

R Packages Used

  • For Interactive Application: R Shiny and Shiny Dashboard

Shiny is an R package from RStudio for building interactive charts, data visualizations and web-hosted applications in the R programming language. It lets developers build interactive applications in which users can examine a model or explore data. In this case, we can visualize the rules underlying the given datasets, giving a clear picture of how the items relate to one another. Package ‘shiny’, Package ‘shinydashboard’
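As a rough illustration of how Shiny and Shiny Dashboard fit together, the sketch below wires one sidebar input to one plot. It is not the project's application: the `toy` data frame, its values, and the `gender` and `hist` ids are made up for this example.

```r
library(shiny)
library(shinydashboard)

# Hypothetical toy data standing in for the survey; the values are made up.
toy <- data.frame(
  gender    = factor(c("Male", "Female", "Female", "Male", "Female", "Male")),
  happiness = c(4, 5, 3, 2, 4, 1)
)

# Page layout: header, sidebar with one input, body with one plot.
ui <- dashboardPage(
  dashboardHeader(title = "Happiness (sketch)"),
  dashboardSidebar(
    selectInput("gender", "Gender", choices = levels(toy$gender))
  ),
  dashboardBody(
    box(title = "Happiness scores", plotOutput("hist"))
  )
)

server <- function(input, output, session) {
  # Re-filter the rows whenever the selected gender changes.
  selected <- reactive(toy[toy$gender == input$gender, ])

  # Redraw the bar chart from the filtered rows.
  output$hist <- renderPlot({
    barplot(table(factor(selected()$happiness, levels = 1:5)),
            xlab = "Happiness (1 = very unhappy, 5 = very happy)",
            ylab = "Respondents")
  })
}

# Launch the app only when run from an interactive R session.
if (interactive()) shinyApp(ui, server)
```

The `reactive()` expression is what makes the plot respond to the input: Shiny re-runs it, and the plot that depends on it, each time the selection changes.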

Data Cleaning and Preparation

Datasets

  • happiness_train_abbr: lite version of survey dataset
  • happiness_train_complete: full version of survey dataset
  • happiness_index: index and description of survey dataset

Index and Description - happiness_train_abbr

Data field | Original questionnaire number | Question | Remarks
id | - | ID | -
province | s41 | Survey Location - Province/Autonomous Region/Municipality | 1 = Shanghai; 2 = Yunnan Province; 3 = Inner Mongolia Autonomous Region; 4 = Beijing; 5 = Jilin Province; 6 = Sichuan Province; 7 = Tianjin; 8 = Ningxia Hui Autonomous Region; 9 = Anhui Province; 10 = Shandong Province; 11 = Shanxi Province; 12 = Guangdong Province; 13 = Guangxi Zhuang Autonomous Region; 14 = Xinjiang Uygur Autonomous Region; 15 = Jiangsu Province; 16 = Jiangxi Province; 17 = Hebei Province; 18 = Henan Province; 19 = Zhejiang Province; 20 = Hainan Province; 21 = Hubei Province; 22 = Hunan Province; 23 = Gansu Province; 24 = Fujian Province; 25 = Tibet Autonomous Region; 26 = Guizhou Province; 27 = Liaoning Province; 28 = Chongqing City; 29 = Shaanxi Province; 30 = Qinghai Province; 31 = Heilongjiang Province;
gender | a2 | Gender | 1 = Male; 2 = Female
birth | a301 | Birthday | -
health | a15 | How do you rate your current physical health | 1 = Unhealthy; 2 = Less healthy; 3 = Fair; 4 = Healthy; 5 = Very healthy;
depression | a17 | How often did you feel depressed in the past four weeks | 1 = Always; 2 = Often; 3 = Sometimes; 4 = Rarely; 5 = Never;
relax | a312 | In the past year, do you often do the following in your free time-relax | 1 = Never; 2 = Rarely; 3 = Sometimes; 4 = Often; 5 = Always;
learn | a313 | In the past year, do you often do the following in your free time-study | 1 = Never; 2 = Rarely; 3 = Sometimes; 4 = Often; 5 = Always;
equity | a35 | Generally speaking, do you think society is unfair today? | 1 = Completely unfair; 2 = More unfair; 3 = Not fair but not unfair; 4 = More fair; 5 = Completely fair;
happiness | a36 | Overall, do you think your life is happy | 1 = Very unhappy; 2 = Relatively unhappy; 3 = Not happy or unhappy; 4 = Relatively happy; 5 = Very happy;
class | a431 | At what level do you think you are currently | 1 = 1(bottom); 10 = 10(top);
status_peer | b1 | What is your socioeconomic status compared to your peers | 1 = Higher; 2 = About the same; 3 = Lower;
inc_ability | b5 | Considering your ability and working conditions, is your current income reasonable? | 1 = Very reasonable; 2 = Reasonable; 3 = Unreasonable; 4 = Very unreasonable;
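The numeric codes in the table above can be decoded into readable labels before plotting. The base R sketch below shows one way to do this for the `gender` and `happiness` fields. The `survey` data frame and its six rows are hypothetical toy values, not records from happiness_train_abbr, and the `_label` column names are chosen only for this illustration.

```r
# Hypothetical toy rows; the real survey data is not reproduced here.
survey <- data.frame(
  id        = 1:6,
  gender    = c(1, 2, 2, 1, 2, 1),
  happiness = c(4, 5, 3, 2, 4, 1)
)

# Code-to-label mappings copied from the Remarks column of the table.
gender_labels <- c("Male", "Female")
happiness_labels <- c("Very unhappy", "Relatively unhappy",
                      "Not happy or unhappy", "Relatively happy",
                      "Very happy")

# factor() maps each numeric code to its label; ordered = TRUE keeps the
# 1 to 5 ranking of the happiness scale available for comparisons.
survey$gender_label <- factor(survey$gender, levels = 1:2,
                              labels = gender_labels)
survey$happiness_label <- factor(survey$happiness, levels = 1:5,
                                 labels = happiness_labels, ordered = TRUE)

# Cross-tabulate the decoded columns.
counts <- table(survey$gender_label, survey$happiness_label)
print(counts)
```

Keeping the happiness scale as an ordered factor means that charts and tables list the levels in questionnaire order rather than alphabetically.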

Choice of Visualizations and Critiques

Application Design in Detail

Use Cases

1.Dashboard

Dashboard 1.png
Dashboard 2.png
Dashboard 3.png

2.Exploratory Data Analysis

EDA 1.png
EDA 2.png

3.Multivariate Matrix Analysis

MMA 1.png
MMA 3.png

4.Likert & Bubble Plot

LB 1.png
LB 2.png
LB 3.png

5.Choropleth Mapping

Map 1.png
Map 2.png

6.Cluster Analysis

Heat 1.png
Heat 2.png

References