Difference between revisions of "Group 3 Overview"

From Visual Analytics and Applications
 
(11 intermediate revisions by the same user not shown)
Line 1: Line 1:
 
{|style="background-color:#000066; color:#000066 padding: 5px 0 0 0;" width="100%" cellspacing="0" cellpadding="0" valign="top" border="0"  |
 
{|style="background-color:#000066; color:#000066 padding: 5px 0 0 0;" width="100%" cellspacing="0" cellpadding="0" valign="top" border="0"  |
| style="padding:0.4em; font-size:100%; background-color:#000066;  border-bottom:4px solid #000066; border-top:4px solid #000066; text-align:center; color:#000066" width="10%" |[[Group3_Proposal| <font color="#ffffff"><b>PROPOSAL</b></font>]]  
+
| style="padding:0.4em; font-size:100%; background-color:#000066;  border-bottom:4px solid #000066; border-top:4px solid #000066; text-align:center; color:#000066" width="10%" |[[Group_3_Proposal| <font color="#ffffff"><b>PROPOSAL</b></font>]]  
  
 
| style="border-bottom:4px solid #000066; border-top:4px solid #000066; background:none;" width="1%" | &nbsp;  
 
| style="border-bottom:4px solid #000066; border-top:4px solid #000066; background:none;" width="1%" | &nbsp;  
| style="padding:0.4em; font-size:100%; background-color:#000066;  border-bottom:4px solid #000066; border-top:4px solid #000066; text-align:center; color:#000066" width="10%" |[[Group3__Poster|<font color="#ffffff" size=2><b>POSTER</b></font>]]
+
| style="padding:0.4em; font-size:100%; background-color:#000066;  border-bottom:4px solid #000066; border-top:4px solid #000066; text-align:center; color:#000066" width="10%" |[[Group_3_Poster|<font color="#ffffff" size=2><b>POSTER</b></font>]]
  
 
| style="border-bottom:4px solid #000066; border-top:4px solid #000066; background:none;" width="1%" | &nbsp;  
 
| style="border-bottom:4px solid #000066; border-top:4px solid #000066; background:none;" width="1%" | &nbsp;  
| style="padding:0.4em; font-size:100%; background-color:#000066;  border-bottom:4px solid #000066; border-top:4px solid #000066; text-align:center; color:#000066" width="10%" |[[Group3__Application|<font color="#ffffff" size=2><b>APPLICATION</b></font>]]
+
| style="padding:0.4em; font-size:100%; background-color:#000066;  border-bottom:4px solid #000066; border-top:4px solid #000066; text-align:center; color:#000066" width="10%" |[[Group_3_Application|<font color="#ffffff" size=2><b>APPLICATION</b></font>]]
  
 
| style="border-bottom:4px solid #000066; border-top:4px solid #000066; background:none;" width="1%" | &nbsp;  
 
| style="border-bottom:4px solid #000066; border-top:4px solid #000066; background:none;" width="1%" | &nbsp;  
| style="padding:0.4em; font-size:90%; background-color:#000066;  border-bottom:4px solid #000066; border-top:4px solid #000066; text-align:center; color:#000066" width="10%" |[[Group3__Report|<font color="#ffffff" size=2><b>REPORT</b></font>]]
+
| style="padding:0.4em; font-size:90%; background-color:#000066;  border-bottom:4px solid #000066; border-top:4px solid #000066; text-align:center; color:#000066" width="10%" |[[Group_3_Report|<font color="#ffffff" size=2><b>REPORT</b></font>]]
 
|}
 
|}
 
<br>
 
<br>
 
=Abstraction=
 
=Abstraction=
  
Geospatial analysis was developed for problems in the environmental and life sciences, which has currently extended to almost all industries including defence, utilities, social sciences, and public safety.  The application of geo-visualization using geographically weighted regression (GWR) is an exploratory technique mainly intended to indicate where non-stationarity is taking place on the map. It is a good exploratory analytical tool which creates a set of location based parameter estimates, able to be mapped and analysed to give spatial information for the relationship of explanatory variables and response variable.
+
Geospatial analysis was developed for problems in the environmental and life sciences, which has currently extended to almost all industries including economy, defence, utilities, social sciences and public safety.  The application of geo-visualization using geographically weighted regression (GWR) is an exploratory technique mainly intended to indicate where non-stationarity is taking place on the map. It is a good exploratory analytical tool which creates a set of location based parameter estimates, able to be mapped and analysed to give spatial information for the relationship of explanatory variables and response variable.
Our study uses environmental data to explore air pollution impact in northern region of China. The project scope covers the analysis, model and visual representation of multivariate factors like wasted air emission, investment of air treatment, GDP in secondary industry, area of gardens and green, No. of motor vehicles, which leads to the air pollution in each city area of the province or municipality with the assistance of interactive charts and graphs.
+
<br>
 +
<br>
 +
Our study uses economy data to explore district GDP condition in east region of China. The project scope covers the analysis, model and visual representation of multivariate factors like GDP,Industry Output, Usual Residence,Average Wage,Area,City Construction Rate,No. of higher institution, and ratio of Teacher/Student which contributes to economical development in each city area of the province or municipality with the assistance of interactive charts and graphs.
 +
<br>
 +
<br>
 +
Conventional linear regression models are commonly used to analyse environmental problems to see the influences and basic relationship of factors which contributes to the economy. We feel that a geographical regression model can also be built to make spatial statistical analysis for the issues related to economy. GWR is a technique by allowing local instead of global parameters to be estimated. The model fits where a localized adjustment conveys a more meaningful message with involvement of spatial areas. It uses a moving window weighting mechanism for localised models detected at target location. Results are mapped to an interactive exploratory geo-application with the consideration of the nature of data spatial heterogeneity.
  
 
<br>
 
<br>
 +
 
=Motivation=
 
=Motivation=
  
Conventional linear regression models are commonly used to analyse environmental problems to see the influences and basic relationship of factors which contributes to the pollution.  We feel that a geographical regression model can also be built to make spatial statistical analysis for the issues related to environment.  GWR is a technique by allowing local instead of global parameters to be estimated. The model fits where a localized adjustment conveys a more meaningful message with involvement of spatial areas. It uses a moving window weighting mechanism for localised models detected at target location. Results are mapped to an interactive exploratory geo-application with the consideration of the nature of data spatial heterogeneity.  
+
Conventional linear regression models are commonly used to analyse economical problems to see the influences and basic relationship of factors which contributes to the economy.  We feel that a geographical regression model can also be built to make spatial statistical analysis for the issues related to economy.  GWR is a technique by allowing local instead of global parameters to be estimated. The model fits where a localized adjustment conveys a more meaningful message with involvement of spatial areas. It uses a moving window weighting mechanism for localised models detected at target location. Results are mapped to an interactive exploratory geo-application with the consideration of the nature of data spatial heterogeneity.  
 
<br>
 
<br>
 
<br>
 
<br>
Line 33: Line 39:
  
 
The following data sources will be used for the project: <br>
 
The following data sources will be used for the project: <br>
1. Air pollution data from National Bureau of Statistics of China between 2011 and 2015.<br>
+
1. Economy data from National Bureau of Statistics of China in 2015.<br>  
2. Air pollution data from online air quality monitoring platform between 2011 and 2015.
 
  
 
'''Variable'''         ''',Full Name'''                                ''',Unit'''<br>
 
'''Variable'''         ''',Full Name'''                                ''',Unit'''<br>
 
City                 ,City Name                                 ,-<br>
 
City                 ,City Name                                 ,-<br>
 
Province_Municipality ,Province or Municipality Name                 ,-<br>
 
Province_Municipality ,Province or Municipality Name                 ,-<br>
AQI                 ,Air Quality Index                         ,-<br>
+
GDP_billion_b            ,GDP Value                                 ,RMB (billion)<br>
IPT                 ,Investment of Pollution Treatment         ,RMB (million)<br>
+
Primary_Industry_b      ,Primary Industry Output Value                 ,RMB (billion)<br>
IWAE                 ,Industial Waste Air Emission(Sulphur Dioxide) ,Ton thousand<br>
+
Secondary_Industry_b    ,Secondary Industry Output Value         ,RMB (billion)<br>
RISWU                 ,Ratio of Industrial Solid Waste Utilized ,%<br>
+
Teriary_Industry_b      ,Teriary Industry Output Value                 ,RMB (billion)<br>
TRLW                 ,Treatment Rate of Living Waste         ,%<br>
+
Usual_Residence_k        ,No. of usual residence                 ,thousand<br>
GDPSI                 ,GDP of Secondary Industry                 ,RMB (million)<br>
+
Average_wage_RMB        ,Average wage                                 ,RMB (digit)<br>
NFTWA                 ,No of Facility for Treatment of Waste Air ,Set<br>
+
Area_sqkm                ,Total Area Size                         ,sqkm<br>
GIO                 ,Gross Industrial Output Value                 ,RMB (million)<br>
+
City_Construction_Rate  ,Rate of city construction                 ,%<br>
AGG                 ,Area of Garden & Green                 ,ha<br>
+
No_Higher_Institution    ,No. of higher institution                 ,digit<br>
NMV                 ,No of Motor Vehicle                         ,Unit thousand<br>
+
Teacher/Student          ,Ratio of teacher vs student                 ,%<br>
  
 
<br>
 
<br>
Line 54: Line 59:
 
=Approach=
 
=Approach=
  
The main framework used for the application is RShiny and Leaflets. There are some packages in R to build a smooth interactive interface for extracting, exploring, and modelling the data. We will implement GWR model and parallel coordinate graph for spatial analysis, which enables us to study the impact and influence of various factors contributes to the air pollution situation. The interactive graphs and charts will also been created to allow the user to apply different scenarios and make deep exploration.  
+
The main framework used for the application is RShiny and Leaflets. There are some packages in R to build a smooth interactive interface for extracting, exploring, and modelling the data. We implemented GWR model and parallel coordinate graph for spatial analysis, which enables us to study the impact and influence of various factors contributes to the local economical growth situation. The interactive graphs and charts has also been created to allow the user to apply different scenarios and make deep exploration.  
 
<br>
 
<br>
 
<br>
 
<br>
 
=Selection of Tools=
 
=Selection of Tools=
Since this course aims at pursuing the good command of R language, for the display part we will choose R to implement our visualization. To be more precise, we will use R Markdown and the other relevant package like gwr, ggplot2,ggraph, and JavaScript library like d3.js, leftlet.js. At the data preparation step, we will take advantage of JMP Pro and Tableau as well.
+
Since this course aims at pursuing the good command of R language, for the display part we will choose R to implement our visualization. To be more precise, we will use R Markdown and the other relevant package like spgwr, pryr, gwmodel, tidyverse, rgdal, RColorBrewer, lubridate, ggplot2,ggraph, and JavaScript library like d3.js, leftlet.js. At the data preparation step, we will take advantage of JMP Pro and Tableau as well.  
 
<br>
 
<br>
 
<br>
 
<br>

Latest revision as of 12:21, 3 December 2017



Abstract

Geospatial analysis was originally developed for problems in the environmental and life sciences and has since spread to almost all fields, including the economy, defence, utilities, social sciences and public safety. Geo-visualization based on geographically weighted regression (GWR) is an exploratory technique mainly intended to indicate where non-stationarity occurs on the map. It produces a set of location-based parameter estimates that can be mapped and analysed to show how the relationship between the explanatory variables and the response variable varies across space.

Our study uses economic data to explore district-level GDP in the eastern region of China. The project scope covers the analysis, modelling and visual representation of multivariate factors such as GDP, industry output, usual residence, average wage, area, city construction rate, number of higher institutions and teacher/student ratio, which contribute to economic development in each city of a province or municipality. The analysis is supported by interactive charts and graphs.



Motivation

Conventional linear regression models are commonly used to analyse economic problems and to identify the factors that contribute to the economy and how they relate to one another. We believe a geographically weighted regression model can also be built to carry out spatial statistical analysis of economic issues. GWR allows local rather than global parameters to be estimated. It is suited to cases where a localised adjustment conveys a more meaningful message because the relationship varies across spatial areas. It uses a moving-window weighting mechanism in which a localised model is fitted at each target location. The results are mapped in an interactive exploratory geo-application that takes the spatial heterogeneity of the data into account.
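
The moving-window idea can be illustrated with a minimal sketch. It is written in plain Python purely for illustration and is not the project's implementation, which relies on R packages such as spgwr. The coordinates, values, bandwidth and the function names gaussian_weight, local_fit and gwr are all hypothetical, and the Gaussian kernel is one common choice assumed here rather than something the text specifies. The synthetic data follow y = x in a western cluster and y = 3x in an eastern cluster, so a single global regression reports a slope of 2 while the local fits recover the two regimes.

```python
import math


def gaussian_weight(distance, bandwidth):
    """Gaussian kernel (assumed): weight decays smoothly with distance."""
    return math.exp(-0.5 * (distance / bandwidth) ** 2)


def local_fit(target, points, x, y, bandwidth):
    """Weighted least squares for y = a + b*x, centred on `target`.

    Observations near `target` get large weights, distant ones get
    small weights. Returns the local (intercept, slope) pair.
    """
    w = [gaussian_weight(math.dist(target, p), bandwidth) for p in points]
    total = sum(w)
    mean_x = sum(wi * xi for wi, xi in zip(w, x)) / total
    mean_y = sum(wi * yi for wi, yi in zip(w, y)) / total
    sxy = sum(wi * (xi - mean_x) * (yi - mean_y)
              for wi, xi, yi in zip(w, x, y))
    sxx = sum(wi * (xi - mean_x) ** 2 for wi, xi in zip(w, x))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


def gwr(points, x, y, bandwidth):
    """Fit one local model per observation location."""
    return [local_fit(p, points, x, y, bandwidth) for p in points]


# Synthetic example (not project data): two clusters of three locations.
points = [(0, 0), (1, 0), (2, 0), (10, 0), (11, 0), (12, 0)]
x = [1, 2, 3, 1, 2, 3]
y = [1, 2, 3, 3, 6, 9]  # west: y = x, east: y = 3x

local = gwr(points, x, y, bandwidth=1.0)
for location, (intercept, slope) in zip(points, local):
    # Slopes come out close to 1 in the west and close to 3 in the east.
    print(location, round(intercept, 3), round(slope, 3))
```

A very large bandwidth gives every observation nearly equal weight, so the local fits collapse towards the global regression. Choosing the bandwidth is therefore the main tuning decision in this sketch.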

Objective

Our objective is to provide an interactive, exploratory geo-visual analytics tool that lets regional and urban planners and policy makers visualise, analyse and model location-based data.

Data Source and Preparation

The following data source is used for the project:
1. Economic data for 2015 from the National Bureau of Statistics of China.

Variable, Full Name, Unit
City, City Name, -
Province_Municipality, Province or Municipality Name, -
GDP_billion_b, GDP Value, RMB (billion)
Primary_Industry_b, Primary Industry Output Value, RMB (billion)
Secondary_Industry_b, Secondary Industry Output Value, RMB (billion)
Teriary_Industry_b, Tertiary Industry Output Value, RMB (billion)
Usual_Residence_k, No. of Usual Residents, thousand
Average_wage_RMB, Average Wage, RMB
Area_sqkm, Total Area, sq km
City_Construction_Rate, Rate of City Construction, %
No_Higher_Institution, No. of Higher Institutions, count
Teacher/Student, Ratio of Teachers to Students, %


Approach

The application is built mainly on R Shiny and Leaflet, together with R packages that provide a smooth interactive interface for extracting, exploring and modelling the data. We implemented a GWR model and a parallel coordinates plot for spatial analysis, which let us study the impact of the various factors that contribute to local economic growth. Interactive graphs and charts have also been created so that users can apply different scenarios and explore the data in depth.

Selection of Tools

Since this course aims to build a good command of the R language, we chose R to implement our visualisation. More precisely, we use R Markdown and relevant packages such as spgwr, pryr, GWmodel, tidyverse, rgdal, RColorBrewer, lubridate, ggplot2 and ggraph, along with JavaScript libraries such as d3.js and leaflet.js. For data preparation, we also use JMP Pro and Tableau.

Challenge

1. Preparing the dataset to meet the minimum requirements of the model.
2. Making R graphs and maps interact with Shiny.
3. Building an effective and accurate GWR model in R.
4. Finding the most suitable tools and libraries to implement the visual features.

