Group23 Proposal

From Visual Analytics and Applications
Revision as of 01:13, 18 June 2018 by Yunxia.liao.2017

MAKE THE WORLD A BETTER PLACE TO “BREATHE”



Background

Every year, the World Bank publishes an updated set of World Development Indicators collected from multiple sources. This will be the core data source for our analysis. The dataset contains 1,591 indicators for 263 countries. The indicators cover environment, economic policy & debt, infrastructure, financial sector, public sector, private sector & trade, social protection & labour, education, health, gender and poverty.

In our case, some of the indicators can serve as proxies for air pollution, such as CO2 emissions, PM2.5 air pollution and nitrous oxide emissions. We will not exclude any of these indicators unless it is shown to be statistically insignificant.

Data Preparation:
The raw data is stacked by country, with the observations for each indicator spread across the row. For time series analysis, we need to transform it into long format, where each row carries only three fields: country, indicator and value.
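The reshaping step could be sketched as below. This is a minimal illustration rather than the project's actual code: the field names and sample values are hypothetical, and the sketch assumes the raw file has one column per year. It also keeps a year field alongside country, indicator and value, which is our own assumption, since a time series needs a time key.

```python
def wide_to_long(rows, id_fields=("country", "indicator")):
    """Turn one row per (country, indicator) with year columns
    into one row per (country, indicator, year)."""
    long_rows = []
    for row in rows:
        for key, value in row.items():
            if key in id_fields or value is None:
                continue  # skip identifier fields and missing observations
            long_rows.append({
                "country": row["country"],
                "indicator": row["indicator"],
                "year": int(key),       # assumed: year columns are named "2015", "2016", ...
                "value": float(value),
            })
    return long_rows


# Hypothetical values, for illustration only
wide = [
    {"country": "AAA", "indicator": "CO2 emissions", "2015": 1.0, "2016": 1.5},
    {"country": "BBB", "indicator": "CO2 emissions", "2015": 2.0, "2016": None},
]
long_rows = wide_to_long(wide)
```

Dropping missing values at this stage keeps the long table compact. Whether to drop or impute them is a separate decision that the proposal leaves open.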

Methodologies and Techniques

Descriptive Statistics

The data will be grouped by region and by time period. Descriptive statistics will help us differentiate the regions within each period and see whether a region was going through industrialization, war or even a civilizational revolution.
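A minimal sketch of the grouping described above, using only the Python standard library. The region names, the ten-year period width and the sample values are assumptions made for illustration.

```python
from collections import defaultdict
from statistics import mean, median, pstdev


def period_of(year, width=10):
    """Label the period a year falls in, e.g. 1995 -> '1990-1999'."""
    start = year - year % width
    return f"{start}-{start + width - 1}"


def describe_by_group(records, width=10):
    """Summarise values per (region, period) pair."""
    groups = defaultdict(list)
    for rec in records:
        groups[(rec["region"], period_of(rec["year"], width))].append(rec["value"])
    return {
        key: {
            "n": len(values),
            "mean": mean(values),
            "median": median(values),
            "std": pstdev(values),  # population standard deviation
            "min": min(values),
            "max": max(values),
        }
        for key, values in groups.items()
    }


# Hypothetical values, for illustration only
records = [
    {"region": "Region A", "year": 1995, "value": 10.0},
    {"region": "Region A", "year": 1998, "value": 14.0},
    {"region": "Region A", "year": 2003, "value": 20.0},
    {"region": "Region B", "year": 1996, "value": 30.0},
]
summary = describe_by_group(records)
```

Comparing the summaries of the same region across consecutive periods is one simple way to spot the shifts mentioned above.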


Variable Selection & Clustering

Our dataset currently contains 1,591 indicators. Intuitively, some of them are highly correlated or even near duplicates of each other. Variable selection is therefore necessary, or else the whole analysis may be biased towards a few heavily weighted variables.

To eliminate highly correlated variables, a correlation matrix and stepwise regression may come in handy. Variable clustering will also help us reduce the dimensionality of the variables and let us measure related attributes together.
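As one possible illustration of the correlation-based filtering, the sketch below greedily keeps a variable only if its absolute Pearson correlation with every variable already kept stays below a threshold. This is a simple filter and not stepwise regression or variable clustering. The 0.9 threshold, the indicator names and the values are all assumptions.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length sequences.
    Assumes neither sequence is constant (otherwise it divides by zero)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


def drop_correlated(columns, threshold=0.9):
    """Keep a column only if it is not highly correlated with any kept column."""
    kept = []
    for name, values in columns.items():
        if all(abs(pearson(values, columns[k])) < threshold for k in kept):
            kept.append(name)
    return kept


# Hypothetical indicators and values, for illustration only
columns = {
    "co2": [1.0, 2.0, 3.0, 4.0, 5.0],
    "co2_per_capita": [2.1, 4.0, 6.2, 8.1, 9.9],  # nearly a multiple of co2
    "forest_area": [5.0, 1.0, 4.0, 2.0, 3.0],
}
kept = drop_correlated(columns)
```

The result depends on the order in which the columns are visited, so in practice the indicators of most interest would be listed first.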


Time series & Panel Analysis

To monitor how sensitive air quality is to each indicator over time, panel analysis may be applied to the corresponding changes. With the panel analysis results, we will be able to quantify how much each selected factor affects air quality.
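The proposal does not fix a particular panel model. As one hedged example, the sketch below computes a fixed-effects ("within") estimate for a single regressor: each country's own means are subtracted from its observations before the slope is fitted, so constant country-level differences drop out. The country codes and values are hypothetical.

```python
from collections import defaultdict


def within_slope(panel):
    """Fixed-effects slope of y on x.
    panel: iterable of (country, x, y) observations."""
    by_country = defaultdict(list)
    for country, x, y in panel:
        by_country[country].append((x, y))

    sxy = sxx = 0.0
    for obs in by_country.values():
        mx = sum(x for x, _ in obs) / len(obs)
        my = sum(y for _, y in obs) / len(obs)
        for x, y in obs:
            # demean within each country, then accumulate the OLS sums
            sxy += (x - mx) * (y - my)
            sxx += (x - mx) ** 2
    return sxy / sxx


# Hypothetical panel: both countries share a slope of 2 but differ in level
panel = [
    ("AAA", 1.0, 12.0), ("AAA", 2.0, 14.0), ("AAA", 3.0, 16.0),
    ("BBB", 1.0, 52.0), ("BBB", 2.0, 54.0), ("BBB", 3.0, 56.0),
]
slope = within_slope(panel)
```

Here the estimate recovers the common slope even though the two countries sit at very different levels, which is the property that makes this kind of model attractive for cross-country data.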


Geographical Visualization

Geographical visualization will give us a clearer picture of how air pollution is distributed around the world and enable us to detect details or patterns of migration.