Difference between revisions of "Group08 Proposal"

From Visual Analytics and Applications
Line 1: Line 1:
<div style="background:#E4EBF0; padding-left:15px; text-align:center;">
+
<!----------- Main Header ------------>
<font size = 5><br/></font>
+
<div style="background:#E4EBF0; padding:24px; text-align:center;">  
<font size = 8; color="#4180AB"><span style="font-family:Segoe UI Light;">Understanding gender equality from a visual perspective<br/></span></font>
+
<font size = 8; color="#4180AB"><span style="font-family:Segoe UI Light;">Understanding gender equality from a visual perspective</span></font>
<font size = 5><br/></font>
 
 
</div>
 
</div>
  
<!--MAIN HEADER -->  
+
 
{|style="background-color:#ffffff;" width="100%" cellspacing="0" cellpadding="0" valign="top" border="0"  |  
+
<!------------ Navigation bar ------------>  
 +
{|style="background-color:#ffffff;" width="100%" |  
 
| style="font-family:Segoe UI Light; font-size:100%; text-align:center;border-bottom:solid #E4EBF0" width="16.6%" |   
 
| style="font-family:Segoe UI Light; font-size:100%; text-align:center;border-bottom:solid #E4EBF0" width="16.6%" |   
 
[[Group08 Overview | <font size = 4; color="#BDD1DE">Overview</font>]]
 
[[Group08 Overview | <font size = 4; color="#BDD1DE">Overview</font>]]
Line 25: Line 25:
 
[[Project Groups| <font size = 4; color="#BDD1DE">Back to Main</font>]]
 
[[Project Groups| <font size = 4; color="#BDD1DE">Back to Main</font>]]
  
|  &nbsp;
 
 
|}
 
|}
 +
<!------------ End of navi bar ------------>
 +
 +
 +
 +
<!------------ Background ------------>
 +
<div style="font-family:Segoe UI Light; font-size:100%; padding: 0px 0px 0px 15px;">
 +
<font size = 5; color="#4180AB">Background</font>
 +
<div style="font-family:Segoe UI;">
 +
<font size = 2; color ="#1B325F">
 +
This page is work in progress <br>
 +
lorem ipsum placeholder text for now
 +
</font></div></div>
 +
 +
 +
<!------------ Data Sources ------------>
 +
<div style="font-family:Segoe UI Light; font-size:100%; padding: 40px 0px 0px 15px;">
 +
<font size = 5; color="#4180AB">Data Sources</font>
 +
<div style="font-family:Segoe UI;">
 +
<font size = 2; color ="#1B325F">
 +
The World Bank is a global partnership of 189 countries that seeks to "reduce poverty and build shared prosperity in developing countries". Besides regularly collecting information and publishing reports in support of its work, it hosts its data on [https://data.worldbank.org/ data.worldbank.org] to provide free and open access to development data across the globe.
 +
 +
Because a significant part of our topic uses economic indicators as proxies for women's empowerment, the World Bank database serves as our primary source of data.
 +
</font></div></div>
 +
 +
 +
<!------------ Methodology ------------>
 +
<div style="font-family:Segoe UI Light; font-size:100%; padding: 40px 0px 0px 15px;">
 +
<font size = 5; color="#4180AB">Approach</font>
 +
<div style="font-family:Segoe UI;"><font size = 2; color ="#1B325F">
 +
We will explore the following approaches to analysing and visualising the data:
 +
 +
 +
<div style="font-family:Segoe UI Semibold;"><font size = 2; color="#4180AB">Exploration</font></div>
 +
* Illustrate how each independent variable changes over time, by country
 +
* Illustrate changing patterns in indicators of women's empowerment for individual countries or groups of countries
 +
* Detect any exceptional variations or trends for further analysis
 +
 +
 +
<div style="font-family:Segoe UI Semibold;"><font size = 2; color="#4180AB">Multi-variate linear regression</font></div>
 +
With some reorganisation, the data can be structured along three dimensions: country (or region), time (year), and indicator, for use in multi-variate linear regression.
 +
<blockquote><p>Y<sub>i</sub> = β<sub>0</sub> + β<sub>1</sub>X<sub>1i</sub> + β<sub>2</sub>X<sub>2i</sub> + ... + β<sub>n</sub>X<sub>ni</sub></p></blockquote>
 +
 +
* Y represents the dependent variable, an indicator of women's empowerment, while the X terms denote the independent variables that influence Y. We have identified the female labor force participation rate (or a calculated non-agricultural female employment rate) as one such potential indicator
 +
* The model will be used to evaluate the candidate X variables and produce a list of high-importance features. Statistical significance and adjusted R-squared values will be the key criteria for excluding irrelevant variables
 +
* Set assumptions: the classical linear regression assumptions, a significance level, and estimation by ordinary least squares (OLS); choose the tests that will guide model refinement: goodness of fit, F-test, t-test, multicollinearity checks, etc.
 +
* With the model, we will compare countries and regions against one another and identify potential differentiating factors such as rate of development, geography, and environment
 +
 +
 +
<div style="font-family:Segoe UI Semibold;"><font size = 2; color="#4180AB">Time series forecasting</font></div>
 +
* Capture the important features (those that significantly affect Y) for each year, for every country in each group
 +
* Possibly show how the influencing factors change over time for each country
 +
* Set customized tolerance level (low, medium, high)
 +
* Forecast Y for the next one or two years for each selected country from the input variables (important features), using ARIMA (AutoRegressive Integrated Moving Average) or XGBoost
 +
* Model performance evaluation
 +
</font></div></div>
 +
 +
 +
<!------------ Interface Design ------------>
 +
<div style="font-family:Segoe UI Light; font-size:100%; padding: 40px 0px 0px 15px;">
 +
<font size = 5; color="#4180AB">Design Approach</font>
 +
<div style="font-family:Segoe UI;"><font size = 2; color ="#1B325F">
 +
The design principles are outlined under each area:
 +
 +
 +
<div style="font-family:Segoe UI Semibold;"><font size = 2; color="#4180AB">Data visualisation</font></div>
 +
* Automatic animation showing how the influencing factors change year by year
 +
* An effective visualisation of how the target changes over time and compares across countries
 +
* An effective visualisation of variable descriptions, country information, outliers and so on
 +
 +
 +
<div style="font-family:Segoe UI Semibold;"><font size = 2; color="#4180AB">Application Interface</font></div>
 +
* Allow users to select their tolerance level when forecasting near-future measures of women's empowerment
 +
* Allow users to filter by time and country for the indicators they want to explore, with embedded controls that prevent selecting too many parameters at once
 +
* Allow users to customise the variables fed into the model, so that they can directly compare the target results from their own inputs with those from the important-feature inputs
 +
* Creative use of emoji to convey textual information
 +
* User-friendly design with engaging visual impact
 +
</font></div></div>
  
<font size = 4; text-align:center;>---This page is work in progress---</font>
+
 
 +
<!------------ References ------------>
 +
<div style="font-family:Segoe UI Light; font-size:100%; padding: 40px 0px 0px 15px;">
 +
<font size = 5; color="#4180AB">References</font>
 +
</div>

Revision as of 00:21, 12 June 2018

Understanding gender equality from a visual perspective



Background

This page is work in progress
lorem ipsum placeholder text for now


Data Sources

The World Bank is a global partnership of 189 countries that seeks to "reduce poverty and build shared prosperity in developing countries". Besides regularly collecting information and publishing reports in support of its work, it hosts its data on data.worldbank.org to provide free and open access to development data across the globe.

Because a significant part of our topic uses economic indicators as proxies for women's empowerment, the World Bank database serves as our primary source of data.


Approach

We will explore the following approaches to analysing and visualising the data:


Exploration
  • Illustrate how each independent variable changes over time, by country
  • Illustrate changing patterns in indicators of women's empowerment for individual countries or groups of countries
  • Detect any exceptional variations or trends for further analysis
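
As a minimal sketch of the last point, the Python snippet below flags year-on-year changes that sit unusually far from a country's typical change. The function name, the (country, year, value) record layout, the two-standard-deviation threshold and the sample figures are all illustrative assumptions, not part of our dataset or final method.

```python
from statistics import mean, stdev


def flag_exceptional_changes(records, threshold=2.0):
    """Return (country, year, change) for unusual year-on-year changes.

    records: iterable of (country, year, value) tuples (assumed layout).
    A change is flagged when it lies more than `threshold` sample standard
    deviations from that country's mean year-on-year change.
    """
    by_country = {}
    for country, year, value in records:
        by_country.setdefault(country, []).append((year, value))

    flagged = []
    for country, points in by_country.items():
        points.sort()  # order observations by year
        changes = [(y2, v2 - v1) for (_, v1), (y2, v2) in zip(points, points[1:])]
        if len(changes) < 3:
            continue  # too few observations to judge what is "typical"
        values = [change for _, change in changes]
        mu, sigma = mean(values), stdev(values)
        if sigma == 0:
            continue  # perfectly steady series, nothing to flag
        flagged += [(country, year, change) for year, change in changes
                    if abs(change - mu) > threshold * sigma]
    return flagged


# Made-up participation rates for a hypothetical country, with one jump in 2005.
sample = [("Country A", 2000 + i, v) for i, v in enumerate(
    [50.0, 50.5, 51.0, 51.5, 52.0, 57.0, 57.5, 58.0, 58.5, 59.0])]
print(flag_exceptional_changes(sample))  # [('Country A', 2005, 5.0)]
```

A per-country baseline is used here so that a fast-changing country is not flagged merely for differing from slower ones; a pooled baseline would be an equally reasonable choice.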


Multi-variate linear regression

With some reorganisation, the data can be structured along three dimensions: country (or region), time (year), and indicator, for use in multi-variate linear regression.

Y_i = β_0 + β_1 X_{1i} + β_2 X_{2i} + ... + β_n X_{ni}

  • Y represents the dependent variable, an indicator of women's empowerment, while the X terms denote the independent variables that influence Y. We have identified the female labor force participation rate (or a calculated non-agricultural female employment rate) as one such potential indicator
  • The model will be used to evaluate the candidate X variables and produce a list of high-importance features. Statistical significance and adjusted R-squared values will be the key criteria for excluding irrelevant variables
  • Set assumptions: the classical linear regression assumptions, a significance level, and estimation by ordinary least squares (OLS); choose the tests that will guide model refinement: goodness of fit, F-test, t-test, multicollinearity checks, etc.
  • With the model, we will compare countries and regions against one another and identify potential differentiating factors such as rate of development, geography, and environment
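
The regression step above could be sketched roughly as follows, assuming NumPy is available. The helper name fit_ols and the synthetic data are illustrative assumptions only; the numbers are generated, not drawn from the World Bank data. The sketch covers the least-squares fit and adjusted R-squared; the significance tests (F-test, t-test) would need a statistics package and are not shown.

```python
import numpy as np


def fit_ols(X, y):
    """Fit y = b0 + b1*x1 + ... + bn*xn by ordinary least squares.

    Returns (coefficients with intercept first, R-squared, adjusted R-squared).
    """
    n_obs, n_vars = X.shape
    design = np.column_stack([np.ones(n_obs), X])  # prepend intercept column
    beta, *_ = np.linalg.lstsq(design, y, rcond=None)
    residuals = y - design @ beta
    ss_res = float(residuals @ residuals)
    ss_tot = float(((y - y.mean()) ** 2).sum())
    r2 = 1.0 - ss_res / ss_tot
    # Adjusted R-squared penalises each extra explanatory variable.
    adj_r2 = 1.0 - (1.0 - r2) * (n_obs - 1) / (n_obs - n_vars - 1)
    return beta, r2, adj_r2


# Synthetic illustration: three candidate variables, the third has no effect on y.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 3))
y = 2.0 + 1.5 * X[:, 0] - 0.8 * X[:, 1] + rng.normal(scale=0.1, size=200)

beta, r2, adj_r2 = fit_ols(X, y)
_, _, adj_r2_reduced = fit_ols(X[:, :2], y)  # refit without the third variable
print("coefficients:", np.round(beta, 2))
print("adjusted R-squared, full vs reduced:", adj_r2, adj_r2_reduced)
```

Comparing adjusted R-squared between the full and reduced fits is one simple way to judge whether a variable earns its place in the model; it complements rather than replaces the significance tests listed above.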


Time series forecasting
  • Capture the important features (those that significantly affect Y) for each year, for every country in each group
  • Possibly show how the influencing factors change over time for each country
  • Set customized tolerance level (low, medium, high)
  • Forecast Y for the next one or two years for each selected country from the input variables (important features), using ARIMA (AutoRegressive Integrated Moving Average) or XGBoost
  • Model performance evaluation
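
To make the forecasting step more concrete, here is a deliberately simplified sketch, assuming NumPy is available. It hand-rolls an ARIMA(1,1,0) model, that is, a first-order autoregression on the once-differenced series, fitted by least squares. It is a stand-in for illustration only: it has no moving-average term, does no order selection, and ignores the input variables, so it is not the full ARIMA or XGBoost approach named above. The function name and sample series are assumptions.

```python
import numpy as np


def forecast_arima_110(series, steps=2):
    """Forecast `steps` values ahead with a hand-rolled ARIMA(1,1,0).

    The series is differenced once (d=1), an AR(1) model with a constant is
    fitted to the differences by least squares, and the predicted differences
    are accumulated back onto the last observed level.
    """
    y = np.asarray(series, dtype=float)
    if y.size < 3:
        raise ValueError("need at least three observations")
    diffs = np.diff(y)
    prev, nxt = diffs[:-1], diffs[1:]
    design = np.column_stack([np.ones(prev.size), prev])
    (const, phi), *_ = np.linalg.lstsq(design, nxt, rcond=None)

    forecasts = []
    level, last_diff = y[-1], diffs[-1]
    for _ in range(steps):
        last_diff = const + phi * last_diff  # predicted next difference
        level = level + last_diff            # undo the differencing
        forecasts.append(float(level))
    return forecasts


# Made-up annual values of Y for one hypothetical country.
history = [48.2, 48.9, 49.3, 50.1, 50.4, 51.2, 51.9, 52.3, 53.0, 53.4]
print(forecast_arima_110(history, steps=2))
```

For model performance evaluation, the same function could be run on a series with its last one or two years held out, and the forecasts compared with the held-out values.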


Design Approach

The design principles are outlined under each area:


Data visualisation
  • Automatic animation showing how the influencing factors change year by year
  • An effective visualisation of how the target changes over time and compares across countries
  • An effective visualisation of variable descriptions, country information, outliers and so on


Application Interface
  • Allow users to select their tolerance level when forecasting near-future measures of women's empowerment
  • Allow users to filter by time and country for the indicators they want to explore, with embedded controls that prevent selecting too many parameters at once
  • Allow users to customise the variables fed into the model, so that they can directly compare the target results from their own inputs with those from the important-feature inputs
  • Creative use of emoji to convey textual information
  • User-friendly design with engaging visual impact


References