Group08 Report

From Visual Analytics and Applications

Econometric Modeling with Gender Equality and Women Empowerment



Introduction

As civilisation advances across the world and nations’ priorities shift from economic survival to economic growth and the creation of wealth, people are awakening to the marginalisation and inequalities left behind in the wake of economic progress. Increasingly, the questions being asked are how to address or mitigate these risks, how to balance the trade-offs, and whether tackling them upfront may in fact facilitate economic growth. One of the biggest inequalities of our time is gender inequality. Gender equality has been touted by various leading organisations as a strong way to unlock greater economic potential for countries: the World Economic Forum has made a case for gender equality, The Economist has highlighted how inequality affects growth, and the OECD has illustrated how growth gaps can be attributed to gender inequality. Intuitively this makes sense, since labour input is one of the primary drivers of economic growth and women effectively represent half of a population’s potential labour input.

Economic growth aside, gender inequality also has severe social repercussions, as observed in the recent rise of the #MeToo movement. All of this underscores the importance of better understanding gender inequality and how to address it.

This project therefore aims to examine existing economic and development data and to offer new perspectives, or reinforce existing ones, through a data visualisation application that incorporates econometric models.

Data Review

The World Bank stores an enormous amount of data across more than 200 countries and over 1,300 indicators under its ‘World Development Indicators’ database.

Most of this information is economic data for each country, covering a long period from 1972 to the latest available data for 2016. The topics span an extremely wide range, including GDP, income, inequality, population indicators such as age and health, labour statistics, energy use, financial industry information, and environmental impact.

Challenges

Given the breadth and depth of coverage, any project that draws on this data can require a large manual effort to sift through and gather what is needed. For this topic, we first needed to narrow the indicators down to those that may be relevant to women’s empowerment and gender equality, to create a base dataset for analysis. The variables were selected qualitatively, which reduced the list to 47 potential indicators.

Next, a preview of the data revealed sparsity issues on two fronts: the time period and frequency of collection, and the countries covered. For several of these indicators, data collection began only at a later stage rather than from the start of data collection by the World Bank. Even after collection had commenced, it was not necessarily done on a sufficiently regular basis (all of the indicators are nominally yearly). This is understandable, as awareness of and attention to this area have only influenced data collection activities in the past couple of decades. At the same time, regular collection of such data may not have been enforced across all participating countries, which resulted in glaring gaps in the yearly data for a large majority of countries.

Data Reduction

For consistency, only countries with enough yearly data points were kept, and only indicators that were similarly well populated for those countries were used. Eventually the dataset was whittled down to 21 indicators for 31 countries, and the time horizon was reduced to the period 2000-2016.
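The selection rule described above can be sketched as follows. This is only an illustration of the logic in Python, not the code used for the project (the application itself was built in R). The record layout, the function names and the `min_years` threshold are assumptions; the report does not state how many yearly data points counted as "enough".

```python
from collections import defaultdict

YEARS = range(2000, 2017)  # 2000-2016 inclusive, the window kept in the report


def coverage(records, years=YEARS):
    """Count, per (country, indicator), the years that have a non-missing value.

    `records` is an iterable of (country, indicator, year, value) tuples,
    a hypothetical layout chosen for this sketch.
    """
    wanted = set(years)
    seen = defaultdict(set)
    for country, indicator, year, value in records:
        if value is not None and year in wanted:
            seen[(country, indicator)].add(year)
    return {key: len(got) for key, got in seen.items()}


def select_countries(records, indicators, min_years, years=YEARS):
    """Keep countries with at least `min_years` observations for every indicator."""
    records = list(records)  # allow the input to be read twice
    counts = coverage(records, years)
    countries = {rec[0] for rec in records}
    return sorted(
        c for c in countries
        if all(counts.get((c, ind), 0) >= min_years for ind in indicators)
    )
```

Setting `min_years` to the full length of the window keeps only countries with no gaps at all; a lower value trades completeness for a larger panel.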

G8 Table1.jpg

Final dataset structure

After the data selection process, some basic transformations were performed to make the data easier to use in the subsequent analysis. Some additional attributes were also added to describe each country along further dimensions, for use in data clustering.
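The report does not list the transformations that were applied. One common step with World Bank extracts, which typically hold one column per year, is reshaping them into one row per country, indicator and year. The Python sketch below illustrates that step only as an assumption about what such a transformation might look like; the field names and the treatment of blank or `..` cells as missing are hypothetical choices, not details taken from the project.

```python
def wide_to_long(rows, id_fields=("country", "indicator")):
    """Turn one row per series (years as columns) into one row per observation.

    `rows` is a list of dicts whose non-id keys are year labels such as "2000".
    """
    long_rows = []
    for row in rows:
        ids = {field: row[field] for field in id_fields}
        for key, value in row.items():
            if key in id_fields:
                continue
            if value in ("", "..", None):  # assumed markers for missing data
                continue
            long_rows.append({**ids, "year": int(key), "value": float(value)})
    return long_rows
```

A long layout like this makes it straightforward to filter by year, join country attributes such as region or income group, and feed the same table to both the charts and the models.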


Designing the Application

The primary aim of the visual application is to provide a structured way to visualise and understand the data, so that it can support the discovery of new perspectives or reinforce existing theories on gender equality. An important aspect of the design was therefore high flexibility. Users need to be able to explore the data, run some basic techniques to help them decide which features will be useful, and then build the model and gather results, all within the same application.

To enable this, we first constructed a basic hierarchical map of the functions we wished to include. The structure would include elements of exploratory data analysis as well as econometric modelling, as follows:

G8 AppMap.jpg

Dashboard Design

Next, we embarked on the design of the dashboard, using a basic storyboard to lay out the intended functions in a logical, user-centric manner to aid navigation. While the eventual application went through multiple iterations and ended up looking different from the original design, the initial process helped to structure and crystallise our thoughts in a visual format. It also acted as a useful reference point to keep our intentions focused.


Application Functions

Based on the described design, we set out to build the application in R Shiny, combined with various packages to support the different intended functions. The following sections detail each of these functions, the respective packages used, as well as the rationale.

Indicator display

The first part of our application showcases the indicators used in the analysis that follows: what structure the data follows, which countries were selected, and which variables are eventually used for modelling.

One of the most basic views is a straightforward data table, which is generated using the package DT. This is shown below.

G8 img1.jpeg


What follows is a time-series display of each indicator at the country, region and income-group levels.

We used Shiny to enable the selection of variables, as well as high-level filtering by the geographic regions available in our base data.

As a first step, the user can choose the indicator they wish to explore. Next, they can select the region and country from the drop-down box, as well as the income level.

G8img2.jpg

The above illustrates the view displayed to the right of the selectors. The top two charts show the trend of the selected country compared with that of the overall region. The selected country is highlighted by the blue line, against a backdrop of the other countries in the region, which are drawn in light gray. The right chart provides a better view of the indicator on its own, with the y-axis rescaled to fit the maximum and minimum values observed. This lets the user detect fluctuations more quickly, which is especially helpful for countries whose trend looks relatively flat when plotted on the same scale as others with wider variation over the years. The same behaviour is applied to the ‘income level’ tab.

The line charts were created with the R package plotly; line charts were chosen because they are the most commonly preferred way to look at time-series data. plotly brought two main benefits. It enabled smart highlighting of the selected country against a backdrop of the other countries within the selected region. In addition, its tool-tip provides a quick view of the key data when users hover over a line or a specific point on the graph.
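The effect of rescaling the y-axis can be sketched with a small helper that derives axis limits from the observed minimum and maximum. This is an illustrative Python sketch rather than the application's own code (which uses plotly in R); the function name and the `pad` margin are assumptions added for the example.

```python
def axis_range(values, pad=0.05):
    """Return (low, high) y-axis limits that fit the data with a small margin.

    `pad` is the fraction of the data span added above and below; a flat
    series gets a fixed margin of 1.0 so the line is still visible.
    """
    present = [v for v in values if v is not None]
    if not present:
        raise ValueError("no data to plot")
    low, high = min(present), max(present)
    span = high - low
    margin = span * pad if span else 1.0
    return low - margin, high + margin
```

Applied to every country in a region, the limits are driven by the most extreme series, so a stable country appears as a nearly flat line. Applied to the selected country alone, the same helper yields a much tighter range in which small year-to-year changes become visible.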


Acknowledgements

We wish to thank Professor Kam Tin Seong for his strong support and detailed input on improving our application. The multiple iterations and improvements would not have been possible without his valuable advice.