Group12 Report


Have the Nations Really Progressed?


Introduction

World Development Indicators (WDI) is the primary World Bank collection of development indicators, compiled from officially recognized international sources. It presents the most current and accurate global development data available including national, regional and global estimates. It covers more than 7 million data points collected over the span of 58 years. This statistical reference includes over 1500 indicators covering more than 200 economies. The annual publication is released in April of each year.
The sheer volume of world development data far exceeds the ability of students, policymakers, analysts and officials to turn it into visualizations that give insight into the global development landscape. This in turn has an adverse effect on the financial and technical assistance the World Bank provides to developing countries around the world.
Through our visualizations, we seek to use existing data to derive meaningful insights into how various socioeconomic factors have affected the development of different nations, and to tell their story of growth and decline across the years. The dashboard also helps identify the areas in which countries need help and whether aid provided earlier has had any effect. The key objective is to deep dive into a country's development across 9 parameters.

Motivation and Objective

Our main objective is to use visual and graphical techniques in R to build a user-friendly dashboard. Countries that require aid from various organizations would benefit from this analysis, because their performance can be gauged over the years since they first received assistance. Furthermore, analysts can switch between different forms of visualization to make better financial and technical decisions when helping developing countries. The application should enable financial aid providers to decide on funding future aid and to assess the ROI on aid already given.

Previous Works

Sustainable Development Goals


Samimage0.png

The World Bank website has a dashboard for the Sustainable Development Goals. It covers more than 15 goals and plots different measures for all countries on a line chart, grouped by region. The dashboard is too complicated and requires expert knowledge of the exact goal to visualize.

CRAN Download Monitor

Samimage1.png

For our dashboard design, we were inspired by the CRAN Download Monitor, which produces reactive output based on input parameters.

Google Charts Demo

Samimage2.png

We aimed to present the charts through a simple interface that is visually appealing to users.

Dataset & Preparation

Raw Data

  • We exported the World Development Indicators data from the World Bank database (World Development Indicators and Sustainable Development Goals sections), with 1580 parameters across 58 years and 217 countries. The data set was very large (19.9 million rows) and many metrics were not meaningful for our purpose. We selectively filtered the measures into the 9 categories we considered most impactful. We then merged the data into a single file covering 58 years and the 201 UN-recognized countries.

Samimage3.png
  • The series originally consisted of a mix of demographic, education, finance, economic, public sector, health and other measures without any categories. Using JMP and Excel, the data was thoroughly organized and wrangled to make it usable for analysis. We reduced the key data points from 1580 to 58 and spread those 58 parameters, which we now call KPIs, across 9 main categories. These KPIs (Key Performance Indicators) are the important measures of development across these 9 sectors/categories.

Main Categories

Samimage32.png

Each of these categories has between 4 and 9 metrics, each of which we call a KPI (Key Performance Indicator). Within a category, each KPI indicates the well-being of a nation relative to other nations. These KPIs were carefully selected on the basis of their impact on a country’s development. The mapping is given below:

Category Mapping
Measure | Category
Bird species, threatened | Climate
Forest area (% of land area) | Climate
Mammal species, threatened | Climate
Methane emissions (kt of CO2 equivalent) | Climate
Plant species (higher), threatened | Climate
PM2.5 air pollution, mean annual exposure (micrograms per cubic meter) | Climate
Central government debt, total (% of GDP) | Debt
Consumer price index (2010 = 100) | Debt
Inflation, consumer prices (annual %) | Debt
Multilateral debt (% of total external debt) | Debt
Net official aid received (current US$) | Debt
Short-term debt (% of total external debt) | Debt
Agriculture, value added (% of GDP) | Economy
Ease of doing business index (1=most business-friendly regulations) | Economy
Income share held by highest 10% | Economy
Income share held by lowest 10% | Economy
Industry, value added (% of GDP) | Economy
Manufacturing, value added (% of GDP) | Economy
Market capitalization of listed domestic companies (current US$) | Economy
Official exchange rate (LCU per US$, period average) | Economy
Services, etc., value added (% of GDP) | Economy
Educational attainment, at least Bachelor's, population 25+, total (%) | Education
Educational attainment, at least primary, population 25+ years, total (%) | Education
Government expenditure on education, total (% of GDP) | Education
Literacy rate, adult total (% of people ages 15 and above) | Education
Employment in agriculture (% of total employment) (modeled ILO estimate) | Employment
Employment in industry (% of total employment) (modeled ILO estimate) | Employment
Employment in services (% of total employment) (modeled ILO estimate) | Employment
Labor force participation rate, total (% of total population ages 15-64) | Employment
Labor force, female (% of total labor force) | Employment
Labor force, total | Employment
Listed domestic companies, total | Employment
Unemployment, total (% of total labor force) (national estimate) | Employment
Access to electricity (% of population) | Energy
Alternative and nuclear energy (% of total energy use) | Energy
Electric power consumption (kWh per capita) | Energy
Electricity production from renewable sources, excluding hydroelectric (% of total) | Energy
Pump price for gasoline (US$ per liter) | Energy
Renewable energy consumption (% of total final energy consumption) | Energy
Birth rate, crude (per 1,000 people) | Health
Current health expenditure (% of GDP) | Health
Death rate, crude (per 1,000 people) | Health
Hospital beds (per 1,000 people) | Health
Life expectancy at birth, total (years) | Health
Physicians (per 1,000 people) | Health
GDP (current US$) | KPI
GDP growth (annual %) | KPI
GDP per capita (current US$) | KPI
GDP per capita growth (annual %) | KPI
GDP, PPP (current international $) | KPI
Net migration | Population
Population ages 0-14, total | Population
Population ages 15-64, total | Population
Population ages 65 and above, total | Population
Population density (people per sq. km of land area) | Population
Population growth (annual %) | Population
Population, total | Population
Urban population (% of total) | Population

JMP

  1. We used JMP to reduce the measures from 1500 to 227 variables in Phase 1 of the clean-up, and then eliminated further unimportant columns to arrive at 58 in Phase 2.

  2. We used JMP to add a category to each measure. For the 227 measures in Phase 1, and then the final 58 in Phase 2, we created the broad buckets listed earlier: Population, Climate, Energy, Education, Employment, Economy, Debt, GDP and Health. We then transposed those 58 columns into a single column.

    Samimage5.jpg
  3. JMP was also used to transpose the data set from its existing format, with one column per KPI, into a single KPI column.
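The transposition was done in JMP, but the same wide-to-long reshaping can be sketched in base R. The example below is only an illustration: the column names `Country`, `Year`, `KPI` and `Value` are assumptions, and the numbers are placeholders rather than WDI figures.

```r
# Toy wide table: one column per KPI (placeholder values, not real WDI data).
wide <- data.frame(
  Country = c("Singapore", "India"),
  Year    = c(2015, 2015),
  `GDP (current US$)` = c(1, 2),
  `Population, total` = c(3, 4),
  check.names = FALSE  # keep the original KPI names as column names
)

id_cols  <- c("Country", "Year")
kpi_cols <- setdiff(names(wide), id_cols)

# Stack every KPI column into a single KPI/Value pair of columns.
long <- do.call(rbind, lapply(kpi_cols, function(kpi) {
  data.frame(wide[id_cols], KPI = kpi, Value = wide[[kpi]],
             stringsAsFactors = FALSE)
}))
rownames(long) <- NULL

print(long)
```

Each country-year row of the wide table becomes one row per KPI in the long table, which is the shape the filters and charts described later operate on.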

Excel

Finally, we cleaned up the final data set of ~700K records (201 x 58 x 58) and saved it as an Excel file. We then created region, data type and category columns. Region is a grouping of countries. Data type is either # or %. Category is a grouping of KPIs. This gives 7 columns in total:

Samimage77.png

We saved the final data as a CSV file.
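The derived columns were created in Excel. As a rough sketch of the same idea in base R, the lookups below attach a region, a category and a data type to each long-format record. The column names are assumptions, only a few mapping entries are shown, and inferring the data type from a "%" in the KPI name is a heuristic of this sketch, not necessarily how the original column was built.

```r
# Partial lookups (illustrative subset of the full mappings).
region_of   <- c("Singapore" = "East Asia & Pacific", "India" = "South Asia")
category_of <- c(
  "GDP (current US$)"            = "KPI",
  "Forest area (% of land area)" = "Climate",
  "Population, total"            = "Population"
)

add_labels <- function(long) {
  long$Region   <- unname(region_of[long$Country])
  long$Category <- unname(category_of[long$KPI])
  # Heuristic (assumption): a KPI whose name contains "%" is a percentage.
  long$DataType <- ifelse(grepl("%", long$KPI, fixed = TRUE), "%", "#")
  long
}

# Placeholder records, not real WDI values.
long <- data.frame(
  Country = c("Singapore", "India", "Singapore"),
  Year    = 2015,
  KPI     = names(category_of),
  Value   = c(1, 2, 3),
  stringsAsFactors = FALSE
)
labelled <- add_labels(long)

# Upper bound on the record count: 201 countries x 58 KPIs x 58 years.
expected_rows <- 201 * 58 * 58
print(expected_rows)
```

The product 201 x 58 x 58 works out to 676,164, which is the "~700K records" figure quoted above.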


Visual Design Framework

R Shiny

We then loaded the final CSV file into an R Shiny application to generate the graphs. We used the dplyr library to keep only the records with values greater than zero.
We added interactive chart features using R Shiny.
Libraries used:

  1. tidyverse
  2. sf
  3. tmap
  4. classInt
  5. shiny
  6. leaflet
  7. ggplot2
  8. shinydashboard
  9. maps
  10. plotly
  11. dplyr
  12. shinyWidgets
  13. treemap
  14. treemapify
  15. gridBase
  16. RColorBrewer
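The load-and-filter step mentioned above can be sketched as follows. This is a minimal illustration in base R so that it runs without extra packages; the function name `load_wdi` and the `Value` column name are assumptions, and the temporary file only stands in for the real CSV.

```r
# Read the prepared CSV and keep only rows with a value greater than zero.
# With dplyr the same filter would be: dplyr::filter(wdi, Value > 0)
load_wdi <- function(path) {
  wdi <- read.csv(path, stringsAsFactors = FALSE, check.names = FALSE)
  wdi[!is.na(wdi$Value) & wdi$Value > 0, , drop = FALSE]
}

# Demonstration on a throwaway file with placeholder rows.
tmp <- tempfile(fileext = ".csv")
write.csv(data.frame(Country = c("A", "B", "C"), Value = c(5, 0, NA)),
          tmp, row.names = FALSE)
clean <- load_wdi(tmp)
print(clean)
```

Rows with a zero or missing value are dropped, so only country "A" survives in this toy input.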

We built trend charts, treemaps and geographical map charts for each of the 9 selected categories to give a full perspective of the metrics.
The three main charts are:

Line chart showing the time series trend for the selected years and country(ies)

Samimage7.png

The trend view is useful for analysing the time series of a given metric for one or more countries.
Features of the line chart:

  • Built with Plotly.

Date filter

  • Animated slider input. (with play & pause)
  • Range of years selected.
  • Default range is 15 years. (Can be decided by user)

Country filter

  • Singapore is the default country.
  • Multiple country selection support.
  • Support for removing countries via mouse click.
  • Countries are colour coded.
  • Each line corresponds to a different country.
  • The first country selected acts as the baseline against which the other countries are compared.

Measure filters (KPI)

  • Each measure is shown for the selected year range and countries.
  • The data value for the measure is displayed on the y-axis of the chart.
  • Plotly's compare function shows the values of the different countries for a given year.
  • The hover function is provided by Plotly.

Three Value boxes to highlight selection

  • The years being used.
  • The KPI measure being used.
  • The first country selected, against which the other countries are compared.
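The line chart features above can be sketched in two parts: preparing the data for the selected KPI, countries and year range, and handing it to Plotly. The report does not spell out how the first country is used as a comparison, so the per-year difference computed here is only one possible reading. The function names and the `Country`, `Year`, `KPI` and `Value` columns are assumptions, and the data is made up.

```r
# Filter to one KPI, the selected countries and year range, then compare
# every country with the first selected country (the baseline) per year.
compare_to_baseline <- function(wdi, kpi, countries, years) {
  sel <- wdi[wdi$KPI == kpi & wdi$Country %in% countries &
             wdi$Year >= years[1] & wdi$Year <= years[2], ]
  base <- sel[sel$Country == countries[1], c("Year", "Value")]
  names(base)[2] <- "Baseline"
  out <- merge(sel, base, by = "Year")
  out$Diff <- out$Value - out$Baseline
  out[order(out$Country, out$Year), ]
}

# One coloured line per country; defined but not called here, so the
# sketch runs even where plotly is not installed.
trend_plot <- function(df) {
  plotly::plot_ly(df, x = ~Year, y = ~Value, color = ~Country,
                  type = "scatter", mode = "lines+markers")
}

# Placeholder data for two hypothetical countries.
wdi <- data.frame(
  Country = rep(c("A", "B"), each = 3),
  Year    = rep(2001:2003, times = 2),
  KPI     = "x",
  Value   = c(10, 11, 12, 8, 11, 15),
  stringsAsFactors = FALSE
)
cmp <- compare_to_baseline(wdi, "x", c("A", "B"), c(2001, 2003))
print(cmp)
```

In a Shiny app, `trend_plot(cmp)` would typically be returned from a `plotly::renderPlotly()` call.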

Treemap for viewing all the countries in the world or in a region, based on the selected measures


Samimage8.png

Geographical map with each country shaded according to the selected KPI

Samimage9.png
Samimage10.png

We used a combination of leaflet and tmap to render the interactive world map in R Shiny, drawing country outlines from a global boundary shapefile.

Dashboard

We take a visual analytics approach to measuring KPIs across different parameters, gauging the progress of all 201 countries over the last 58 years on selected metrics using interactive charts and filters.
Line charts can be used to compare multiple countries and to view the trend of a given KPI.
Treemaps show each country's share of a specific metric, which is very useful for comparing many countries in a single chart. We have three comparisons:

  • Grouped by Region
  • Grouped by KPIs
  • Grouped by Categories
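A treemap sizes each tile by a country's share of the metric, overall and within its group. The sketch below computes those shares in base R and shows how the grouped-by-region variant might be drawn with the treemap package. The function names, column names and numbers are assumptions for illustration only.

```r
# Share of each country in the overall total and within its own region.
treemap_shares <- function(df) {
  df$Share       <- 100 * df$Value / sum(df$Value)
  df$RegionShare <- 100 * df$Value / ave(df$Value, df$Region, FUN = sum)
  df
}

# Region tiles subdivided into country tiles, sized by Value. Defined but
# not called here, so the sketch runs without the treemap package.
draw_treemap <- function(df) {
  treemap::treemap(df, index = c("Region", "Country"),
                   vSize = "Value", type = "index")
}

# Placeholder data for three hypothetical countries in two regions.
df <- data.frame(
  Region  = c("R1", "R1", "R2"),
  Country = c("A", "B", "C"),
  Value   = c(1, 3, 4),
  stringsAsFactors = FALSE
)
shares <- treemap_shares(df)
print(shares)
```

Swapping the `index` columns for KPI or category columns would give the other two groupings.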

We use geographic maps to show whether each country's performance has been positive or negative over the last 10 years. The red and green shading is based on percentiles, so it is relative to the other countries for that KPI and period, which makes the comparison valid.
The Geo Map has two visualizations, one by KPI and one by category, both of which compare countries.
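One way to sketch the percentile-based shading is to compute each country's change over the period, rank the changes, and colour the lower half red and the upper half green. The 50% cut-off, the function names and the column names are assumptions of this sketch, and the values are placeholders.

```r
# Change in a KPI between a start and an end year, per country.
kpi_change <- function(df, start, end) {
  s <- df[df$Year == start, c("Country", "Value")]
  e <- df[df$Year == end,   c("Country", "Value")]
  m <- merge(s, e, by = "Country", suffixes = c("_start", "_end"))
  data.frame(Country = m$Country, Change = m$Value_end - m$Value_start,
             stringsAsFactors = FALSE)
}

# Percentile rank of each change; lower half red, upper half green
# (the 0.5 threshold is an assumption).
percentile_colour <- function(change) {
  pct <- rank(change, na.last = "keep") / sum(!is.na(change))
  ifelse(pct <= 0.5, "red", "green")
}

# Placeholder data for four hypothetical countries.
df <- data.frame(
  Country = rep(c("A", "B", "C", "D"), times = 2),
  Year    = rep(c(2007, 2017), each = 4),
  Value   = c(10, 10, 10, 10, 5, 10, 12, 20),
  stringsAsFactors = FALSE
)
chg <- kpi_change(df, 2007, 2017)
chg$Colour <- percentile_colour(chg$Change)
print(chg)
```

For KPIs where a lower value is better, such as unemployment, the colours would need to be swapped.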

Filters

  1. Year filter: a slider or dropdown; in some cases the slider has a start and an end year.
  2. Region filter: multi-select for better interactivity. It acts as a nested filter for countries.
  3. Country filter: multi-select for better interactivity.
  4. Category filter: acts as a nested filter for KPIs.
  5. KPI filter: the key selection metric for all charts.
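A nested filter narrows one input's choices based on another. A minimal sketch of the region-to-country case is given below; the same pattern would apply to category and KPI. The input IDs `region` and `country`, the function names and the lookup table are hypothetical.

```r
# Countries that belong to the currently selected regions.
countries_in <- function(lookup, regions) {
  sort(unique(lookup$Country[lookup$Region %in% regions]))
}

# Server-side wiring: whenever the region selection changes, refresh the
# choices of the country selector. Defined but not called here, so the
# sketch runs without a Shiny session.
nested_filter_server <- function(input, output, session, lookup) {
  shiny::observeEvent(input$region, {
    shiny::updateSelectInput(session, "country",
                             choices = countries_in(lookup, input$region))
  })
}

# Small illustrative lookup of countries and their regions.
lookup <- data.frame(
  Country = c("Singapore", "Japan", "India"),
  Region  = c("East Asia & Pacific", "East Asia & Pacific", "South Asia"),
  stringsAsFactors = FALSE
)
print(countries_in(lookup, "East Asia & Pacific"))
```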

For the filters to work, we used Shiny's reactive expressions:

Samimage11.png

A reactive expression subsets the data based on the input parameters that drive the interactive filters, and is re-evaluated whenever one of those inputs changes. This was the most important function we used.
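The pattern can be sketched as follows. This is not the code shown in the screenshot, only a minimal illustration of a reactive subset; the input IDs `country`, `kpi` and `years`, the output ID `trend`, the function name `filter_wdi` and the toy data are all assumptions.

```r
# Plain subsetting logic, kept outside the reactive so it can be tested.
filter_wdi <- function(wdi, countries, kpi, years) {
  keep <- wdi$Country %in% countries & wdi$KPI == kpi &
          wdi$Year >= years[1] & wdi$Year <= years[2]
  wdi[keep, , drop = FALSE]
}

# Placeholder data, not real WDI values.
wdi <- data.frame(
  Country = c("Singapore", "Singapore", "India"),
  Year    = c(2003, 2010, 2010),
  KPI     = "GDP per capita (current US$)",
  Value   = c(1, 2, 3),
  stringsAsFactors = FALSE
)

# The reactive re-runs filter_wdi() whenever one of the inputs it reads
# changes, and every output that calls filtered() updates with it.
# Defined but not called here, so the sketch runs without a Shiny session.
server <- function(input, output, session) {
  filtered <- shiny::reactive({
    filter_wdi(wdi, input$country, input$kpi, input$years)
  })
  output$trend <- shiny::renderTable(filtered())
}

res <- filter_wdi(wdi, "Singapore", "GDP per capita (current US$)", c(2005, 2017))
print(res)
```

Keeping the subsetting in an ordinary function means the charts on every tab can share one reactive data source.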

Visualizations & Insights

Time Series Chart


Samimage12.png


We analysed the trend for two developed port economies, Singapore and Hong Kong. In 2003, their GDP per capita figures were nearly identical at about USD 23K. In the following years, Singapore went through a brief period of rapid growth, opening a gap of almost USD 20K in GDP per capita.

Samimage13.png


Next, we looked at the share of employment in the industry sector for Singapore and India. Surprisingly, we found a declining trend in Singapore’s employment metric from 2011 onwards.
Years and multiple countries can be selected together as filter conditions to compare time series trends across many growth parameters.

Treemap Visualization

Samimage14.png


Samimage15.png


Samimage16.png


For the treemap, the application lets users interact through the category and year filters.

Geographical Map Visualization

Samimage17.png


Samimage18.png


For the geographical map, the application lets users interact through the year, parameter and region filters.

Conclusion

While GDP provides an important point of reference for analysing a country’s overall economic development, it does not reveal any specific information about sectoral composition or the different degrees of industrial development. Countries show profound structural differences, which tend to relate to their stage of overall economic development and to the different contributions of the sectors that make up their economic system (agriculture, industry, including manufacturing, and services). To capture the different levels of countries’ industrial development, we use cross-measure analysis of various domains to better investigate the progress of nations. The deep-dive and drill-down features further help to measure a nation's performance on all 9 categories we developed, after which a conclusion can be reached.

Project Challenges, Limitations and Future Work

The application for visualizing the World Development Indicators currently has the following limitations:

Challenges faced

  1. Implementing nested filters in the R Shiny dashboard.
  2. Creating bubble charts in the R Shiny dashboard.
  3. Integrating the filters across all the tabs in R Shiny.
  4. Integrating the treemap: we used a conditional panel because we could not add tabs.

Future work

  1. Reference lines (such as an industry benchmark or future targets) against which to compare a country's performance.
  2. A bubble chart that lets users select 4 parameters to generate the graph: X axis, Y axis, size of bubble and colour of bubble. This should be a user-generated or self-creating chart.
  3. A stacked bar chart to show the trend of a group of components adding up to 100%, such as the % contribution to energy production, or the % GDP contribution of Services, Manufacturing, Industry and Agriculture.
  4. Reactive or nested filter support, with Category filtering KPIs and Region filtering Countries, together with grouping of countries within regions and grouping of measures.


Use cases and benefits of this dashboard

  1. Financial aid organizations such as the World Bank, the IMF and the International Bank for Reconstruction and Development can determine the ROI of aid in a particular sector.
  2. To decide whether a country needs financial aid to support its development.
  3. To slice and dice the data using the 58 KPIs in 9 categories.
  4. To compare one country's performance against another's for any given measure.
  5. To compare historical performance within the same country.


Acknowledgement

The authors wish to thank Dr. Kam Tin Seong, Associate Professor of Information Systems (Practice), School of Information Systems, Singapore Management University, for his mentorship and guidance in the completion of this visualization project.
