The Indian Story Report

From Visual Analytics and Applications
Revision as of 00:40, 7 August 2017 by Sandhyavr.2016

Group 9 - The Indian Story


Motivation of the Application

In this era of increasing openness, the importance of information created or held by the government has become impossible to deny. Government is one of the largest producers of information in many areas, such as business information, health data, geographic data, census data, and legal information. Giving the public access to public information has become an important objective of many recent data.gov initiatives in countries worldwide. Despite the increasing availability of government data, public use of these data is often hampered by a general lack of appropriate and affordable data exploration and analysis tools. This is particularly true when the data are geospatial and high-dimensional in nature. In view of this, our project aims to design and develop a geovisual analytics tool for data discovery from geographically referenced statistical data.

The application we developed is called CenViz. It is built with the R Shiny framework and several R data visualization packages such as tmap, micromap and treemap. This presentation consists of four sections. First, the motivation and objectives of the project are discussed. This is followed by a detailed discussion of the principles and concepts of micromaps. After that, the R packages used to develop the application and the user interface are discussed. Finally, using the latest census data of India, we demonstrate how the functions of CenViz can be used to detect the geospatial patterns and attribute distributions of literacy in the country.

Review and critique of past works

Existing studies mostly offer only statistical maps that show the distribution of literacy across states and cities, by gender, and by education level. Let us look at some of these existing visualizations.

Critic.png TreemapCritic.png Scatter.png

  • The choice of a scatter plot here does not seem appropriate. To see growth, bar or line graphs seem more sensible.
  • Moreover, in the above graph, multiple states have the same color, and the legend is not well described.
  • Although this is just one example, the study as a whole provided no context. A reader who does not know India would find it extremely difficult to relate to.
  • The aesthetics make the insights even harder to understand, and the various visualizations were not well connected to build a story.

Therefore, in our study, we followed some important visual guidelines to ensure that the graphs provide context, avoid chart junk, offer a good amount of interactivity, and include supporting statistics.

Design Framework

Our application has three tabs, each of which gives the user data visualisations from a different perspective.

Tmap

Choropleth maps give context to our story. Even an audience unfamiliar with India and its states can easily read a geographical map that clearly displays the states. We encoded population parameters as color hue, using a minimal set of colors to keep the map clear. These maps help visualise which state dominates a particular category, such as “number of male literates” or “number of female illiterates”. We built the choropleth maps using R's tmap package, with QGIS used to simplify the shapefiles. A sample choropleth map is shown below:

Picture1.jpg

The tab in the Shiny app that shows the choropleth maps is shown below:

ChoroplethTab.jpg

As seen in the diagram above, the maps show the education-level performance of each state by age group, which can be further filtered by gender. The bar plot gives the distribution of population across the states of India.
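
The colour encoding described in this section can be illustrated with a small, language-neutral sketch. It is written in Python rather than R and is not the code used by CenViz. It assumes equal-interval classing, which is only one common choice (the report does not state which classing the maps use), and the region names, counts, and palette are hypothetical placeholders rather than census figures.

```python
from bisect import bisect_right


def equal_interval_breaks(values, n_classes):
    """Return the n_classes - 1 interior break points between min and max."""
    lo, hi = min(values), max(values)
    step = (hi - lo) / n_classes
    return [lo + step * i for i in range(1, n_classes)]


def classify(value, breaks):
    """Index of the class (0 = lowest) that a value falls into."""
    return bisect_right(breaks, value)


# Hypothetical counts for placeholder regions, NOT census figures.
literates = {"State A": 100, "State B": 400, "State C": 700, "State D": 1000}

# Assumed three-step palette, ordered from light to dark.
palette = ["#deebf7", "#9ecae1", "#3182bd"]

breaks = equal_interval_breaks(list(literates.values()), len(palette))
colours = {name: palette[classify(v, breaks)] for name, v in literates.items()}
```

Keeping the palette to a few ordered steps mirrors the decision above to use a minimal set of colors: each state falls into exactly one class, so the darkest regions stand out as dominant for the selected category.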

Treemap

A treemap is most appropriate for hierarchical data. Certain treemap-related R packages add interactivity to the data and help the user drill down through the hierarchy.

In this context, the treemap lets users select a state and then drill down to town-level literacy rates for various categories (e.g. Gender: Male, Education Level: Graduates). The size of each rectangle shows population, and its colour represents the literacy rate (Education Level). We used R’s d3treeR package to build the interactive treemaps, and the tidyverse and tidyr packages to modify the data.

A sample treemap showing all states in India, which can be drilled down to city level, is shown below. The CenViz tab used to interact with the treemap is shown on the right.

Treemap.jpg
InteractiveTreemap.jpg
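
The state-to-town hierarchy behind the treemap can be sketched as follows. This is an illustrative Python sketch, not the R code used by the application: the record layout, the function names `state_nodes` and `town_nodes`, and all figures are assumptions made for the example. Each rectangle carries a size (population) and a value for colour (literacy rate), and the state-level rectangle is obtained by summing over its towns.

```python
from collections import defaultdict

# Hypothetical town records (state, town, population, literates); NOT census data.
towns = [
    ("State A", "Town 1", 1000, 800),
    ("State A", "Town 2", 3000, 1500),
    ("State B", "Town 3", 2000, 1800),
]


def town_nodes(rows, state):
    """Leaf rectangles shown after drilling down into one state."""
    return [
        {"name": town, "size": pop, "rate": lit / pop}
        for s, town, pop, lit in rows
        if s == state
    ]


def state_nodes(rows):
    """Top-level rectangles: population and literates summed over towns."""
    totals = defaultdict(lambda: [0, 0])
    for state, _town, pop, lit in rows:
        totals[state][0] += pop
        totals[state][1] += lit
    return [
        {"name": state, "size": pop, "rate": lit / pop}
        for state, (pop, lit) in totals.items()
    ]


top_level = state_nodes(towns)            # what the user sees first
drill_down = town_nodes(towns, "State A")  # what a click on State A reveals
```

Computing the state rate from summed counts, rather than averaging the town rates, weights each town by its population, so a small town cannot dominate the colour of its state.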

Micromap

Micromaps help us understand the distribution of education levels across the cities of each state. They not only add statistical inference but also give geographical context. In this context, for example, for “Males”, “Females” and “All”, the micromap plots a box plot of literacy rates based on town-wise literacy rate data. It gives a look “into” the states themselves, helping us understand which cities skew the distribution and so make a state's overall figure misleading. We built the micromap using R’s micromap package to create the choropleth, with QGIS used to simplify the shapefiles.

A sample micromap is shown below. The CenViz tab used to interact with the micromap is shown on the right.

Micromap.jpg
InteractiveMicromap.jpg
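
The per-state box plots rest on a five-number summary of town-wise literacy rates. The sketch below shows that computation in Python. It is an illustration under stated assumptions and not the micromap package's implementation: the state names and rates are hypothetical, the quartile method (`inclusive`) is one of several conventions, and sorting rows by median is a common micromap layout choice that the report does not confirm for CenViz.

```python
from statistics import quantiles


def box_stats(rates):
    """Five-number summary needed to draw one box plot."""
    q1, median, q3 = quantiles(rates, n=4, method="inclusive")
    return {
        "min": min(rates),
        "q1": q1,
        "median": median,
        "q3": q3,
        "max": max(rates),
    }


# Hypothetical town-wise literacy rates (percent) for placeholder states.
rates_by_state = {
    "State A": [60.0, 70.0, 80.0, 90.0, 100.0],
    "State B": [40.0, 50.0, 60.0, 70.0, 95.0],
}

summary = {state: box_stats(r) for state, r in rates_by_state.items()}

# Assumed row order: highest median first.
ordered = sorted(summary, key=lambda s: summary[s]["median"], reverse=True)
```

A long whisker relative to the box, as with the top town in the hypothetical State B, is the kind of within-state skew that a single state-level rate would hide.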


Demonstration

Tmap

One of the use cases that best evaluates our Tmap is for the user to select 'Persons' in the Gender filter and 'Graduates' in Education Level. The Tmap for the age group Teen and Below should be empty, with no colors, since no person aged Teen and Below can be a graduate. The result of the search is shown below.

TeenAndBelow.jpg

Micromap

Let us take the same scenario and study the micromap for graduates. Uttarakhand appears to have a high number of graduates. To support this result, we look at the corresponding treemap of Uttarakhand, where the city of Roorkee has the highest number of graduates. The presence of IIT Roorkee, one of the finest institutes in India, supports the hypothesis further. The result of this use case is shown below.

MicromapUseCase.png MicromapUseCase1.png


Treemap

TreeMapDemo.png TreeMapDemo1.png



Discussion

  • Top 5 populous states: Maharashtra, Uttar Pradesh, Andhra Pradesh, Gujarat, West Bengal.
  • Highly illiterate cities: Rampur, Amroha, Sambhal, Bagaha.
  • Most literate state: Kerala.
  • Most graduates, especially in IT, are moving towards the IT triangle in the South: Bangalore, Hyderabad, and Chennai.
  • The number of graduates has increased tremendously in 20 years, a clear sign that India is growing!


Future Work

1) The state-wide view (four maps and one bar chart) uses four variables, so a more appropriate visualisation would be parallel coordinates or a heatmap. This can be considered for future work.
ChoroplethTab.jpg

2) The micromap package has a limitation: the micromap cannot be embedded in the Shiny app. However, it can be displayed as a pop-up from the Shiny app.
InteractiveMicromap.jpg

Installation Guide

The installation guide is introduced in detail at: https://wiki.smu.edu.sg/1617t3isss608g1/The_Indian_Story_Application

User Guide

1) Go to the CenViz application link - https://mandiluo.shinyapps.io/The_Indian_Story/

2) The tab ‘State-wise’ provides choropleth visualizations that give context to the data.

3) There are 2 filters - Gender and Education Level. Select values in the corresponding filters and click the ‘Search’ button to see the relevant Tmap result.

4) Navigate to the next tab, ‘City-level’, which provides a treemap per state that can be drilled down to city level.

5) There are 3 filters - Gender, Age Group and Education Level. Select values in the corresponding filters and click the ‘Search’ button to see the relevant treemap result.

6) The next tab is ‘States-Education-Level-Comparison’, which provides the micromap visualization. This tab is not interactive, because the Shiny server does not support the pop-up functionality used to display the micromap. However, detailed guidance on interacting with micromaps is given in the Installation Guide.