Difference between revisions of "Group04 Report"

From Visual Analytics and Applications

Revision as of 17:42, 14 August 2018

Rainfall Crop Cropped.jpeg

Water For Life: India's Rainfall & Crop Analysis Through Visualizations



Introduction

This study is an exploratory analysis of rainfall patterns, crop productivity, and the effect of changing rainfall patterns on crop productivity across the meteorological subdivisions of India. Because India spans a wide range of weather conditions, a vast geographic scale and varied topography, it would be unwise to generalise climate change and its effect on crop productivity across regions. We therefore explore and visualise rainfall pattern changes and their effect on crop productivity separately for every meteorological subdivision.

Our research covers rainfall and crop productivity for 34 of the 36 meteorological subdivisions; the two excluded subdivisions are Lakshadweep and Andaman & Nicobar. The rainfall data is at meteorological subdivision level, whereas the crop data is at administrative district level. Each data set is visualised at its own level of granularity. To analyse the effect of rainfall on crop productivity, district data is aggregated to subdivision level using a mapping of districts to subdivisions.

The crop growing season in India is classified into two main seasons based on the monsoon: (i) Kharif and (ii) Rabi. The Kharif cropping season runs from July to October, during the south-west monsoon, and the Rabi cropping season runs from October to March (winter). Crops grown between March and June are Summer crops. Apart from these seasonal crops, a few crops are grown throughout the year and are classified as Whole Year crops. We have therefore considered four seasons in total for the analysis: Kharif, Rabi, Summer and Whole Year.

We have used multiple visualisation techniques, including time series, heat map, tree map, parallel coordinate plot, geo facet and bar charts, to visualise varying rainfall patterns and changing crop productivity over 15 years across the subdivisions of India. R is used for the data visualisation and the application because it offers a good set of built-in functions and libraries for both data mining and visualisation.


Objective And Motivation

Climate plays a significant role in the economic development of India because a large share of the population depends on climate-sensitive sectors such as agriculture and forestry for its livelihood. Climate change could lower farmers' income by up to 25% (Economic Survey 2018: http://mofapp.nic.in:8080/economicsurvey/pdf/082-101_Chapter_06_ENGLISH_Vol_01_2017-18.pdf). This is because agriculture in India is vulnerable to the vagaries of weather: close to 52% of farm land is still unirrigated and depends on rainfall. This project is an honest endeavour to gain deeper knowledge of the impact of increasingly changing rainfall patterns, so that we can be prepared to mitigate the risk of these uncontrollable factors and seek remedies that help withstand such drastic natural phenomena.

Considering the cultivation period of each crop and its water requirement at different stages of its lifecycle, it is important to analyse the effect of monthly rainfall on crop productivity during the cultivation period rather than simply considering yearly or seasonal average rainfall. Our objective is to provide a single view for analysing monthly rainfall pattern changes, crop productivity changes, and the correlation between each month's rainfall and crop productivity.


About The Data Source

Rainfall Data:

Rainfall data was obtained from the “IITM Indian subdivision Monthly Rainfall data set” available on http://www.tropmet.res.in/. The data set consists of monthly, seasonal and annual rainfall (in mm) for 36 meteorological subdivisions of India from 1871 to 2016. Our research uses only the monthly data from 2000 to 2014 for 34 subdivisions; this period was chosen to match the available crop production data.


Crop Production Data:

Crop production data was obtained from “State-wise, season-wise crop production statistics from 1997” available on https://data.gov.in/. The data set contains season-wise crop production data for 646 districts from 1997 to 2015 for 113 different crops. For this research, only crops with district-wise data available for more than 10 years are selected, so the final subset contains data for 56 crops across 595 districts. Crop productivity is calculated using the formula below:

Crop Productivity = Crop Production (tonnes) / Cultivated Area (hectares)
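The crop selection and productivity calculation described above can be sketched in base R. The data frame and its column names (`district`, `crop`, `year`, `production_t`, `area_ha`) are hypothetical, the values are made up, and the sketch assumes that "more than 10 years" is counted per crop and district; the actual file from data.gov.in may be laid out differently.

```r
# Toy stand-in for the district-level crop file; all names and values are illustrative.
crops <- data.frame(
  district     = rep(c("Pune", "Nashik"), each = 12),
  crop         = "Rice",
  year         = rep(2000:2011, times = 2),
  production_t = c(seq(1000, 2100, by = 100), seq(500, 1050, by = 50)),
  area_ha      = rep(c(500, 250), each = 12)
)

# A crop/district pair with only 5 years of data, which should be dropped.
short <- data.frame(district = "Pune", crop = "Jowar", year = 2000:2004,
                    production_t = 100, area_ha = 50)
crops <- rbind(crops, short)

# Count distinct years of data for each crop/district pair.
years_available <- aggregate(year ~ district + crop, data = crops,
                             FUN = function(y) length(unique(y)))
names(years_available)[3] <- "n_years"

# Keep only the pairs with more than 10 years of data.
keep <- years_available[years_available$n_years > 10, c("district", "crop")]
selected <- merge(crops, keep, by = c("district", "crop"))

# Productivity = production (tonnes) / cultivated area (hectares).
selected$productivity <- selected$production_t / selected$area_ha
```

Computing the ratio after filtering keeps short, unreliable series out of every later plot and correlation.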


Subdivision - District Mapping data:

The list of meteorological subdivisions and the districts covered by each subdivision was obtained from the India Meteorological Department’s website (http://www.imd.gov.in). This data is used to map subdivision-wise rainfall data to district-wise crop production data. The grid file for the geo facet graph was prepared based on the geographic location of each subdivision in India.
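The district-to-subdivision mapping can be applied with a join followed by an aggregation. The sketch below uses base R; the table and column names are hypothetical and the figures are made up. It assumes that subdivision productivity is total production divided by total area rather than a mean of district productivities, a choice the report does not state.

```r
# Hypothetical mapping table: one row per district.
mapping <- data.frame(
  district    = c("Pune", "Satara", "Nagpur"),
  subdivision = c("Madhya Maharashtra", "Madhya Maharashtra", "Vidarbha")
)

# Hypothetical district-level crop records for one crop and year.
district_crops <- data.frame(
  district     = c("Pune", "Satara", "Nagpur"),
  crop         = "Rice",
  year         = 2010,
  production_t = c(1200, 800, 900),
  area_ha      = c(600, 400, 300)
)

# Attach the subdivision to every district record.
joined <- merge(district_crops, mapping, by = "district")

# Sum production and area within each subdivision, crop and year.
subdivision_crops <- aggregate(
  cbind(production_t, area_ha) ~ subdivision + crop + year,
  data = joined, FUN = sum
)

# Recompute productivity from the aggregated totals (assumption, see above).
subdivision_crops$productivity <-
  subdivision_crops$production_t / subdivision_crops$area_ha
```

Summing before dividing weights each district by its cultivated area, so a small district with an unusual yield does not dominate the subdivision figure.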


Critique of the Existing Visualizations

There has been a wide range of analysis of India’s rainfall pattern changes and their effect on crop productivity. One such paper is “The Impact of Climate Change on Crop Yields in India from 1961 to 2010” by Aravind Moorthy, Wolfgang Buermann, and Deepak Rajagopal, June 12, 2012 (http://hpccc.gov.in/PDF/Agriculture/Climate%20Change%20and%20Crop%20Yields%20in%20India.pdf).

Most of these studies focus on seasonal or yearly rainfall patterns and their effect on crop productivity. There has been limited analysis of the effect of monthly rainfall on crop productivity, which is more important. The visualization provided is static and limited to line graphs, as shown below:

Group4 image1.jpg




Dashboard Design and Visualization Methodology

1. Rainfall Analysis

Rainfall is visualized on its own through two plots: the rainfall geo-facet plot and the rainfall cyclic plot.


Visualization of Rainfall Time Trend through Geo-Facet Plot

The geo-facet plot allows the monthly rainfall pattern of all subdivisions of India to be compared for a selected year. Subdivisions have different rainfall patterns depending on their geographic location. The south-west coastal subdivisions and the north-east subdivisions have a long rainfall season due to the monsoon wind pattern.

Group4 image2.jpg


Visualizing Variability of Monthly Rainfall through Cyclic Plot

The cyclic plot shows monthly rainfall fluctuations over the last 15 years. The cyclic plot below is for the Assam and Meghalaya subdivision. Wide variation in rainfall in every month has an extreme effect on crop productivity.

Group4 image3.jpg
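A cyclic (cycle) plot groups a monthly series by calendar month so that each month's year-to-year fluctuation appears as its own short trend line. A minimal sketch using `monthplot()` from base R's stats package follows. The rainfall values are simulated rather than taken from the IITM data, and the application itself may have used a different plotting function.

```r
set.seed(42)

# Simulated monthly rainfall (mm) for 2000-2014; not real data.
seasonal_mean <- c(15, 30, 80, 170, 300, 480, 500, 400, 300, 150, 30, 10)
rain_mm <- rep(seasonal_mean, times = 15) * runif(180, min = 0.6, max = 1.4)

# Monthly time series starting in January 2000.
rain_ts <- ts(rain_mm, start = c(2000, 1), frequency = 12)

# One sub-series per calendar month; the horizontal bar marks that month's mean.
monthplot(rain_ts, ylab = "Rainfall (mm)")
title(main = "Cycle plot of monthly rainfall (simulated)")

# Mean rainfall per calendar month, matching the horizontal bars.
monthly_means <- tapply(rain_ts, cycle(rain_ts), mean)
```

Reading across the panels compares months with each other, while reading within one panel shows how that month has varied over the 15 years.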


2. Crop Productivity Analysis

In this section, we analyze how crop production is distributed across the subdivisions and districts of India from 2000 to 2014. As India’s crop cultivation is highly dependent on the seasons, we also analyze crop productivity by season. The crop data set provides each crop’s total production and cultivation area. From these two fields we calculated crop productivity as the ratio of production to cultivation area, and used this parameter for our analysis.


Multivariate Analysis using Parallel Coordinates Plot

A parallel coordinate plot allows the user to view high-dimensional data, with categorical and numerical variables visualized together. The plot is created using the R parcoords package. It shows how crop productivity is distributed across crops and seasons for a particular subdivision. The user can select one or more subdivisions and one or more years to make a fair comparison among subdivisions over the years.


Group4 image4.jpg


The parallel coordinate plot gives the user a high-level view of the crop productivity distribution but cannot show detailed information. A combination of tree map and bar plot is well suited to showing crop productivity details at various hierarchical levels.


Hierarchical Data Visualization using Treemap and Barchart

As the crop data has a prominent subdivision and district hierarchy, a treemap is the natural first choice: it gives the user a hierarchical view of the data and lets the user drill down or up.

We used the treemap and d3treeR packages to create an interactive treemap in which the user can hover over and click on a subdivision to drill down to the districts inside it. Clicks on the treemap are used as input to a bar plot, which shows the high-yielding crops for that subdivision or district in descending order of productivity.

Group4 image5a.jpg
Group4 image5b.jpg


3. Rainfall’s Effect on Crop Productivity

After exploring the rainfall and crop data individually to understand them better, we go on to analyze the relationship, if any, between precipitation and crop productivity. In our application we focus especially on the correlation between monthly variations in rainfall and crop productivity. We created a separate page in the application for these graphs, with two visualizations that help the user gain a deeper understanding of the interdependence of these variables.


Rainfall-Crop Productivity Patterns using Geo-Spatial and Diverging Lollipop Plots

The first tab on this page shows a geo-spatial visualization of India’s map on the left side of the page and the corresponding crop productivity values for the subdivisions on the right. Users can select the year and crop to configure the graphs. We used the leaflet package to plot an interactive choropleth that shows the distribution of annual rainfall across the 34 subdivisions of interest. To plot the subdivisions, we obtained a shapefile and added its polygons to the map. The map is interactive: hovering over a subdivision displays a tooltip with the precipitation value for that subdivision.

Group4 image6.jpg


To visualize crop productivity, a diverging lollipop plot was created using ggplot2 with the geom_segment function. This plot displays crop productivity values above and below a reference line of average crop productivity. The user can instantly see which subdivisions have higher production (in tonnes) relative to the cultivated land (in hectares) for the selected crop.

Group4 image7.jpg
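A diverging lollipop plot of this kind can be sketched with ggplot2 as follows. The productivity values are made up, the object and column names are hypothetical, and the styling is only one possible choice; the application's actual code is not shown in this report.

```r
library(ggplot2)

# Hypothetical productivity (tonnes per hectare) for one crop and year.
prod <- data.frame(
  subdivision  = c("Konkan & Goa", "Vidarbha", "Punjab", "Kerala", "Bihar"),
  productivity = c(2.4, 1.1, 3.9, 2.6, 1.8)
)

# Deviation of each subdivision from the average productivity.
avg <- mean(prod$productivity)
prod$deviation <- prod$productivity - avg
prod$direction <- ifelse(prod$deviation >= 0, "Above average", "Below average")

# Order subdivisions by deviation so the stems form a diverging shape.
prod$subdivision <- reorder(prod$subdivision, prod$deviation)

lollipop <- ggplot(prod, aes(x = subdivision, y = deviation, colour = direction)) +
  geom_segment(aes(xend = subdivision, y = 0, yend = deviation)) +  # stems
  geom_point(size = 3) +                                             # heads
  geom_hline(yintercept = 0, linetype = "dashed") +                  # average line
  coord_flip() +
  labs(x = NULL, y = "Productivity relative to average (t/ha)", colour = NULL)

print(lollipop)
```

Plotting the deviation from the mean, rather than the raw value, is what places the reference line at zero and makes the stems diverge to either side.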


On the second tab, we created graphs to visualize the correlation between monthly rainfall and crop productivity. The application provides subdivision and year filters so that users can interactively select the subdivisions and years they wish to view.

As correlation is best viewed on a colour scale that runs from negative to positive values, we used the heatmap function from the plotly package to create a correlation plot as shown. We used the viridis colour scale, which plotly applies by default. The heatmap renders negative correlations in dark blue hues that transition to lighter yellow hues for positive correlations. This gradient gives instant insight into how crop productivity moves with respect to rainfall and which months have an adverse effect on crop productivity.

Group4 image8.jpg
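The correlation behind each heatmap cell can be sketched in base R: for a given subdivision and crop, each month's rainfall across the 15 years is correlated with the yearly crop productivity. The data below are simulated, the object names are hypothetical, and Pearson correlation is an assumption because the report does not name the coefficient used.

```r
set.seed(1)
years <- 2000:2014

# Simulated monthly rainfall (mm): one row per year, one column per month.
rain <- matrix(runif(length(years) * 12, min = 0, max = 400),
               nrow = length(years),
               dimnames = list(years, month.abb))

# Simulated yearly productivity that depends on July rainfall plus noise.
productivity <- 1.5 + 0.004 * rain[, "Jul"] + rnorm(length(years), sd = 0.1)

# Pearson correlation of each month's rainfall with productivity (15 points each).
monthly_cor <- apply(rain, 2, function(r) cor(r, productivity))

# One heatmap row for this subdivision/crop; stack rows to build the full matrix.
cor_row <- matrix(monthly_cor, nrow = 1, dimnames = list("Rice", month.abb))
round(cor_row, 2)
```

A matrix built by stacking such rows, one per crop, could then be passed to plotly, for example as the `z` argument of `plot_ly(type = "heatmap")`. With only 15 points per cell, individual correlations are noisy and should be read as indicative.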


We used plotly’s coupled hover and click events to add further interactivity by capturing the selected data and passing it to another graph. As each correlation is derived from 15 data points of monthly rainfall and crop productivity for the selected subdivision and years, we provide an interactive plot that displays those data points. A bar chart plots precipitation and a line chart plots crop productivity. This plot gives a more granular view through which the correlation can be better understood.

Group4 image9.jpg


Online users would be able to find our application at the following website:

https://wiki.smu.edu.sg/1718t3isss608/Group04_Application



Key Insights


Conclusion/Future Work

Given time constraints and the nature of the data we gathered, this application is limited to showing the correlation between crop productivity and rainfall pattern changes. We cannot conclude that rainfall pattern change causes the change in crop productivity in the Indian agricultural sector, as several other factors affect the cultivation and harvesting of crops in different regions of India, such as temperature, wind and soil, as well as capital and government support.

The application can be improved further by including data on other factors that affect the agricultural sector in India, so that the causes of crop production decline can be identified using analytical techniques and future crop production can be predicted.


Acknowledgement

We would like to extend our gratitude to Dr Kam Tin Seong (Singapore Management University) for his guidance on analytical techniques and R packages, and for his feedback on visualisation techniques. Without his encouragement and technical assistance, this project would not be what it is today.


R Packages Used

We have used the following R packages to come up with our visualizations:

dplyr: A grammar of data manipulation. It is a fast, consistent tool for working with data frame like objects, both in memory and out of memory.

tidyr: Designed specifically for data tidying (not general reshaping or aggregating); works well with 'dplyr' data pipelines.

reshape: Casts a molten data frame into the reshaped or aggregated form you want.

readr: Provides a fast and friendly way to read rectangular data (like 'csv', 'tsv', and 'fwf'). It is designed to flexibly parse many types of data found in the wild, while still cleanly failing when data unexpectedly changes.

ggplot2: A system for 'declaratively' creating graphics. You provide the data, tell 'ggplot2' how to map variables to aesthetics and what graphical primitives to use, and it takes care of the details.

plotly: Easily translate 'ggplot2' graphs to an interactive web-based version and/or create custom web-based visualizations directly from R.

sunburstR: Make interactive 'd3.js' sequence sunburst diagrams in R with the convenience and infrastructure of an 'htmlwidget'.

crosstalk: Provides building blocks for allowing HTML widgets to communicate with each other, with Shiny or without (i.e. static .html files).

geofacet: Provides geofaceting functionality for 'ggplot2'. Geofaceting arranges a sequence of plots of data for different geographical entities into a grid that preserves some of the geographical orientation.

rgdal: Bindings for the 'Geospatial' Data Abstraction Library.

leaflet: Library to create interactive web maps with the JavaScript 'Leaflet' library.

shiny: Web application framework for R.

shinythemes: Themes for use with Shiny. Includes several Bootstrap themes.

shinydashboard: Create dashboards with 'Shiny'. This package provides a theme on top of 'Shiny', making it easy to create attractive dashboards.

