Difference between revisions of "Group18 Report"

From Visual Analytics and Applications
 
(16 intermediate revisions by 3 users not shown)
Line 69: Line 69:
  
 
<!--DATA CLEANING-->  
 
<!--DATA CLEANING-->  
 
 
  
 
<!--DESIGN FRAMEWORK-->  
 
<!--DESIGN FRAMEWORK-->  
Line 76: Line 74:
 
<font size = 5><span style="font-family:Century Gothic;">Design Framework</span></font>  
 
<font size = 5><span style="font-family:Century Gothic;">Design Framework</span></font>  
 
</div>
 
</div>
<font size = 4><span style="font-family:Century Gothic;font-weight:500;padding-bottom:1px; border-bottom: solid 1px black;">1. Interface:</span></font>
 
 
In designing the framework and visualizations, we followed an iterative process of design, development and visualization, adding granular detail and interactive features so that the application conveys the truth hidden in the data. Following the principle of overview first, zoom and filter, then details on demand, the application lets the user work stage by stage through time series, geospatial and relationship analyses to understand the occurrence of crimes against women in India.
  
Line 82: Line 79:
  
 
Time series plots give the user an overview of the historical pattern of crime occurrences across states, with the option to drill down to district level for a selected state and crime type.
 
5.1.1. Geofacet and Facets graphs
 
 
The geofacet package was used to show the changing trend in the number of crime cases at state level over a period of 14 years. It arranges one facet per state in a grid that approximates each state's geographical position. By selecting a crime type and a y-axis scaling type, the user can see the crime trends for all states in a single view. The user can also select a particular state to view its trend in detail in a separate interactive plot, which displays the data labels on hover.
 
 
        Figure : Geofacet Plot – State Level
 
 
For district-level crime patterns, geofacet had no predefined district grid for the Indian states. The alternative was to position the district facets for each state manually, which would have been very tedious with 640 districts in total. We therefore used the ggplot2 package to place the district facets adjacent to one another to show historical crime trends. The user can select the state and crime type of interest to view the district facets with their respective trends.
 
 
 
Figure : Facet Plot – District Level
 
 
5.1.2. Slope graph
 
 
Slope graphs were used to show change between two fixed years, in our case 2004 and 2014. For the crime type selected by the user, the graph shows whether each state has a rising or falling trend between the two years. The ggplot2 package was used because it makes slope graphs easy to build and to customize aesthetically. The values ranged widely across states, and the labels of states with similar values were cluttered towards the bottom of the plot. To avoid this, the data were log-transformed before plotting with ggplot2, which compressed the range of values and spaced out the state labels along the y-axis. A summary table alongside the slope graph gives the actual number of crime occurrences in 2004 and 2014 for each reporting state.
 
 
 
Figure : Slope Graph
 
  
 
1.2 Geo-Spatial Analysis
Line 106: Line 85:
  
 
Because absolute crime counts do not show how a region's crime level compares with the country as a whole, a new layer parameter, the location quotient (LQ), was derived. The LQ quantifies how concentrated a particular crime is in a region compared with the nation. It accounts for both the region's share of national crime occurrences and its share of the national population, giving a relative statistical measure. The LQ for each state and a specific crime is derived using the formula below:
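As an illustration only, the following Python sketch assumes the common population-based form of the location quotient, LQ = (regional crime / national crime) / (regional population / national population), which matches the description above. The report's own computation was done in R; the function name, variable names and figures here are hypothetical.

```python
def location_quotient(region_crime, region_pop, ref_crime, ref_pop):
    """Region's share of reference-area crime divided by its share of population."""
    if ref_crime <= 0 or ref_pop <= 0 or region_pop <= 0:
        raise ValueError("totals and populations must be positive")
    crime_share = region_crime / ref_crime
    pop_share = region_pop / ref_pop
    return crime_share / pop_share


# Hypothetical figures, for illustration only (not taken from the report's data).
states = {
    "State A": {"crime": 300, "pop": 1_000_000},
    "State B": {"crime": 100, "pop": 1_000_000},
}
national_crime = sum(s["crime"] for s in states.values())
national_pop = sum(s["pop"] for s in states.values())

lq = {
    name: location_quotient(s["crime"], s["pop"], national_crime, national_pop)
    for name, s in states.items()
}
print(lq)
```

Under this assumed form, an LQ above 1 means the crime is more concentrated in the region than its population share would suggest, and an LQ below 1 means it is less concentrated. The same function applies to a district measured against a reference area by passing the district and reference totals.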
 
 
   
 
   
 
 
As with the state-level LQ, the parameter is also derived at district level using the formula below, giving a clearer picture of crime occurrences for comparison between the districts of a state.
 
 
   
 
   
 
 
The geospatial plots show both the absolute crime values and the location quotient values on the map of India, as described in the sections below. The Indian spatial file was imported into R as a simple feature data frame using the st_read function of the sf package.
  
5.2.1. Choropleth Plot
+
1.1.1 Geofacet and Facets graphs
 
 
A choropleth plot is a thematic map in which regions are shaded in proportion to the statistical variable displayed. In our case, the absolute crime values and the location quotient are shown in two district-level choropleth plots placed side by side, so that the absolute crime occurrence and the relative LQ measure of each region can be compared directly.
 
 
 
The Indian district-level spatial data are merged with the crime data using state and district as join keys. Both keys are required because district names are unique only within a state. For the location quotient, five ranges were binned manually to distinguish regions with location quotients below and above 1. The maximum and minimum values were used to create two bins before and after the location quotient value of 1. Using the tmap package, the choropleth for absolute crime values was created with five bins using the quantile style, whereas for the LQ the breaks were set manually to give a clearer visualization.
 
 
 
Both maps were rendered using the tmap and leaflet packages in R, which add interactivity to the dashboard such as zooming in and out and pop-ups showing the crime occurrence and LQ values when hovering over a district. By selecting a state, crime type and year, the user can see the geospatial distribution of crime occurrences across all districts of the state for both the absolute and LQ measures. A choropleth plot of the state-level LQ values on the map of India was also created for the user-selected crime type and year. These maps help to identify hotspots of specific crimes against women at state and national level.
 
 
 
Insert District Level chloropleth graph for Bihar rape 2014
 
 
 
 
 
 
 
 
 
 
 
  
 +
The geofacet package was used to show the changing trend in the number of crime cases at state level over a period of 14 years. It arranges one facet per state in a grid that approximates each state's geographical position. By selecting a crime type and a y-axis scaling type, the user can see the crime trends for all states in a single view. The user can also select a particular state to view its trend in detail in a separate interactive plot, which displays the data labels on hover.
 +
 +
For district-level crime patterns, geofacet had no predefined district grid for the Indian states. The alternative was to position the district facets for each state manually, which would have been very tedious with 640 districts in total. We therefore used the ggplot2 package to place the district facets adjacent to one another to show historical crime trends. The user can select the state and crime type of interest to view the district facets with their respective trends.
 +
[[File: Timeseries.jpg|700px|centre]]
  
  
  
 +
1.1.2. Slope graph
  
 +
Slope graphs were used to show change between two fixed years, in our case 2004 and 2014. For the crime type selected by the user, the graph shows whether each state has a rising or falling trend between the two years. The ggplot2 package was used because it makes slope graphs easy to build and to customize aesthetically. The values ranged widely across states, and the labels of states with similar values were cluttered towards the bottom of the plot. To avoid this, the data were log-transformed before plotting with ggplot2, which compressed the range of values and spaced out the state labels along the y-axis. A summary table alongside the slope graph gives the actual number of crime occurrences in 2004 and 2014 for each reporting state.
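The effect of the log transformation on label spacing can be illustrated with a small Python sketch. It is not the report's R code: the counts are hypothetical, base 10 is an assumption (the source does not state the base), and the handling of zero counts is also an assumption.

```python
import math


def log10_transform(counts):
    """Return log10 of each count; non-positive counts are skipped (log undefined)."""
    return {state: math.log10(n) for state, n in counts.items() if n > 0}


def relative_gap(values, a, b):
    """Gap between two states as a fraction of the full plotted range."""
    span = max(values.values()) - min(values.values())
    return abs(values[a] - values[b]) / span


# Hypothetical counts: two small states and one very large one.
counts_2014 = {"State A": 12, "State B": 15, "State C": 3400}
logged = log10_transform(counts_2014)

# On the raw scale the two small states sit almost on top of each other;
# on the log scale they take up a visibly larger share of the axis.
print(relative_gap(counts_2014, "State A", "State B"))
print(relative_gap(logged, "State A", "State B"))
```

Because the log scale hides the actual magnitudes, a table of raw values remains useful next to the plot.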
 +
[[File: slope.PNG|500px|centre]]
  
 +
1.2.1. Choropleth Plot
  
 +
A choropleth plot is a thematic map in which regions are shaded in proportion to the statistical variable displayed. In our case, the absolute crime values and the location quotient are shown in two district-level choropleth plots placed side by side, so that the absolute crime occurrence and the relative LQ measure of each region can be compared directly.
  
 +
The Indian district-level spatial data are merged with the crime data using state and district as join keys. Both keys are required because district names are unique only within a state. For the location quotient, five ranges were binned manually to distinguish regions with location quotients below and above 1. The maximum and minimum values were used to create two bins before and after the location quotient value of 1. Using the tmap package, the choropleth for absolute crime values was created with five bins using the quantile style, whereas for the LQ the breaks were set manually to give a clearer visualization.
 +
[[File: District Chloropleth.PNG|700px|centre]]
 +
Both maps were rendered using the tmap and leaflet packages in R, which add interactivity to the dashboard such as zooming in and out and pop-ups showing the crime occurrence and LQ values when hovering over a district. By selecting a state, crime type and year, the user can see the geospatial distribution of crime occurrences across all districts of the state for both the absolute and LQ measures. A choropleth plot of the state-level LQ values on the map of India was also created for the user-selected crime type and year. These maps help to identify hotspots of specific crimes against women at state and national level.
  
  
 
+
1.2.2. Cartogram
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
5.2.1. Cartogram
 
  
 
The cartogram package in R was selected because a cartogram is a distinctive type of map that combines statistical information with geographic location. An area cartogram resizes each region in proportion to a measured variable. Cartograms are commonly used to portray geographic or social data, such as the populations of the countries of the world.
Line 158: Line 121:
 
Because the other cartogram variations take a long time to process, the Shiny app was built using the Dorling cartogram. The plot represents the crime occurrences in each state by the size of a circle placed at the corresponding geographical location. The cartogram of absolute crime values and the choropleth of location quotients showcase two different kinds of geospatial plot at state level. The two plots are placed next to each other and synchronized using the sync function of the mapview package, so that the user can hover over a location and read both the absolute crime occurrence and the location quotient.
  
Insert state level Rape 2014, chloropleth and cartogram side by side
+
[[File: India Chloropleth.jpg|700px|centre]]
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
Three variations of the cartogram can be built in R: contiguous, non-contiguous and non-overlapping circles (Dorling) area cartograms. The contiguous cartogram is built over a specified number of iterations, which makes it slower to process and render. Because the Shiny web application must handle many combinations of user inputs such as crime type and year, the Dorling circle cartogram was chosen to avoid a rendering delay each time the inputs change.
  
 
    
 
    
Figure: Contiguous and Non-contiguous cartograms
+
1.3 Relationships Analysis
  
+
1.3.1.1 Funnel Plots
      Figure: Non-overlapping circles (dorling) cartograms
 
 
 
5.3 Relationships Analysis
 
 
 
5.3.1. Funnel Plots
 
  
 
The funnel plot lets the user detect variation in crime incidence across states for a selected crime type. It is a statistical scatter plot that shows the level of a crime type against the total crime level in each state. The Funnel R package was used to produce a funnel plot of each crime type against total crime for each state. It takes the number of crimes of a particular type and the total number of crimes in each state as inputs, evaluates the z-score and plots the 80% and 95% confidence limits. A new variable, the total number of crimes against women in each state, was created using the dplyr package. As the data for some states were skewed, they were log-transformed before plotting. ggplot2 was then used to improve the aesthetics of the funnel plot, such as labelling the states above the upper bound and repelling the label text to avoid overlaps.
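The idea behind the control limits can be sketched in Python. The source does not state how the R package computes its limits, so this sketch assumes a normal approximation to the binomial around the pooled rate, and it omits the log transformation mentioned above. All names and figures are hypothetical.

```python
from math import sqrt
from statistics import NormalDist


def funnel_limits(total, overall_rate, level):
    """Two-sided limits for a proportion at a given confidence level (assumed method)."""
    z = NormalDist().inv_cdf(0.5 + level / 2)  # about 1.28 for 80%, 1.96 for 95%
    se = sqrt(overall_rate * (1 - overall_rate) / total)
    return max(0.0, overall_rate - z * se), min(1.0, overall_rate + z * se)


def flag_outliers(data, level=0.95):
    """Return the states whose crime-type share lies above the upper limit."""
    overall = sum(c for c, _ in data.values()) / sum(t for _, t in data.values())
    flagged = []
    for state, (count, total) in data.items():
        _, upper = funnel_limits(total, overall, level)
        if count / total > upper:
            flagged.append(state)
    return flagged


# Hypothetical (crime-type count, total crimes) per state.
data = {
    "State A": (50, 1000),
    "State B": (210, 3000),
    "State C": (400, 2000),
}
print(flag_outliers(data))
```

The limits narrow as the total number of crimes grows, which gives the plot its funnel shape: a state with a large total needs only a small deviation from the pooled rate to fall outside the limits.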
  
+
[[File: funnel.PNG|700px|centre]]
  
 
The user interface lets the user select the year and crime type of interest to identify extreme outliers, namely states that lie above the upper 95% confidence limit.
  
5.3.2. Choropleth Plot
+
1.3.1.2. Choropleth Plot
  
 
A choropleth plot was used in addition to the funnel plots to present the funnel plot findings in geospatial form. It shows the outliers above the 95% confidence limit, shaded in their respective geographical positions according to their intensity. This choropleth helps the user look for patterns or influences among the outliers identified for each crime type across the nation.
 
    
 
    
  
5.3.3. Clustered Heat Maps  
+
1.3.2. Clustered Heat Maps  
  
 
A clustered heat map shows variance across multiple variables, revealing patterns, showing whether variables are similar to one another and indicating whether correlations exist between them. It was used to identify correlations between crime types such as rape, kidnapping, domestic violence and dowry death, together with total crime, literacy rate and sex ratio, across the states of India. The heatmaply package provides a user-friendly interactive clustered heat map, with tooltips showing values when hovering over cells and the ability to zoom into specific sections of the figure from the data matrix, the side dendrograms or the annotated labels. The user can calibrate the clustered heat map by selecting the year, number of clusters, type of data transformation and hierarchical clustering algorithm to visualize how external factors contribute to crime occurrences across states. The plot also shows clusters of states with similar patterns of crime types and external factors for a selected year. A summary table is provided so that the user can look up the actual values of each crime type and external factor used in the analysis.
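Case counts, literacy rate and sex ratio are measured on very different scales, which is why a transformation is usually applied before clustering. The source does not list which transformations the app offers; min-max normalization and z-score scaling are two common choices, sketched here in Python with hypothetical values rather than the report's R code.

```python
from statistics import mean, pstdev


def normalize(column):
    """Min-max rescale a column to the range [0, 1]."""
    lo, hi = min(column), max(column)
    if hi == lo:
        return [0.0 for _ in column]
    return [(x - lo) / (hi - lo) for x in column]


def scale(column):
    """Centre a column on its mean and divide by its population standard deviation."""
    m, sd = mean(column), pstdev(column)
    if sd == 0:
        return [0.0 for _ in column]
    return [(x - m) / sd for x in column]


# Hypothetical values for three states.
literacy_rate = [60.0, 75.0, 90.0]
rape_cases = [1500, 300, 900]

# After normalization both variables lie on the same 0 to 1 scale,
# so neither dominates the distance used for clustering.
print(normalize(literacy_rate))
print(normalize(rape_cases))
print(scale(rape_cases))
```

Min-max normalization keeps the shape of each variable's distribution, while z-score scaling expresses each value as a distance from the mean, so the choice can change which states cluster together.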
  
<!--Choropleth Map-->
+
[[File: heatmap.PNG|700px|centre]]
<font size = 3><span style="font-family:Century Gothic;font-weight:500;padding-bottom:1px;padding-left:4px;">2.1. Choropleth World Map:</span></font>
 
 
 
<div style="padding-top:3px;">
 
[[File:ADA Report 2.JPG |1000px|center|border]]<br/>
 
<center style="font-size:13px;"> <u>''Figure 8''</u> </center>
 
</div>
 
 
 
A choropleth world map is used to depict temperatures worldwide on a yearly basis. Darker colours represent countries with higher temperatures and lighter colours countries with lower temperatures. The map above shows average temperatures for 86 countries in 1990. While this visualization gives a quick overview of countries facing high or low temperatures, it does not show the trend in temperatures over the years.
 
 
 
<b><i>Features of the Choropleth world map:</i></b>
 
 
 
* Heat map of the world showing average annual temperature for all of the countries for each year
 
 
 
* The colours here represent the range temperature in degrees Celsius
 
 
 
* The countries that are grey here are countries that are not in the current scope of this application
 
 
 
* By clicking on each country, we can see the name of each country and the average temperature for the chosen year
 
 
 
* Using the slider at the bottom, we can drag it across to see how temperatures have changed for the countries over time
 
 
 
* Gives us an overall view of temperatures for each year for the world as a whole
 
<!--Choropleth Map-->
 
 
 
 
 
 
 
<!--Geofacets-->
 
<font size = 3><span style="font-family:Century Gothic;font-weight:500;padding-bottom:1px;padding-left:4px; ">2.2. Geofacet Maps:</span></font>
 
 
 
<div style="padding-top:3px;">
 
[[File:ADA Report 3.JPG |1000px|center|border]]<br/>
 
<center style="font-size:13px;"> <u>''Figure 9''</u> </center>
 
</div>
 
 
 
An alternative visualization using R's geofacet package is used for continent-level analysis. For each continent, the package plots countries in their respective geographical positions and shows the trend in temperatures over the years. With this visualisation we have taken the geographical representation of temperature data a step further using R's relatively new geofacet package.
 
 
 
<b><i>Features of the Geofacet maps:</i></b>
 
* Each of the faceted plots are spatially placed in a grid in locations respective to the actual geographic locations of each country on a world map
 
 
 
* The lines represent the change in temperature over the years
 
 
 
* The first part of the blue line, from 1990 to 2012, shows the actual temperature values
 
 
 
* The envelope represents forecasted temperature values with 95% confidence interval for the subsequent 10 years for each country
 
 
 
* Next to each of the geofacet maps, a data table is shown which highlights further information on temperature values for each country with the minimum and maximum temperature. The purpose of the table is to show the variation of temperatures over time in the temperature deviation column
 
 
 
* This dashboard allows us to gain insight into the change in temperatures for countries in a continent relative to each other
 
<!--Geofacets-->
 
 
 
 
 
 
 
<!--Parallel Coordinate Plots-->
 
<font size = 3><span style="font-family:Century Gothic;font-weight:500;padding-bottom:1px;padding-left:4px; ">2.3. Parallel Coordinates Plot:</span></font>
 
 
 
<div style="padding-top:3px;">
 
[[File:ADA Report 4.JPG |1000px|center|border]]<br/>
 
<center style="font-size:13px;"> <u>''Figure 10''</u> </center>
 
</div>
 
 
 
A parallel coordinates plot is used for country-level analysis by consolidating all factors in one visualization. This chart allows comparison of all values in one view. The plot supports interactive highlighting to visually analyze and compare variables. The user can also drag the variables and rearrange them in any preferred order.
 
 
 
<b><i>Features of the Parallel Coordinates Plot:</i></b>
 
* Each factor mapped to countries with a line
 
 
 
* The ability to highlight a subset of one of the variable axes, for example, highlighting countries with high temperature to see if any trends can be seen across the other variables
 
 
 
* Use of a slider at the bottom to see how the values for countries changed over time
 
 
 
* The benefit of this visualisation is the ability to see all the variables in one place and to determine whether any trends or relationships can be seen across variables



* The ability to highlight certain areas of the axes means that we can focus on our key areas of concern.
 
<!--Parallel Coordinates Plot-->
 
 
 
 
 
 
 
<!--Tree Map & Bubble Plot-->
 
<font size = 3><span style="font-family:Century Gothic;font-weight:500;padding-bottom:1px;padding-left:4px; ">2.4. Treemap & Bubble Plot:</span></font>
 
 
 
<div style="padding-top:3px;">
 
[[File:ADA Report 5.JPG |1000px|center|border]]<br/>
 
<center style="font-size:13px;"> <u>''Figure 11''</u> </center>
 
</div>
 
 
 
 
 
This tab in the dashboard shows two visualizations alongside each other for country-level analysis. The tree map on the left represents average annual temperature in a geographically hierarchical manner. Clicking on a continent shows the different temperatures across the countries in that continent. On the right is the bubble plot representing four different factors.
 
 
 
<b><i>Features of the Treemap & Bubble Plot:</i></b>
 
 
 
* On the x axis we have our influencers, these tabs across the top allow you to change the x axis based on your area of interest
 
 
 
* The y axis represents temperature across all graphs
 
 
 
* The size of each bubble represents the emissions for each of the countries
 
 
 
* The colour of each bubble represents the country itself
 
 
 
The main advantage of this dashboard is the ability to see the relationship between the influencers, the emissions and our effect, which is temperature; this was the key aim in building this app. It also sets us apart from other visualisations: while drilling down into the actual temperature of a country, you can also quickly get an idea of where that country lies in terms of its consumption or usage of the influencers and its emissions.
 
<!--Tree Map & Bubble Plot-->
 
 
 
<!--DESIGN FRAMEWORK-->
 
 
 
<!--DEMONSTRATION: USE CASES-->
 
<div style="text-align:center;vertical-align:bottom;">
 
<font size = 5><span style="font-family:Century Gothic;">Demonstration: Sample Use Cases</span></font>
 
</div>
 
 
 
The purpose of this section is to provide important demonstrative examples of the usage of this application. There are many possible use cases depending on the areas of interest/concerns of the targeted audience. However, some important sample use cases for this application are as follows.
 
 
 
<font size = 4><span style="font-family:Century Gothic;font-weight:500;padding-bottom:1px; border-bottom: solid 1px black;">1. Comparison of Europe and Africa using Geofacet Maps:</span></font>
 
 
 
Rising temperatures are seen for countries worldwide even though the rate of change is geographically inconsistent. For example, for some countries in Africa, the increase in temperature is more rapid than for countries in Europe.
 
 
 
<div style="float:left;width:50%">
 
[[Image:ADA Report 6.JPG|500px|center|border]]<br/>
 
<center style="font-size:13px;"> <u>''Figure 12''</u> </center>
 
</div>
 
<div style="float:left;width:50%">
 
[[Image:ADA Report 7.JPG|570px|center|border]]<br/>
 
<center style="font-size:13px;"> <u>''Figure 13''</u> </center>
 
</div>
 
 
 
<font size = 4><span style="font-family:Century Gothic;font-weight:500;padding-bottom:1px; border-bottom: solid 1px black;">2. Country-level analysis for China using Bubble Plot</span></font>
 
 
 
Developing countries such as China show a consistent increase in consumption of electricity and fossil fuels. China has also seen an exponential increase in greenhouse gas emissions (currently the largest contributor to greenhouse gases).
 
 
 
<div style="float:left;width:50%">
 
[[Image:ADA Report 8.JPG|530px|center|border]]<br/>
 
<center style="font-size:13px;"> <u>''Figure 14''</u> </center>
 
</div>
 
<div style="float:left;width:50%">
 
[[Image:ADA Report 9.JPG|550px|center|border]]<br/>
 
<center style="font-size:13px;"> <u>''Figure 15''</u> </center>
 
</div>
 
 
 
 
 
<font size = 4><span style="font-family:Century Gothic;font-weight:500;padding-bottom:1px; border-bottom: solid 1px black;">3. Drawing relationships between variables in Asian Countries</span></font>
 
 
 
Countries in the continent of Asia have high temperatures but low electricity consumption and comparatively high renewable energy adoption. The user can also drag the variables and rearrange them in any preferred order.
 
 
 
<div style="padding-top:3px;">
 
[[File:ADA Report 10.JPG |1000px|center|border]]<br/>
 
<center style="font-size:13px;"> <u>''Figure 16''</u> </center>
 
</div>
 
<!--DEMONSTRATION: USE CASES-->
 
 
 
 
 
 
 
<!--DISCUSSION-->
 
<div style="text-align:center;vertical-align:bottom;">
 
<font size = 5><span style="font-family:Century Gothic;">Discussion</span></font>
 
</div>
 
 
 
Climate change has many aspects, including rising sea levels, melting ice caps and erratic rainfall, but the one factor everyone thinks of is the rise in temperatures. This is the focus of our application. Existing visualisations related to global warming, whether static or interactive, tend to focus on individual environmental hazards such as increased rates of carbon emissions or the rapid rise in temperature. Our analysis attempts to connect the dots to better understand the cause-and-effect nature of global warming. Through our visualisations, we depict the causal relationship between the factors that contribute to greenhouse gas emissions and the resulting increase in temperature from 1990 to 2012. Furthermore, we attempt to forecast the causal effects and the net rise in temperature for the ten subsequent years to better understand the variation in each factor over time.
 
 
 
Our application, which we have named GRIT (Global Rise in Temperature), is a consolidated application that allows us to map this rise in temperature to its causes and to observe patterns at the world, continent and country levels.
 
 
 
The main advantage of this dashboard is that it shows visually the relationship between the influencers, the emissions and the effect, which is temperature; this was the key aim in building the app. It also sets the app apart from other visualisations: while drilling down into the actual temperature of a country, you can quickly see where that country stands in its consumption or usage of the influencers and in its emissions.
 
<!--DISCUSSION-->
 
 
 
 
 
 
<!--FUTURE SCOPE-->
 
<div style="text-align:center;vertical-align:bottom;">  
 
<font size = 5><span style="font-family:Century Gothic;">Future Scope</span></font>  
 
</div>
 
 
<font size = 4><span style="font-family:Century Gothic;font-weight:500;padding-bottom:1px; border-bottom: solid 1px black;">1. Additional Functions:</span></font>
 
 
With the foundation of the application in place, the following <b>additional functions</b> can be added:


* Adding more countries


* Adding more years of data, both to improve forecasting results and to include more up-to-date actual data


* Extending the app to city/state level. City-level data was available in the temperature dataset, but not in the World Bank datasets at this time


* What-if analysis in the context of the different eco-friendly strategies that countries are beginning to implement to combat the impact of global warming


* Increasing interactivity across the tabs of the application to allow a more seamless experience
 
 
* Allowing users to upload their own datasets in the future would allow them to consider other influencer/emission/output factors for exploration
 
 
* Along with forecasting we would also like to create a prediction model that draws relationships between the influencers and the corresponding effect on temperature individually
 
 
  
<font size = 4><span style="font-family:Century Gothic;font-weight:500;padding-bottom:1px; border-bottom: solid 1px black;">2. Real World Use Cases</span></font> 

In the context of the complete application including the future functionalities, the application can be used by:

* Climate/temperature analysts as this is a growing concern anyway
 
 
* Educators raising environmental awareness, especially among global warming naysayers
 
 
 
* Governments planning their environmental budgets based on forecasted values
 
 
<!--FUTURE SCOPE-->
 
  
<!--INSTALLATION GUIDE-->
<div style="text-align:center;vertical-align:bottom;">

<font size = 5><span style="font-family:Century Gothic;">Installation Guide</span></font> 

</div>

<font size = 4><span style="font-family:Century Gothic;font-weight:500;padding-bottom:1px; border-bottom: solid 1px black;">1. Explore the Live Application:</span></font>

As an end user of the application, you can explore the GRIT application to perform your own analysis of the rise in global temperatures.

[https://angad-sr.shinyapps.io/isss608-group_2-grit/ Click here] to explore the live application.
 
 
<font size = 4><span style="font-family:Century Gothic;font-weight:500;padding-bottom:1px; border-bottom: solid 1px black;">2. Installation Process:</span></font>
 
 
 
<font size = 3><span style="font-family:Century Gothic;font-weight:500;padding-bottom:1px;padding-left:4px; ">2.1. System Requirements</span></font>
 
 
 
Your local system should have RStudio installed. Due to the packages used in this application, the minimum RStudio version is 1.0.143.
 
 
 
[https://www.rstudio.com/products/rstudio/download/ Click here] to download the latest version of RStudio.
 
 
 
 
 
<font size = 3><span style="font-family:Century Gothic;font-weight:500;padding-bottom:1px;padding-left:4px; ">2.2. Download the Source Code</span></font>
 
<div style="float:right; width:15%;display:block;">
 
[[Image:ADA Report 11.JPG|150px|center|border]]<br/>
 
<center style="font-size:13px;"> <u>''Figure 17''</u> </center>
 
</div>
 
For R enthusiasts and fellow-coders who are interested in downloading the source code of the application, all relevant code files for the GRIT application are available on <b>GitHub</b>.
 
 
 
[https://github.com/angad-sr/ISSS608-Group-2---Visual-Analytics-Project Click here] to view and download the application code files.
 
 
 
After downloading the source files, open the <b>app.R</b> file in RStudio to explore the source code. Click the <i>Run App</i> button to run the application on your local machine.
 
 
 
 
 
<font size = 4><span style="font-family:Century Gothic;font-weight:500;padding-bottom:1px; border-bottom: solid 1px black;">3. Deployment Process</span></font>
 
 
 
After downloading the source files as explained in the previous section, you can host the R Shiny application on your own server. shinyapps.io provides free hosting for this purpose. The steps to deploy the application on shinyapps.io are as follows:
 
 
 
* [http://www.shinyapps.io/ Visit Shiny Apps] and sign up for a free account, which allows you to host up to 5 applications.
 
 
 
* After you sign up, Shiny Apps provides a personal token and secret, which are used to deploy the application to your account.
 
 
 
* In RStudio, execute the following code snippet:
 
<div style="padding-left:5%;"><i>
 
install.packages('rsconnect')



library(rsconnect)



rsconnect::setAccountInfo(name="Your account name", token="Your Personal Token", secret="Your Secret Number")</i></div>
 
 
 
<div style="float:right; width:15%;display:block;">
 
[[Image:ADA Report 12.JPG|150px|center|border]]<br/>
 
<center style="font-size:13px;"> <u>''Figure 18''</u> </center>
 
</div>
 
 
 
 
 
* Click on the Publish icon in RStudio to upload and deploy the application to the Shiny Apps Server.
 
* After entering an appropriate name for your application, the GRIT application will be deployed and hosted to your Shiny Apps server.
 
<!--INSTALLATION GUIDE-->
 
 
 
 
 
<!--USER GUIDE-->
 
<div style="text-align:center;vertical-align:bottom;">
 
<font size = 5><span style="font-family:Century Gothic;">User Guide</span></font>
 
</div>
 
 
 
Please refer to the User Guide file below for a brief overview on how to use the GRIT application:
 
 
 
[[File: GRIT - User Guide.pdf|GRIT: User Guide ]]
 
 
 
 
 
<!--REFERENCES-->
 
<div style="text-align:center;vertical-align:bottom;">  
 
<font size = 5><span style="font-family:Century Gothic;">References</span></font>
 
</div>
 
*[http://data.worldbank.org/indicator/EG.USE.COMM.FO.ZS?view=chart Fossil fuel energy consumption (% of total)]

*[http://data.worldbank.org/indicator/AG.LND.FRST.K2?view=chart Forest area (sq. km)]

*[http://data.worldbank.org/indicator/EG.USE.ELEC.KH.PC?view=chart Electric power consumption (kWh per capita)]

*[http://data.worldbank.org/indicator/EG.FEC.RNEW.ZS?view=chart Renewable energy consumption (% of total final energy consumption)]

*[http://data.worldbank.org/indicator/SP.POP.TOTL?view=chart Population, total]

*[http://data.worldbank.org/indicator/EN.ATM.GHGT.KT.CE?view=chart Total greenhouse gas emissions (kt of CO2 equivalent)]

*[https://www.kaggle.com/berkeleyearth/climate-change-earth-surface-temperature-data Climate Change: Earth Surface Temperature Data]

*[https://stackoverflow.com/questions/17514648/how-do-i-name-the-row-names-column-in-r Stack Overflow: How do I name the row names column in r]

*[https://rdrr.io/cran/data.table/man/na.omit.data.table.html Removing rows with missing values]

*[https://stackoverflow.com/questions/9617348/reshape-three-column-data-frame-to-matrix-long-to-wide-format Stack Overflow: Reshape three column data frame to matrix (long to wide format)]

*[https://stackoverflow.com/questions/28782654/r-finding-unmatched-column-names-of-data-frames Stack Overflow: R-finding unmatched column names of data frames]

*[https://stackoverflow.com/questions/4350440/split-a-column-of-a-data-frame-to-multiple-columns Stack Overflow: Split a column of a data frame to multiple columns]

*[https://stackoverflow.com/questions/6289538/aggregate-a-dataframe-on-a-given-column-and-display-another-column Stack Overflow: Aggregate a dataframe on a given column and display another column]

*[https://stackoverflow.com/questions/34591329/remove-white-space-from-a-data-frame-column-and-add-path Stack Overflow: Remove white space from a data frame column and add path]

*[https://stackoverflow.com/questions/21641522/how-to-remove-specific-special-characters-in-r Stack Overflow: How to remove specific special characters in R]

*[http://r.789695.n4.nabble.com/concatenating-2-text-columns-in-a-data-frame-td881819.html Concatenating 2 text columns in a data.frame]

*[https://stackoverflow.com/questions/26896971/add-space-between-two-letters-in-a-string-in-r Stack Overflow: Add space between two letters in a string in R]

*[https://github.com/lukes/ISO-3166-Countries-with-Regional-Codes/blob/master/all/all.csv ISO 3166 Countries with Regional Codes]

*[https://www.rforexcelusers.com/vlookup-in-r/ How to do VLOOKUP in R]

*[https://stackoverflow.com/questions/25045496/r-counting-the-number-of-matches-between-multiple-data-frames Stack Overflow: R: Counting the number of matches between multiple data frames]

*[https://hafen.github.io/geofacet/ Geofacet]

*[http://www.buildingwidgets.com/blog/2015/1/30/week-04-interactive-parallel-coordinates-1 Interactive Parallel Coordinates]

*[http://www.r-graph-gallery.com/237-interactive-treemap/ Interactive Treemap]
 
<!--REFERENCES-->
 

Latest revision as of 11:45, 16 August 2018

A sanctuary for women – Is there one?


 

Motivation

Women in India face challenges to living safely in the democratic country from the time of birth. Cultural differences and peculiarities, male domination, a skewed sex ratio, age-old customs such as Sati and dowry, and the lower status women hold in society leave them at higher risk of becoming victims of violence. A myth has been created across the world that India is unsafe for women and for travel. Through our research we analyse the situation and the crime hotspots at the state and district levels, and we aim to provide a holistic perspective of crimes against women in India using time series, geospatial and relationship analytics for better insight into crime occurrence in India. This research aims to capture the following analysis:

a) Create a user-friendly, interactive visualization platform for data exploration and trend analysis of crime patterns against women over the years 2001 to 2014 at the state and district levels.

b) Visualize the geospatial distribution of the number of incidents and of location quotient variations at the state and district levels for a specific year and crime type in India.

c) Analyse the relationship between a specific crime, total crime against women and the influence of social factors across the states of the country.


Review and Critique of Prior Work

Tinniam V Ganesh (2015) developed a Shiny app to visualize crime against women in India for the period 2001 to 2012, using a choropleth map and a linear model to project future incidences of crime in each state. The app allows the user to select the year and crime type for the choropleth map. Similarly, the Open Government Data (OGD) Platform India used a choropleth map to visualize total crime against women in each state for the year 2013. The areas in the choropleth map were shaded in proportion to the absolute count of crimes in each state for the selected year. Shading by absolute count obscures the true crime incidence in a region relative to its population: a state with a smaller population can be expected to record fewer crimes. Using the proportion of crime in a state relative to its population gives more intuitive information for understanding crime patterns across the states of India. Our work addresses this issue by using a cartogram and the location quotient, which considers the proportion of crime relative to population in each state, to shade areas of the map.


Data Cleaning, Preparation and Modeling

The Indian women crime dataset was obtained from the official website of the National Crime Records Bureau (NCRB), Government of India. The crime data is supplemented with the Indian census data for 2011 to provide a holistic view of the relationship between crime and social factors at the district level across the country. The shapefiles for India were taken from the GADM website, with administrative layers 1 and 2 for states and districts respectively.

The complete dataset consists of 8629 observations of crimes against women recorded in 29 states and 7 union territories, across 640 districts in India, between 2001 and 2014. Seven types of crime against women that were recorded uniformly across all years were used to prepare the full dataset. The Indian census data was used to obtain six new variables, namely Population, Male Population, Female Population, Literacy, Male Literacy and Female Literacy, across the 640 districts in the country.

Because district names are represented differently in the two datasets, the district names in the crime data were renamed manually before meaningful analysis or visualization could be carried out. The state and district names were then matched to the respective columns in the crime data to obtain the external factor values for each district. The data was aggregated at the state and district levels in the R Shiny computing environment to suit the needs of the analysis.
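
The matching step described above can be sketched in base R. This is a minimal illustration under stated assumptions, not the project's actual code: the data frames, column names and values are hypothetical, and the `normalise_name` helper is introduced here only for the example.

```r
# Hypothetical toy data; column names and values are made up for illustration.
crime <- data.frame(
  state    = c("Kerala", "Kerala", "Goa"),
  district = c(" ernakulam", "Kollam ", "North  Goa"),
  cases    = c(120, 80, 30),
  stringsAsFactors = FALSE
)
census <- data.frame(
  state      = c("KERALA", "KERALA", "GOA"),
  district   = c("ERNAKULAM", "KOLLAM", "NORTH GOA"),
  population = c(3000000, 2500000, 800000),
  stringsAsFactors = FALSE
)

# Normalise a name: trim the ends, collapse repeated spaces, upper-case.
normalise_name <- function(x) toupper(gsub("\\s+", " ", trimws(x)))

crime$state    <- normalise_name(crime$state)
crime$district <- normalise_name(crime$district)

# Join on both keys; all.x = TRUE keeps unmatched crime rows for review.
merged <- merge(crime, census, by = c("state", "district"), all.x = TRUE)

# Districts that would still need manual renaming (none in this toy data).
unmatched <- merged[is.na(merged$population), c("state", "district")]
```

Keeping the unmatched rows rather than dropping them makes it easy to list the names that automatic normalisation cannot reconcile and that still need renaming by hand.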


Design Framework

In designing the framework and visualization, we followed an iterative process of design, development and visualization, adding granular detail and interactive features so that the application conveys the truth hidden in the data. We have included overview, zoom and filter, and details-on-demand capabilities in the application to add value for the end user. The user can work stage by stage through time series, geospatial and relationship analysis to understand the occurrence of crime against women in India.

1.1 Time Series Analysis

Time series plots provide the user with an overview of the historical pattern of crime occurrences across different states, with a further drill-down to district level for each selected state based on the crime type.

1.2 Geo-Spatial Analysis

This section of the analysis focuses on the distribution of crime cases across the states and districts of India through choropleth and cartogram visualizations. The GADM shapefile for India was used to plot the country's boundaries using two layers: administrative layer 1 for states and layer 2 for districts. State names that did not match were recoded in R to follow the shapefile, so that records could be matched accurately by state name.

Because absolute crime counts do not represent crime occurrence relative to the country as a whole, a new layer parameter named location quotient has been derived. The location quotient (LQ) quantifies how concentrated a particular crime is in a region compared with the nation. It accounts both for the region's crime occurrence relative to the nation and for the region's population relative to the nation, providing a relative statistical measure. The LQ for each state for a specific crime is derived using the below formula:
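
The formula image is not reproduced in this text. A standard form of the location quotient that is consistent with the description above is given here as an assumption, not necessarily the exact expression used in the application:

```latex
LQ_{s,c} = \frac{x_{s,c} / X_{c}}{p_{s} / P}
```

Here $x_{s,c}$ is the number of crimes of type $c$ in state $s$, $X_{c}$ is the national total for crime type $c$, $p_{s}$ is the population of state $s$, and $P$ is the national population. Under this form, a value above 1 means the state accounts for a larger share of the crime than of the population.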

As with the state-level LQ, the parameter is derived at the district level using the below formula, to give a clearer picture of crime occurrences when comparing districts within a state.

The geospatial plots represent both the absolute crime values and the location quotient values on the map of India, as described in detail in the sections below. Using the sf R package, the Indian spatial file was read with the st_read function and imported into R as a simple feature data frame.
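
A location quotient at both levels can be sketched in base R as a share of crime divided by a share of population. The data, column names and the `lq` helper are hypothetical, and taking the state as the reference area for district-level values is an assumption based on the wording above.

```r
# Hypothetical counts; all numbers and names are illustrative.
d <- data.frame(
  state      = c("A", "A", "B", "B"),
  district   = c("A1", "A2", "B1", "B2"),
  crimes     = c(30, 10, 40, 20),
  population = c(100, 100, 300, 500),
  stringsAsFactors = FALSE
)

# Generic LQ: share of crime divided by share of population.
lq <- function(crimes, crimes_ref, pop, pop_ref) {
  (crimes / crimes_ref) / (pop / pop_ref)
}

# State level: each state compared with the national totals.
st <- aggregate(cbind(crimes, population) ~ state, data = d, FUN = sum)
st$lq <- lq(st$crimes, sum(st$crimes), st$population, sum(st$population))

# District level: each district compared with its own state's totals.
d$state_crimes <- ave(d$crimes, d$state, FUN = sum)
d$state_pop    <- ave(d$population, d$state, FUN = sum)
d$lq <- lq(d$crimes, d$state_crimes, d$population, d$state_pop)
```

In this toy data, state A holds 40% of the crimes but only 20% of the population, so its LQ is 2, while state B's is 0.75.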

1.1.1 Geofacet and Facets graphs

The geofacet package was leveraged to showcase the changing trend in the number of crime cases at state level over the 14-year period. It arranges the facet representing each state in its respective geographical position. By selecting a particular crime type and y-axis scaling type, the user can see crime case trends for all states in a single view. Furthermore, the user can select a particular state to view its trend in detail in a separate interactive plot, which displays data labels on hover.

For crime patterns at district level, geofacet did not have any pre-defined district facet grid for Indian states. The alternative was to customize the district facet positioning for each state manually, which was very tedious as there were 640 districts in total to arrange. Therefore, we used the ggplot2 package to place the district facets adjacent to each other to show historical crime trends. The user can select the state and the crime type of interest to view the district facets with their respective trends.

Timeseries.jpg


1.1.2. Slope graph

Slope graphs were used to show change over time between two fixed years, in our case 2004 and 2014. The graph takes the crime type selected by the user and shows whether there is a growing or declining trend between the two years for each state. The ggplot2 package was used as it provides slope graphs that are customizable in terms of aesthetics and easy to use. The data used for this plot had a different range of values for each state, and the labels of some states cluttered towards the bottom as they shared similar values. To avoid this problem, the data was transformed to a log scale before plotting with ggplot2, which normalized the scale and spaced out the y-axis labels of the state names. Along with the slope graph, a summary table gives the user the actual number of crime occurrences in 2004 and 2014 for each reporting state.

Slope.PNG
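
The effect of the log transformation on label spacing can be illustrated with a small base R sketch. The counts and column names are hypothetical, and the use of base 10 is an assumption, since the text does not state which base was used.

```r
# Hypothetical counts for three states; values are illustrative only.
slope <- data.frame(
  state = c("State A", "State B", "State C"),
  y2004 = c(12, 15, 4800),
  y2014 = c(20, 18, 9500),
  stringsAsFactors = FALSE
)

# log10 compresses the large values and spreads the small ones apart.
slope$log2004 <- log10(slope$y2004)
slope$log2014 <- log10(slope$y2014)

# The direction of change is preserved because log10 is monotonic.
slope$trend <- ifelse(slope$y2014 > slope$y2004, "rising", "declining")

# Gap between State A and State B as a fraction of the axis range,
# on the raw scale and on the log scale.
gap_raw <- (slope$y2004[2] - slope$y2004[1]) / diff(range(slope$y2004))
gap_log <- (slope$log2004[2] - slope$log2004[1]) / diff(range(slope$log2004))
```

On the raw scale the two small states sit almost on top of each other, while on the log scale they are visibly separated. A count of zero would need separate handling, for example `log10(x + 1)`, which is a suggestion here rather than something the text describes.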

1.2.1. Choropleth Plot

A choropleth plot is a thematic map in which regions are shaded in proportion to the statistical variable displayed on the map. In our case, we used the absolute crime values and the location quotient as parameters in two choropleth plots placed side by side at the district level. This makes it possible to compare the absolute crime occurrence in a region with its relative LQ measure.

The Indian district-level data is merged with the crime data using the state and district as join keys. Both keys are required because district names are unique only within each state. For the location quotient, five ranges were binned manually to distinguish regions with location quotients below and above 1. The minimum and maximum values were used to create two bins on either side of a location quotient of 1. Using the tmap package, the choropleth for absolute crime values was created with five bins using the quantile style to shade the regions, whereas for LQ the breaks were set manually to create a better visualization.
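
One way to build five manual bins around an LQ of 1 is sketched below with base R's `cut`. The exact breaks used in the application are not given in the text, so the narrow middle band around 1 and its half-width `band` are assumptions made for this illustration, as are the sample values.

```r
# Hypothetical LQ values for a handful of districts.
lq <- c(0.15, 0.62, 0.98, 1.00, 1.35, 2.40, 3.10)

lo <- min(lq)
hi <- max(lq)
band <- 0.1  # assumed half-width of the "close to 1" bin

# Five bins: two below 1, one band around 1, two above 1.
breaks <- c(lo, (lo + 1 - band) / 2, 1 - band,
            1 + band, (1 + band + hi) / 2, hi)

# include.lowest = TRUE keeps the minimum value inside the first bin.
bins <- cut(lq, breaks = breaks, include.lowest = TRUE)
table(bins)
```

Deriving the outer breaks from the minimum and maximum keeps every region inside a bin, while fixing the inner breaks near 1 keeps under-represented and over-represented regions in visually distinct classes.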

District Chloropleth.PNG

Both maps were rendered using the tmap and leaflet packages in R, which add interactivity to the dashboard such as zooming in and out and pop-ups showing the crime occurrence and LQ values when hovering over a district. By selecting a particular state, crime type and year, the user can see the geospatial distribution of crime occurrences for all districts in the state, for both the absolute and LQ measures. A choropleth plot was also created for state-level LQ values on the map of India for the user-selected crime type and year. These maps help to identify hotspots of crime against women for a specific crime type at the state and national levels.


1.2.2. Cartogram

The cartogram package in R was selected because a cartogram is a unique type of map that combines statistical information with geographic location. An area cartogram resizes each place's area according to a measurable variable. Cartogram visualizations are commonly used to portray geographic or social data, such as the human populations of the countries of the world.

Whereas the choropleth plot shows crime occurrences at the district level by shading for absolute and LQ values, the cartogram offers a better visualization by taking into account both geographical location and crime occurrence. The shapefile is read using the readOGR function from the rgdal package to create a spatial data frame object. It is merged with the crime data and converted into a SpatialPolygonsDataFrame, which is reprojected using the spTransform function in R. Based on the crime type and year selected by the user, the cartogram is plotted at the state level to represent absolute crime values on the map of India. Three variations of cartogram, namely continuous, non-contiguous and non-overlapping circles (Dorling), were created using the cartogram package with the tmap package.

Because the other cartogram variations take a long time to process, the Shiny app uses the Dorling variation. The plot represents the crime occurrences in a state by the size of a circle at the corresponding geographical location. The cartogram for absolute crime values and the choropleth for the location quotient present two different kinds of geospatial plot at the state level. The two plots are placed next to each other and synchronized using the sync function in the mapview package, so that the user can hover over a location and compare the absolute crime occurrence with the location quotient.

India Chloropleth.jpg

Of the three variations, the continuous cartogram is formed iteratively over a specified number of iterations, which makes it slower to process and render. Because the Shiny web application must handle many combinations of user inputs such as crime type and year, the Dorling circle cartogram was chosen to avoid a delay each time the visualization is rendered.


1.3 Relationships Analysis

1.3.1.1 Funnel Plots

The funnel plot allows the user to detect variation in crime incidence across states for a selected crime type. It is a statistical method, in the form of a scatter plot, for examining the pattern of a crime type against the total crime level in each state. The Funnel R package was used to produce a funnel plot of each crime type against total crime for each state. It takes the number of crimes of a particular type and the total number of crimes in each state as input, evaluates the z-score, and plots the points with 80% and 95% confidence limits. A new variable, the total number of crimes against women in each state, was created using the dplyr package. As the data for some states was skewed, it was transformed to a log scale before plotting. ggplot2 was then used to improve the aesthetics of the funnel plot, such as labelling the states above the upper bound and repelling the label text so that labels do not overlap.

Funnel.PNG

The user interface allows the user to select the year and crime type of interest to identify extreme outliers, that is, states that lie above the upper 95% confidence limit.

1.3.1.2. Choropleth Plot

A choropleth plot was used in addition to the funnel plot to illustrate its findings in geospatial form. The geospatial representation allows the user to view the outliers above the 95% confidence limit, plotted and shaded by intensity in their respective geographical positions. This choropleth can help the user look for patterns or influences among the outliers identified for each crime type across the nation.
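
The idea behind the funnel limits can be sketched in base R using a normal approximation to the binomial. This is an illustration under stated assumptions and may differ from how the package computes its limits; the counts and column names are hypothetical.

```r
# Hypothetical state-level counts: crimes of one type (d) out of all crimes (n).
fd <- data.frame(
  state = c("A", "B", "C", "D"),
  d = c(50, 400, 90, 30),
  n = c(500, 2000, 1000, 100),
  stringsAsFactors = FALSE
)

p0 <- sum(fd$d) / sum(fd$n)        # overall proportion: the funnel's centre line
se <- sqrt(p0 * (1 - p0) / fd$n)   # standard error shrinks as n grows
fd$p <- fd$d / fd$n                # observed proportion per state
fd$z <- (fd$p - p0) / se           # z-score against the overall proportion

# Two-sided normal limits for 80% and 95%.
z80 <- qnorm(0.90)
z95 <- qnorm(0.975)
fd$upper80 <- p0 + z80 * se
fd$upper95 <- p0 + z95 * se
fd$lower95 <- p0 - z95 * se

# States above the upper 95% limit are flagged as extreme outliers.
outliers <- fd$state[fd$p > fd$upper95]
```

Because the standard error depends on each state's total, the limits narrow as the total grows, which gives the plot its funnel shape and avoids flagging small states on the basis of noisy proportions alone.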


1.3.2. Clustered Heat Maps

A clustered heat map shows variation across multiple variables, revealing patterns, showing whether any variables are similar to each other, and detecting whether correlations exist between them. It was used to identify correlations between crime types such as rape, kidnapping, domestic violence and dowry death, total crime, literacy rate and sex ratio across the states of India. The heatmaply package offers a user-friendly interactive cluster heat map, with tooltips that display values when hovering over cells and the ability to zoom in to specific sections of the figure from the data matrix, the side dendrograms or the annotated labels. The user can calibrate the clustered heat map by selecting the year, number of clusters, type of data transformation and hierarchical clustering algorithm to visualize how external factors contribute to crime occurrences across states. The plot also allows the user to view clusters of states with similar patterns of crime types and external factors for a selected year. A summary table is provided for the user to look up the actual value of each crime type and external factor used in this analysis.

Heatmap.PNG
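
The clustering that underlies such a heat map can be sketched with base R alone. The matrix values are made up, and the choice of z-score scaling, Euclidean distance and average linkage is an assumption standing in for the options the user selects in the application.

```r
# Hypothetical state-by-variable matrix; values are made up.
m <- matrix(
  c(10, 12, 70, 950,
    11, 13, 72, 940,
    40, 45, 55, 880,
    42, 44, 53, 890),
  nrow = 4, byrow = TRUE,
  dimnames = list(c("S1", "S2", "S3", "S4"),
                  c("rape", "kidnapping", "literacy", "sex_ratio"))
)

# Column-wise z-scores put variables with different units on a common scale.
scaled <- scale(m)

# Hierarchical clustering of the states on Euclidean distance.
hc <- hclust(dist(scaled), method = "average")

# Cut the dendrogram into the number of clusters chosen by the user.
clusters <- cutree(hc, k = 2)
```

Scaling before computing distances matters here because a variable such as sex ratio, measured in the hundreds, would otherwise dominate the crime counts.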

Future Scope

The application currently uses the Dorling variation of the cartogram because of performance issues. Future work could include the continuous cartogram, which provides a better representation of absolute crime occurrences across Indian states. At present, external social data such as literacy rate and sex ratio are included to identify correlations with different types of crime against women. Additional socio-economic factors can be incorporated to give a wider view of the influence of external factors on crime against women. More recent statistics on crime against women for 2015 to 2017 from the National Crime Records Bureau can be integrated with the existing data to visualize the recent situation in the country. The application can also be enhanced to plot clusters from the heat maps onto a geospatial map, to derive better insights from a geographical point of view.

References

[1] Arvind Verma (2016). Exploring the trend of violence against women in India. International journal of comparative and applied criminal justice, 41(1), 3-18.
[2] Amarantha Donna Ropmay (2014). Crimes against Women in Matrilineal Meghalaya A Forensic Medical Perspective. Journal of Indian academic forensic medicine 36(4).
[3] Douglas C Dover and Donald P Schopflocher (2011). Using funnel plots in public health surveillance. Population Health Metrics, 9, 58.
[4] Brian Houle, James Holt, et al., (2009). Use of Density-Equalizing Cartograms to Visualize Trends and Disparities in State-Specific Prevalence of Obesity: 1996–2006. American journal of public health 99(2), 308 – 312.
[5] Zheng Tian (2009). Measuring Agglomeration Using the Standardized Location Quotient with a Bootstrap Method. The journal of regional analysis and policy, 43(2), 186-197.
[6] Shilin Zhao, Yan Guo, et al., (2014). Advanced Heat Map and Clustering Analysis Using Heatmap3. BioMed Research International. 42(1), 186-192.
[7] Thomas P. A. Debray, Karel G. M. Moons, et al., (2018). Detecting small‐study effects and funnel plot asymmetry in meta‐analysis of survival data: A comparison of new and existing tests. Research synthesis methods, 9(1), 41-50.
[8] https://www.r-bloggers.com/revisiting-crimes-against-women-in-india/.