Difference between revisions of "IS428-AY2019-20T1 Group09-Proposal"

From Visual Analytics for Business Intelligence
 
(24 intermediate revisions by 3 users not shown)
Line 1: Line 1:
[[File:Logo.png|150px|frameless|center]]
+
[[File:G9TeamLogo.png|150px|frameless|center]]
<center>Team Name</center>
 
 
<!--Header-->
 
<!--Header-->
 
<p></p><br/>
 
<p></p><br/>
Line 6: Line 5:
 
{|style="background-color:#143c67; color:#4d79ff; padding: 10 0 10 0;" width="100%" cellspacing="0" cellpadding="0" valign="top" border="0"  |
 
{|style="background-color:#143c67; color:#4d79ff; padding: 10 0 10 0;" width="100%" cellspacing="0" cellpadding="0" valign="top" border="0"  |
 
| style="padding:0.2em; font-size:100%; background-color:#143c67; text-align:center; color:#F5F5F5" width="10%" |  
 
| style="padding:0.2em; font-size:100%; background-color:#143c67; text-align:center; color:#F5F5F5" width="10%" |  
[[Team |<font color="#F5F5F5" size=3 face="Helvetica">Team</font>]]
+
[[IS428-AY2019-20T1_Group09-Team |<font color="#F5F5F5" size=3 face="Helvetica">Team</font>]]
  
 
| style="background:none;" width="1%" | &nbsp;
 
| style="background:none;" width="1%" | &nbsp;
 
| style="padding:0.2em; font-size:100%; background-color:#05050f;  border-bottom:0px solid #3D9DD7; text-align:center; color:#F5F5F5" width="10%" |  
 
| style="padding:0.2em; font-size:100%; background-color:#05050f;  border-bottom:0px solid #3D9DD7; text-align:center; color:#F5F5F5" width="10%" |  
[[Proposal|<font color="#F5F5F5" size=3 face="Helvetica">Proposal</font>]]
+
[[IS428-AY2019-20T1_Group09-Proposal|<font color="#F5F5F5" size=3 face="Helvetica">Proposal</font>]]
  
 
| style="background:none;" width="1%" | &nbsp;
 
| style="background:none;" width="1%" | &nbsp;
 
| style="padding:0.2em; font-size:100%; background-color:#143c67;  border-bottom:0px solid #3D9DD7; text-align:center; color:#F5F5F5" width="10%" |  
 
| style="padding:0.2em; font-size:100%; background-color:#143c67;  border-bottom:0px solid #3D9DD7; text-align:center; color:#F5F5F5" width="10%" |  
[[Poster|<font color="#F5F5F5" size=3 face="Helvetica">Poster</font>]]
+
[[IS428-AY2019-20T1_Group09-Poster|<font color="#F5F5F5" size=3 face="Helvetica">Poster</font>]]
  
 
| style="background:none;" width="1%" | &nbsp;
 
| style="background:none;" width="1%" | &nbsp;
 
| style="padding:0.2em; font-size:100%; background-color:#143c67;  border-bottom:0px solid #3D9DD7; text-align:center; color:#F5F5F5" width="10%" |  
 
| style="padding:0.2em; font-size:100%; background-color:#143c67;  border-bottom:0px solid #3D9DD7; text-align:center; color:#F5F5F5" width="10%" |  
[[Application|<font color="#F5F5F5" size=3 face="Helvetica">Application</font>]]
+
[[IS428-AY2019-20T1_Group09-Application|<font color="#F5F5F5" size=3 face="Helvetica">Application</font>]]
  
 
| style="background:none;" width="1%" | &nbsp;
 
| style="background:none;" width="1%" | &nbsp;
 
| style="padding:0.2em; font-size:100%; background-color:#143c67;  border-bottom:0px solid #3D9DD7; text-align:center; color:#F5F5F5" width="10%" |  
 
| style="padding:0.2em; font-size:100%; background-color:#143c67;  border-bottom:0px solid #3D9DD7; text-align:center; color:#F5F5F5" width="10%" |  
[[Research Paper|<font color="#F5F5F5" size=3 face="Helvetica">Research Paper</font>]]
+
[[IS428-AY2019-20T1_Group09-Research Paper|<font color="#F5F5F5" size=3 face="Helvetica">Research Paper</font>]]
 
|}  
 
|}  
 
</div>
 
</div>
 +
 
<!--/Header-->
 
<!--/Header-->
 +
<hr>
 +
{|style="background-color:#143c67; color:#4d79ff; padding: 10 0 10 0;" width="100%" cellspacing="0" cellpadding="0" valign="top" border="0"  |
 +
| style="padding:0.2em; font-size:100%; background-color:#143c67; text-align:center; color:#F5F5F5" width="10%" |
 +
[[IS428-AY2019-20T1_Group09-Proposal-v1|<font color="#F5F5F5" size=3 face="Helvetica">Version 1</font>]]
 +
 +
| style="background:none;" width="1%" | &nbsp;
 +
| style="padding:0.2em; font-size:100%; background-color:#05050f;  border-bottom:0px solid #3D9DD7; text-align:center; color:#F5F5F5" width="10%" |
 +
[[IS428-AY2019-20T1_Group09-Proposal|<font color="#F5F5F5" size=3 face="Helvetica">Version 2</font>]]
 +
|}
 
<br />
 
<br />
<big> [[Project Groups|<--- Go Back to Project Groups]] </big>
+
 
 
<br /><br />
 
<br /><br />
 
==<div style="background:#143c67; padding: 15px; font-weight: bold; line-height: 0.3em; letter-spacing:0.5em;font-size:20px"><font color=#fbfcfd face="Century Gothic"><center>PROBLEM STATEMENT</center></font></div>==
 
==<div style="background:#143c67; padding: 15px; font-weight: bold; line-height: 0.3em; letter-spacing:0.5em;font-size:20px"><font color=#fbfcfd face="Century Gothic"><center>PROBLEM STATEMENT</center></font></div>==
Line 61: Line 70:
 
| <center>District-wise Crimes Committed Against Women, 2015  <br> ([https://data.gov.in/resources/district-area-wise-crimes-committed-against-women-during-2015 Click to View Data])<br/><br/><br/> District-wise Crimes Committed Against Women, 2014 <br> ([https://data.gov.in/resources/district-area-wise-crimes-committed-against-women-during-2014 Click to View Data])</center>
 
| <center>District-wise Crimes Committed Against Women, 2015  <br> ([https://data.gov.in/resources/district-area-wise-crimes-committed-against-women-during-2015 Click to View Data])<br/><br/><br/> District-wise Crimes Committed Against Women, 2014 <br> ([https://data.gov.in/resources/district-area-wise-crimes-committed-against-women-during-2014 Click to View Data])</center>
 
||  
 
||  
 
 
 
* State/UT
 
* State/UT
  
Line 98: Line 105:
  
 
* Total Crimes against Women
 
* Total Crimes against Women
 
 
||  
 
||  
 
<center>The dataset would provide the crime rate for each type of crime against women, at a district-level. We can then aggregate the data to find trends.
 
<center>The dataset would provide the crime rate for each type of crime against women, at a district-level. We can then aggregate the data to find trends.
Line 105: Line 111:
 
| <center>dstrCAW_2013 <br> ([https://data.gov.in/resources/district-area-wise-crimes-committed-against-women-during-2013 Click to View Data])<br/><br/><br/> dstrCAW_1 (2001-2012) <br> ([https://data.gov.in/resources/district-area-wise-crimes-committed-against-women-during-2001-2012 Click to View Data])</center>
 
| <center>dstrCAW_2013 <br> ([https://data.gov.in/resources/district-area-wise-crimes-committed-against-women-during-2013 Click to View Data])<br/><br/><br/> dstrCAW_1 (2001-2012) <br> ([https://data.gov.in/resources/district-area-wise-crimes-committed-against-women-during-2001-2012 Click to View Data])</center>
 
||
 
||
 
 
* STATE/UT
 
* STATE/UT
  
Line 125: Line 130:
  
 
* Importation of Girls
 
* Importation of Girls
 +
||
 +
<center>The dataset would provide the crime rate for each type of crime against women, at a district-level. We can then aggregate the data to find trends.</center>
 +
 +
|-
 +
| <center>2011 India Census Data
 +
([https://www.kaggle.com/webaccess/all-census-data/version/5#all.csv Click to View Data])</center>
 +
||
  
 +
* State
 +
 +
* District
 +
 +
* Literacy Rate
 +
 +
* Avg Household Size
 +
 +
* Number of Non-Workers
 +
 +
* Population
 +
 +
* Females per Male
 +
 +
* Persons Aged 15-59
 +
 +
* Number of Higher Secondary Graduates
  
 
||
 
||
<center>This data set will be used to understand the general demographic of international visitors coming to Korea from 2007 - 2018. We will be able to gain descriptive insights on the visitor demographics by Age Range.</center>
+
<center>This data set provides socioeconomic data per district, obtained from the 2011 Census Data.</center>
 +
 
 +
 
 
|-
 
|-
| <center>Entry by nationality by age
+
| <center>2011 India Census Data
(2007 - 2018)<br/><br/><br/>
+
([https://www.kaggle.com/danofer/india-census#india-districts-census-2011.csv Click to View Data])</center>
([http://know.tour.go.kr/stat/tourStatSearchDis.do;jsessionid=18780F0E8CFAFBBCEB58B0A96098EDD3 Click to View Data])</center>
 
||
 
||
  
* City
+
* State Name
  
* Age Range
+
* District Name
  
* Date
+
* Population
  
 
||
 
||
<center>This data set will be used to understand the general demographic of international visitors coming to Korea from 2007 - 2018. We will be able to gain descriptive insights on the visitor demographics by Age Range.</center>
+
<center>This data set provides population data per district, obtained from the 2011 Census Data.</center>
 
|}
 
|}
 +
 
<br/>
 
<br/>
  
 
==<div style="background:#143c67; padding:15px; font-weight: bold; line-height: 0.3em;letter-spacing:0.5em;font-size:20px"><font color=#fbfcfd face="Century Gothic"><center>LITERATURE REVIEW</center></font></div>==
 
==<div style="background:#143c67; padding:15px; font-weight: bold; line-height: 0.3em;letter-spacing:0.5em;font-size:20px"><font color=#fbfcfd face="Century Gothic"><center>LITERATURE REVIEW</center></font></div>==
 
<br/>
 
<br/>
<!-- <center>
+
<center>
 
{| class="wikitable" style="background-color:#FFFFFF;" width="90%"
 
{| class="wikitable" style="background-color:#FFFFFF;" width="90%"
 
|-
 
|-
Line 155: Line 188:
 
|-
 
|-
 
| <center>
 
| <center>
'''Title''': Monthly Number of Individual Travelling Visitors (2016)
+
'''Title''': Crime Map of India
[[File:Example1.png|300px|frameless|center]]
+
[[File:Example1.png|400px|frameless|center]]
'''Source''':https://www.data.go.kr/visual/content/577
+
'''Source''':https://tvganesh.shinyapps.io/crimesAgainstWomenInIndia/
 
</center>
 
</center>
  
 
||  
 
||  
  
* We can understand the monthly visitor arrivals' patterns throughout the year.  
+
* The use of a choropleth map allows us to compare the magnitude of crimes against women in different states.
* However, we feel that a time series line graph would be more appropriate, and there should be a wider year range to better understand the exact trend in visitor arrivals.
+
* However, it does not account for differences in population size between states, and relies only on the absolute number of crimes in each state.
  
 
|-
 
|-
 
| <center>
 
| <center>
'''Title''': Foreign tourists visiting Korea by 2015
+
'''Title''': RAPE IN INDIA: A visual exploration of systemic rape culture
[[File:Nogada.png|250px|frameless|center]]
+
[[File:Example2.png|400px|frameless|center]]
'''Source''':http://m.datanews.co.kr/m/m_article.html?no=2995
+
'''Source''':https://adityajain15.github.io/Rape_In_India/
 
</center>
 
</center>
 
||  
 
||  
  
* This bar chart displays the visits to various tourist attraction by year.  
+
* This treemap displays the relationship of the rape offenders to their victims.  
* Data labels and legends can help us see the precise figures.
+
* It also shows the treemap for each state, allowing for comparison across states.
 +
* The colour scheme of the treemap could be adjusted for clarity, as the current colours are hard to compare.
  
 
|-
 
|-
  
 
| <center>
 
| <center>
'''Title''': Most visited tourism attractions in South Korea 2015
+
'''Title''': RAPE IN INDIA: A visual exploration of systemic rape culture
[[File:BackgroundSurvey3Nogada.jpg|250px|frameless|center]]
+
[[File:Example3.png|400px|frameless|center]]
'''Source''':https://know.tour.go.kr/ptourknow/knowplus/kChannel/kChannelPeriod/kChannelPeriodDetail.do?seq=102612
+
'''Source''':https://adityajain15.github.io/Rape_In_India/
 
</center>
 
</center>
 
||  
 
||  
  
* This helps us to learn the most visited attractions by region as well as the increase / decrease as compared to the previous year.
+
* This visualisation shows the efficacy of the justice system in India in handling rape cases.  
* We can enhance on this idea to make an interactive map so that the user can analyze with filters.
+
* Each dot represents a single rape case in India, and the dots will travel to show the final outcome of the case - whether it ends in conviction or acquittal, or is dropped in the middle of the process.  
  
 
|-
 
|-
 
|}
 
|}
</center> -->
+
</center>
 
<br/>
 
<br/>
  
Line 196: Line 230:
 
<br/>
 
<br/>
  
<!-- Below are a few visualizations and charts we considered making for our projects.  
+
Below are a few visualizations and charts we considered making for our project. 
  
 
<center>
 
<center>
Line 205: Line 239:
 
|-
 
|-
 
| <center>
 
| <center>
'''Title''': Sunburst Diagram
+
'''Title''': Chromosome-based Circos Plot
[[File:Sunburst.png|250px|frameless|center]]
+
[[File:Circos Plot.png|250px|frameless|center]]
'''Source''':https://bl.ocks.org/mbostock/4348373
+
'''Source''':https://jokergoo.github.io/circlize_book/book/
 
</center>
 
</center>
  
Line 213: Line 247:
  
 
*'''Pros:'''
 
*'''Pros:'''
** Aims to show various sub-components of a particular category
+
** Useful in showing data with multiple tracks in a single plot
** Can drill down to multiple divisions to observe the distribution by percentages
+
** Demonstrates relationships between variables
** May be useful to analyze tourism receipts by components and country
+
** Could be used to show how different socioeconomic factors affect different crime categories over time
  
 
*'''Cons:'''  
 
*'''Cons:'''  
** Difficult to break down the huge number of markets
+
** Difficult to formulate a Circos Plot
** Does not provide a comprehensive time-series comparison
+
** Overloading plot with information may lead to difficulty in interpreting it
 +
 
 +
|-
 +
| <center>
 +
'''Title''': Sunburst Diagram
 +
[[File:Sunburst.png|250px|frameless|center]]
 +
'''Source''':https://www.data-to-viz.com/graph/sunburst.html
 +
</center>
 +
||
 +
 
 +
*'''Pros''':
 +
** Shows hierarchy of multivariate data
 +
** Visually appealing and easy to distinguish between parent nodes and leaf nodes
 +
** Could be used to show the prominence of different crime types by district level
 +
 +
 
 +
*'''Cons''':
 +
** Difficult to label sunburst diagrams, which makes interactivity important
 +
** Tree maps will be more effective in displaying information at first glance
  
 
|-
 
|-
 
| <center>
 
| <center>
'''Title''': Treemap
+
'''Title''': Funnel Plot
[[File:Treemap.png|250px|frameless|center]]
+
[[File:Funnelplot.png|250px|frameless|center]]
'''Source''':https://www.theinformationlab.co.uk/2015/02/10/show-treemaps/
+
'''Source''':https://community.jmp.com/t5/JMP-Blog/Graph-Makeover-Where-same-sex-couples-live-in-the-US/ba-p/30616
 
</center>
 
</center>
 
||  
 
||  
  
 
*'''Pros''':
 
*'''Pros''':
** Effective visualisation to organise multivariate data by hierarchy
+
** Able to simultaneously display sample statistics and the corresponding sample size for multiple cases
** We can effectively see the purpose of visit for the top 10 visiting countries to Korea.
+
** Shows us what lies outside of the upper and lower limits
 +
** Useful in helping us determine outliers and abnormalities
 
   
 
   
  
 
*'''Cons''':  
 
*'''Cons''':  
** It would be hard to compare between years and months for different countries.
+
** Depending on the variables used, the funnel plot may be misleading if the magnitude of the effect differs across the populations considered
** The hierarchy will only be 2 levels so the interaction would not be as much.
 
  
 
|-
 
|-
 
| <center>
 
| <center>
'''Title''': Chord Diagram
+
'''Title''': Geofacet Plot 
[[File:Chord.png|250px|frameless|center]]
+
[[File:Geofacet.png|250px|frameless|center]]
'''Source''':https://beta.observablehq.com/@mbostock/d3-chord-diagram
+
'''Source''':https://hafen.github.io/geofacet/
 
</center>
 
</center>
 
||  
 
||  
  
 
*'''Pros''':
 
*'''Pros''':
** Effective visualisation to see the influx of Visitors from and to Korea.
+
** Generates a plot for each of the different geographical regions
** We will be able to easily spot the country with the most travelers to Korea.
+
** Could be used to show us the different rates of crimes across the districts and states of India
 +
** Easy to interpret as plots are organised according to India's geography
 
   
 
   
  
 
*'''Cons''':  
 
*'''Cons''':  
** This chart will make it harder to spot trends in the visiting pattern.
+
** Complex plots cannot be used in geofacet plots
** We will not be able to see every single country as the size of the chord diagram is limited.
+
** Plots limited to bar and line charts
  
 
|-
 
|-
 
|}
 
|}
</center> -->
+
</center>
  
 
<br/>
 
<br/>
  
==<div style="background:#143c67; padding:15px; font-weight: bold; line-height: 0.3em;letter-spacing:0.5em;font-size:20px"><font color=#fbfcfd face="Century Gothic"><center>BRAINSTORMING SESSIONS</center></font></div>==
+
==<div style="background:#143c67; padding:15px; font-weight: bold; line-height: 0.3em;letter-spacing:0.5em;font-size:20px"><font color=#fbfcfd face="Century Gothic"><center>PROPOSED STORYBOARDS</center></font></div>==
<br/>
+
<center>
<!--
+
=== Storyboard 1 - Overview - Introduction to Crimes Against Women in India ===
[[File:BrainstormNogada.png|500px|frameless|center]]
 
  
<br/>
+
'''Visualisation 1: Bar Graph with Treemap'''
To come up with the story board, our group met up several times to try and come up with a visual. The two charts above are a few dashboards we considered making for our project. First Chart being a Chord diagram of visitors coming in and out of Korea. With an attractive look and great interaction, at first we felt that it was a good dashboard. But with some discussion and consultations, we realised that it would not help us achieve our goal of seeing the seasonal trend of visitors through the years. The second chart is a map with a hover function to see the popularity. We still stick to a similar design but instead of a tooltip like feel, we decided to give a whole section to the popularity part.  
+
</center>
 +
[[File:Storyboard1-1.png|500px|frameless|center]]
 +
* Aims to show the year-on-year increase in the number of crimes against women
 +
* Upon hovering over a particular year, the treemap showing the breakdown of crime types will be shown for that year.
  
After the sessions, we came up with our final designs which are listed below.
 
  
<br/>
+
<center>'''Visualisation 2: Line Graph''' </center>
 +
[[File:Storyboard1-2.png|500px|frameless|center]]
  
==<div style="background:#143c67; padding:15px; font-weight: bold; line-height: 0.3em;letter-spacing:0.5em;font-size:20px"><font color=#fbfcfd face="Century Gothic"><center>PROPOSED STORYBOARD</center></font></div>==
+
* Shows the trend for the individual crime type across the years
<br/>
 
 
 
Below is the proposed story board for our project:
 
  
 
<center>
 
<center>
{| class="wikitable" style="background-color:#FFFFFF;" width="90%"
+
=== Storyboard 2 - State-Level Comparison of Crime Rates ===
|-
 
! style="font-weight: bold;background: #141414;color:#fbfcfd;width: 45%;" | Storyboard
 
! style="font-weight: bold;background: #141414;color:#fbfcfd;width: 55%" | Insights / Comments
 
|-
 
| <center>
 
'''Title''': Storyboard 1 - Seasonal Trend
 
[[File:Storyboard1Nogada.jpg|300px|frameless|center]]
 
 
</center>
 
</center>
  
||
 
* Aims to show the seasonal trend of yearly visitor arrivals by country and month.
 
  
* There is a slider to select the year range and two dropdowns to analyse specific months and countries.
+
<center>'''Visualisation 1: Choropleth Map'''</center>
 +
[[File:Storyboard2-1.png|600px|frameless|center]]
 +
* Shows comparison of crime rates across different states
 +
* Usage of a colour scale for easy identification of states with higher crime rates
 +
* Allows user to filter by year in order to visualise changes in crime rates over the years.
 +
* Clicking on the button at the bottom would direct the user to Visualisation 3 to view the crime breakdown per state.
 +
 
 +
 
 +
<center>'''Visualisation 2: Funnel Plot'''</center>
 +
[[File:Storyboard2-2.png|400px|frameless|center]]
 +
* Would be displayed side by side with Visualisation 1
 +
* If the user clicks on a particular state on the choropleth map, its relative position on the funnel plot will be highlighted
 +
* Usage of the funnel plot allows us to account for differences in population size within each state, as we would be able to plot Population size vs. Number of Crimes against Women for each State.
 +
* Allows us to identify states with abnormally low/high crime rates for further analysis to be done.
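
The idea behind the funnel plot described above can be sketched in a few lines of Python (used here purely for illustration). The sketch assumes that crime counts are roughly Poisson-distributed, so the standard error of a state's rate is the square root of the overall rate divided by that state's population. The function names, the 1.96 multiplier (roughly a 95% band) and the figures are our own assumptions, not part of the project's implementation.

```python
import math


def funnel_limits(total_crimes, total_population, population, z=1.96):
    """Return (lower, upper) control limits for the crime rate of a region
    with the given population, assuming Poisson-distributed counts."""
    overall_rate = total_crimes / total_population
    standard_error = math.sqrt(overall_rate / population)
    lower = max(0.0, overall_rate - z * standard_error)
    upper = overall_rate + z * standard_error
    return lower, upper


def flag_outliers(states, z=1.96):
    """states maps name -> (crimes, population).
    Returns name -> 'high', 'low' or 'within' relative to the funnel."""
    total_crimes = sum(crimes for crimes, _ in states.values())
    total_population = sum(pop for _, pop in states.values())
    flags = {}
    for name, (crimes, population) in states.items():
        lower, upper = funnel_limits(total_crimes, total_population, population, z)
        rate = crimes / population
        if rate > upper:
            flags[name] = "high"
        elif rate < lower:
            flags[name] = "low"
        else:
            flags[name] = "within"
    return flags


# Made-up figures for illustration only (not taken from the datasets).
example = {"A": (500, 1_000_000), "B": (100, 1_000_000), "C": (3000, 10_000_000)}
flags = flag_outliers(example)
```

Because the limits narrow as population grows, a large state needs a smaller deviation from the overall rate to be flagged than a small one, which is what makes the comparison fair across states of different sizes.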
  
|-
 
| <center>
 
'''Title''': Storyboard 2 - Tourist Attractions
 
[[File:Storyboard2Nogada.jpg|400px|frameless|center]]
 
</center>
 
||
 
  
* Aims to show the yearly / monthly trend in popular tourism destinations, filtered by year and month.
+
<center>'''Visualisation 3: Geo Facet'''</center>
 +
[[File:Storyboard2-31.png|400px|frameless|center]]
 +
* The bar graph for each state shows the occurrence of each crime type in that state
 +
* Allows easy comparison across states for which crime type is most common
 +
* Each cell in the grid shows the distribution of crime types rather than a single value, unlike the choropleth map
  
* After selecting a region, the map will zoom in and display various tourism attractions within that region.
+
<center>
+
===Storyboard 3 - Analysis of Socioeconomic Factors in contributing to Crime Rate Against Women===
* When the user hovers over a specific destination, there will be a time series graph of the local / foreigner arrivals as well as a picture of the destination.
 
  
|-
+
'''Visualisation 1: Correlation Matrix'''
| <center>
 
'''Title''': Storyboard 3 - Demographic information
 
[[File:Photo 2018-11-25 15-55-54 (2).jpg|400px|frameless|center]]
 
 
</center>
 
</center>
||  
+
[[File:Storyboard 3-1.png|600px|frameless|center]]
 +
* Shows the correlation between each socioeconomic factor and each type of crime against women.
 +
* Color scale highlights if correlation is positive or negative
 +
* Individual values show how strong the correlation is
 +
* Clicking on an individual value will highlight the corresponding point on the Correlation heatmap as shown in Visualisation 3-2
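
As a rough illustration of what each cell of such a correlation matrix holds, the Python sketch below computes Pearson's correlation coefficient between every socioeconomic factor and every crime type. Pearson's r is our assumption here, since the proposal does not name a specific coefficient, and the factor names and values are made up for the example.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / math.sqrt(var_x * var_y)


def correlation_matrix(factors, crimes):
    """factors and crimes each map a name to per-district values,
    listed in the same district order. Returns factor -> crime -> r."""
    return {
        factor: {crime: pearson(fv, cv) for crime, cv in crimes.items()}
        for factor, fv in factors.items()
    }


# Hypothetical per-district values, for illustration only.
factors = {"literacy_rate": [60, 70, 80, 90]}
crimes = {
    "dowry_deaths_per_100k": [4, 3, 2, 1],
    "rape_per_100k": [1, 2, 3, 4],
}
matrix = correlation_matrix(factors, crimes)
```

The sign of each value would drive the colour scale (positive or negative) and its magnitude would be the number printed in the cell. A strong correlation on its own does not show that a factor causes a crime rate.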
 +
 
 +
 
 +
<center>'''Visualisation 2: Correlation Heat Map'''</center>
 +
[[File:Storyboard 3-2.png|600px|frameless|center]]
 +
* Shows the correlation between each socioeconomic factor in a more visually appealing manner
 +
* Color scale highlights how strongly the variables are correlated with one another
 +
* Clicking on an individual box will highlight the corresponding point on the Correlation matrix as shown in Visualisation 3-1
  
* Aims to provide demographic information and the purpose of arrival through trend line charts and other barcharts.
 
  
* Users can select the year/month and country from the dropdown list.
+
<center>'''Visualisation 3: Parallel Coordinate Plot'''</center>
 +
[[File:Storyboard 3-3.png|600px|frameless|center]]
 +
* Compares the socio-economic factors and helps visualise the relationships between them
 +
* Shows how respective socio-economic factors affect each other with respect to the crime rates against women
 +
* Boxplot helps determine if the different factors are in the lower or upper range
  
|-
+
=== Storyboard 4 - Analysis of Police Disposal of Crimes Against Women ===
|}
+
<center>'''Visualisation 1: Sankey Plot'''</center>
</center>
+
[[File:Storyboard4.1.png|600px|frameless|center]]
-->
+
* Shows flow of cases involving violence against women from reporting to investigation outcome
<br/>
+
* Shows the efficiency and effectiveness of the justice system in dealing with such violent crimes against women
  
 
==<div style="background:#143c67; padding:15px; font-weight: bold; line-height: 0.3em;letter-spacing:0.5em;font-size:20px"><font color=#fbfcfd face="Century Gothic"><center>TECHNOLOGIES</center></font></div>==
 
==<div style="background:#143c67; padding:15px; font-weight: bold; line-height: 0.3em;letter-spacing:0.5em;font-size:20px"><font color=#fbfcfd face="Century Gothic"><center>TECHNOLOGIES</center></font></div>==
 
<br/>
 
<br/>
<!--
+
The tools we will be using for this project are as follows: <br>
The technologies we will be using for this Project is as below:
+
[[File:Approach.png|900px|frameless|center]]
[[File:ArchitectureDiagram.png|650px|frameless|center]]
 
 
<br/>
 
<br/>
 
==<div style="background:#143c67; padding:15px; font-weight: bold; line-height: 0.3em;letter-spacing:0.5em;font-size:20px"><font color=#fbfcfd face="Century Gothic"><center>CHALLENGES</center></font></div>==
 
==<div style="background:#143c67; padding:15px; font-weight: bold; line-height: 0.3em;letter-spacing:0.5em;font-size:20px"><font color=#fbfcfd face="Century Gothic"><center>CHALLENGES</center></font></div>==
Line 340: Line 403:
 
|-
 
|-
 
|
 
|
* Unfamiliarity of Visualization Technologies such as Tableau, R,Rshiny etc.
+
Lack of proficiency in using R and R Shiny
 
||  
 
||  
  
* Attending Workshop on R and Rshiny
+
* Complete DataCamp courses on the relevant technologies
* Hands-on Practice with different technologies.
+
* Watch tutorial videos
* Peer Learning.
+
* Read the documentation
  
 
|-
 
|-
 
|
 
|
* Data Cleaning & Transformation from messy data.
+
District Crime Rates are in separate files, with different data attributes.
 
||  
 
||  
 +
* Clean the data to ensure that the columns are similar
 +
* Consolidate the data into one file for the years 2001-2015.
 +
|-
 +
|
 +
Difficulty in understanding some of the data attributes due to their local context, such as the different acts for the protection of women found in some of our datasets.
  
* Organize Meeting Sessions to meet and do data cleaning and transformation together.
+
||
* Split the work between team members.
+
* Conduct more research on India and its history of crimes against women to get a better understanding of the data.
* Python Scripting to find Top 10, Sort data and translate.
+
|- 
 +
|
 +
Difficulty in finding socioeconomic factors at the state level
  
|-
+
||
|  
+
* Look to different data sources to find more socioeconomic factors for consideration.
* Integrating Relevant Data from Multiple Sources Proposed Solution.
+
|-
||  
+
|
 +
The socioeconomic factors identified may not be indicative of the crime rate against women in India, as there may not be a relationship between the two.
  
* Working together to decide on what data to extract or eliminate.
+
||
* Trial and test for one set of data together before splitting the work.
+
* Do EDA to discover any correlation between each socioeconomic factor and the crime rate, then select the relevant ones from there
 
 
|-
 
 
|}
 
|}
 
</center>
 
</center>
-->
 
 
<br/>
 
<br/>
  
 
==<div style="background:#143c67; padding:15px; font-weight: bold; line-height: 0.3em;letter-spacing:0.5em;font-size:20px"><font color=#fbfcfd face="Century Gothic"><center>TIMELINE</center></font></div>==
 
==<div style="background:#143c67; padding:15px; font-weight: bold; line-height: 0.3em;letter-spacing:0.5em;font-size:20px"><font color=#fbfcfd face="Century Gothic"><center>TIMELINE</center></font></div>==
 
<br/>
 
<br/>
<!--
+
[[File:Timeline.png|700px|frameless|center]]
[[File:NogadaGantt.png|1000px|frameless|center]]
 
-->
 
 
<br/>
 
<br/>
 +
 
==<div style="background:#143c67; padding:15px; font-weight: bold; line-height: 0.3em;letter-spacing:0.5em;font-size:20px"><font color=#fbfcfd face="Century Gothic"><center>COMMENTS</center></font></div>==
 
==<div style="background:#143c67; padding:15px; font-weight: bold; line-height: 0.3em;letter-spacing:0.5em;font-size:20px"><font color=#fbfcfd face="Century Gothic"><center>COMMENTS</center></font></div>==
 
<br/>
 
<br/>
<!--
+
Feel free to leave us any comments!
Feel free to leave us some comments so that we can improve!
 
  
 
<center>
 
<center>
Line 405: Line 471:
 
|}
 
|}
 
</center>
 
</center>
-->
 

Latest revision as of 16:03, 24 November 2019

G9TeamLogo.png






PROBLEM STATEMENT


According to the Thomson Reuters Foundation Annual Poll, India is ranked as the world’s most dangerous country for women. This is not surprising, as India has had a long-standing history of violence against women, which is deeply rooted in certain cultural practices such as female infanticide and acid attacks.

Despite growing public outcry over such discrimination and the enactment of laws protecting women, the number of crimes committed against women has increased steadily over the years.

MOTIVATION


India is one of the world’s fastest growing economies. It is currently the seventh richest country in the world, and is projected to be the third largest economy in the world. Despite its rapid growth and development, women in India still suffer from long-standing gender inequality and are victims of brutal and inhumane crimes. Hence, there is a need to analyse various socioeconomic factors to gain insight into the root causes of crimes against women.

OBJECTIVES


The objectives of this project are as follows:

  1. Provide an overview of the issue of crimes against women in India
  2. Draw comparisons to study the differences in crime rates between different states
  3. Study the effect of various socioeconomic factors on the number of crimes committed against women

We hope to achieve these objectives by developing interactive visualisations which can help us to understand the increasing trend of crimes against women, and what factors may contribute to such crime rates.

DATA SOURCES


We have obtained the following datasets for this research:

Dataset/Source | Data Attributes | Purpose
District-wise Crimes Committed Against Women, 2015
(Click to View Data)


District-wise Crimes Committed Against Women, 2014
(Click to View Data)
  • State/UT
  • Sl No.
  • District
  • Year
  • Rape
  • Attempt to commit Rape
  • Kidnapping & Abduction_Total
  • Dowry Deaths
  • Assault on Women with intent to outrage her Modesty_Total
  • Insult to the Modesty of Women_Total
  • Cruelty by Husband or his Relatives
  • Importation of Girls from Foreign Country
  • Abetment of Suicides of Women
  • Dowry Prohibition Act, 1961
  • Indecent Representation of Women (P) Act, 1986
  • Protection of Children from Sexual Offences Act
  • Protection of Women from Domestic Violence Act, 2005
  • Immoral Traffic Prevention Act
  • Total Crimes against Women
The dataset would provide the crime rate for each type of crime against women, at a district-level. We can then aggregate the data to find trends.
dstrCAW_2013
(Click to View Data)


dstrCAW_1 (2001-2012)
(Click to View Data)
  • STATE/UT
  • DISTRICT
  • Year
  • Rape
  • Kidnapping and Abduction
  • Dowry Deaths
  • Assault on women with intent to outrage her modesty
  • Insult to modesty of Women
  • Cruelty by Husband or his Relatives
  • Importation of Girls
The dataset would provide the crime rate for each type of crime against women, at a district-level. We can then aggregate the data to find trends.
2011 India Census Data

(Click to View Data: https://www.kaggle.com/webaccess/all-census-data/version/5#all.csv)
  • State
  • District
  • Literacy Rate
  • Avg Household Size
  • Number of Non-Workers
  • Population
  • Females per Male
  • Persons Aged 15-59
  • Number of Higher Secondary Graduates
This data set provides socioeconomic data per district, obtained from the 2011 Census Data.


2011 India Census Data

(Click to View Data: https://www.kaggle.com/danofer/india-census#india-districts-census-2011.csv)
  • State Name
  • District Name
  • Population
This data set provides population data per district, obtained from the 2011 Census Data.
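
The two groups of crime files above list different column names for the same measures (for example "Kidnapping & Abduction_Total" versus "Kidnapping and Abduction", and "State/UT" versus "STATE/UT"). A minimal Python sketch of how they might be brought to one schema before consolidation is given here, with Python used purely for illustration. The alias table, the shared column names and the sample rows are our own assumptions, not the actual cleaning script, and only a few columns are covered.

```python
# Hypothetical mapping from source column names (lower-cased) to one shared schema.
COLUMN_ALIASES = {
    "state/ut": "state",
    "district": "district",
    "year": "year",
    "rape": "rape",
    "kidnapping & abduction_total": "kidnapping_and_abduction",
    "kidnapping and abduction": "kidnapping_and_abduction",
    "dowry deaths": "dowry_deaths",
}


def harmonise(row):
    """Rename known columns to the shared schema and drop unknown ones."""
    out = {}
    for name, value in row.items():
        key = COLUMN_ALIASES.get(name.strip().lower())
        if key is None:
            continue
        value = value.strip()
        # Place names appear in upper case in some files, so normalise them.
        if key in ("state", "district"):
            value = value.title()
        out[key] = value
    return out


def consolidate(*tables):
    """Concatenate several lists of row dicts into one harmonised list."""
    return [harmonise(row) for table in tables for row in table]


# Made-up rows standing in for the 2001-2012 and 2015 files.
rows_2012 = [{"STATE/UT": "KERALA", "DISTRICT": "KOLLAM", "Year": "2012",
              "Rape": "10", "Kidnapping and Abduction": "4"}]
rows_2015 = [{"State/UT": "Kerala", "District": "Kollam", "Year": "2015",
              "Rape": "12", "Kidnapping & Abduction_Total": "5"}]

combined = consolidate(rows_2012, rows_2015)
```

Once the rows share one schema, they can be joined to the census population figures by state and district to express counts per head of population rather than as absolute numbers.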


LITERATURE REVIEW


Reference of Other Interactive Visualization | Learning Point

Title: Crime Map of India

Example1.png

Source:https://tvganesh.shinyapps.io/crimesAgainstWomenInIndia/

  • The use of a choropleth map allows us to compare the magnitude of crimes against women in different states.
  • However, it does not account for differences in population size between states, and relies only on the absolute number of crimes in each state.

Title: RAPE IN INDIA: A visual exploration of systemic rape culture

Example2.png

Source: https://adityajain15.github.io/Rape_In_India/

  • This treemap displays the relationship of the rape offenders to their victims.
  • It also shows the treemap for each state, allowing for comparison across states.
  • The colour scheme of the treemap could be adjusted, as the current colours make the categories hard to compare.

Title: RAPE IN INDIA: A visual exploration of systemic rape culture

Example3.png

Source: https://adityajain15.github.io/Rape_In_India/

  • This visualisation shows the efficacy of the justice system in India in handling rape cases.
  • Each dot represents a single rape case in India, and the dots will travel to show the final outcome of the case - whether it ends in conviction or acquittal, or is dropped in the middle of the process.


CONSIDERATION & VISUAL SELECTION


Below are a few visualizations and charts we considered for our project.

Visual Considerations Insights / Comments

Title: Chromosome-based Circos Plot

Circos Plot.png

Source: https://jokergoo.github.io/circlize_book/book/

  • Pros:
    • Useful in showing data with multiple tracks in a single plot
    • Demonstrates relationships between variables
    • Could be used to show how different socioeconomic factors affect different crime categories over time
  • Cons:
    • Difficult to construct a Circos plot
    • Overloading plot with information may lead to difficulty in interpreting it

Title: Sunburst Diagram

Sunburst.png

Source: https://www.data-to-viz.com/graph/sunburst.html

  • Pros:
    • Shows hierarchy of multivariate data
    • Visually appealing, and parent nodes are easy to distinguish from leaf nodes
    • Could be used to show the prominence of different crime types at the district level


  • Cons:
    • Difficult to label sunburst diagrams, which makes interactivity important
    • Tree maps will be more effective in displaying information at first glance

Title: Funnel Plot

Funnelplot.png

Source: https://community.jmp.com/t5/JMP-Blog/Graph-Makeover-Where-same-sex-couples-live-in-the-US/ba-p/30616

  • Pros:
    • Able to simultaneously display sample statistics and the corresponding sample size for multiple cases
    • Shows us what lies outside of the upper and lower limits
    • Useful in helping us determine outliers and abnormalities


  • Cons:
    • Depending on the variables used, asymmetry in the funnel plot may be misread as bias when the magnitude of the effect genuinely differs across the populations considered

Title: Geofacet Plot

Geofacet.png

Source: https://hafen.github.io/geofacet/

  • Pros:
    • Generates a plot for each of the different geographical regions
    • Could be used to show us the different rates of crimes across the districts and states of India
    • Easy to interpret as plots are organised according to India's geography


  • Cons:
    • Complex plots are hard to read in the small cells of a geofacet grid
    • In practice, cells are best limited to simple charts such as bar and line charts


PROPOSED STORYBOARDS

Storyboard 1 - Overview - Introduction to Crimes Against Women in India

Visualisation 1: Bar Graph with Treemap

Storyboard1-1.png
  • Aims to show the year-on-year increase in the number of crimes against women
  • Hovering over a particular year displays a treemap showing the breakdown of crime types for that year


Visualisation 2: Line Graph
Storyboard1-2.png
  • Shows the trend for the individual crime type across the years

Storyboard 2 - State-Level Comparison of Crime Rates


Visualisation 1: Choropleth Map
Storyboard2-1.png
  • Shows comparison of crime rates across different states
  • Uses a colour scale for easy identification of states with higher crime rates
  • Allows the user to filter by year in order to visualise changes in crime rates over the years
  • Clicking on the button at the bottom directs the user to Visualisation 3 to view the crime breakdown per state


Visualisation 2: Funnel Plot
Storyboard2-2.png
  • Would be displayed side by side with Visualisation 1
  • If the user clicks on a particular state on the choropleth map, its relative position on the funnel plot will be highlighted
  • The funnel plot allows us to account for differences in population size across states, as we can plot Population Size vs. Number of Crimes against Women for each state
  • Allows us to identify states with abnormally low or high crime rates for further analysis
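One way the funnel plot's upper and lower limits could be derived is sketched below, assuming a normal approximation to the binomial distribution around the overall crime rate. The sketch is in Python for illustration only (the project plans to use R); the function names, the 1.96 multiplier (roughly 95% limits) and all figures are assumptions, not values from the datasets.

```python
import math


def funnel_limits(population, overall_rate, z=1.96):
    """Approximate control limits for a rate at a given population size.

    Uses the normal approximation to the binomial, so the limits narrow
    as the population grows, which gives the plot its funnel shape.
    """
    se = math.sqrt(overall_rate * (1 - overall_rate) / population)
    return max(0.0, overall_rate - z * se), overall_rate + z * se


def flag_outliers(states, z=1.96):
    """Label each state as 'high', 'low' or 'within' the funnel limits.

    `states` maps a state name to a (crimes, population) tuple.
    """
    total_crimes = sum(crimes for crimes, _ in states.values())
    total_population = sum(pop for _, pop in states.values())
    overall_rate = total_crimes / total_population

    flags = {}
    for name, (crimes, pop) in states.items():
        lower, upper = funnel_limits(pop, overall_rate, z)
        rate = crimes / pop
        if rate > upper:
            flags[name] = "high"
        elif rate < lower:
            flags[name] = "low"
        else:
            flags[name] = "within"
    return flags


# Hypothetical (crimes, population) figures for three states.
example = {
    "State A": (500, 1_000_000),
    "State B": (2_000, 1_000_000),
    "State C": (1_170, 1_000_000),
}
flags = flag_outliers(example)
print(flags)
```

States flagged as "high" or "low" are the ones that would sit outside the funnel and be candidates for further analysis.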


Visualisation 3: Geofacet Plot
Storyboard2-31.png
  • The bar graph in each cell shows the occurrence of each crime type in that state
  • Allows easy comparison across states of which crime type is most common
  • Each cell in the grid shows the distribution of crime types rather than a single value, unlike the choropleth map

Storyboard 3 - Analysis of Socioeconomic Factors Contributing to Crime Rates Against Women

Visualisation 1: Correlation Matrix

Storyboard 3-1.png
  • Shows the correlation between each socioeconomic factor and each type of crime against women
  • Colour scale highlights whether the correlation is positive or negative
  • Individual values show how strong the correlation is
  • Clicking on an individual value will highlight the corresponding point on the correlation heat map shown in Visualisation 3-2
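The values behind such a matrix could be computed as in the sketch below, which assumes the Pearson correlation coefficient is the measure used (the proposal does not name one). It is written in Python for illustration only; the column names echo attributes from the datasets, while the numbers and helper names are hypothetical.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences.

    Assumes both sequences vary; a constant sequence would cause a
    division by zero.
    """
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (spread_x * spread_y)


def correlation_matrix(factors, crimes):
    """Correlate every socioeconomic factor with every crime type."""
    return {
        factor: {crime: pearson(f_values, c_values)
                 for crime, c_values in crimes.items()}
        for factor, f_values in factors.items()
    }


# Hypothetical per-district values, one entry per district.
factors = {
    "literacy_rate": [60, 70, 80, 90],
    "avg_household_size": [6.0, 5.5, 5.0, 4.0],
}
crimes = {
    "rape": [40, 30, 20, 10],
    "dowry_deaths": [12, 9, 10, 4],
}
matrix = correlation_matrix(factors, crimes)
print(matrix["literacy_rate"]["rape"])
```

The sign of each entry would drive the colour scale and its magnitude the displayed value.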


Visualisation 2: Correlation Heat Map
Storyboard 3-2.png
  • Shows the correlation between each socioeconomic factor in a more visually appealing manner
  • Colour scale highlights how strongly each pair of variables is correlated
  • Clicking on an individual box will highlight the corresponding point on the correlation matrix shown in Visualisation 3-1


Visualisation 3: Parallel Coordinate Plot
Storyboard 3-3.png
  • Compares the socioeconomic factors and helps visualise the relationships between them
  • Shows how the socioeconomic factors relate to one another with respect to crime rates against women
  • The boxplot helps determine whether each factor falls in the lower or upper range

Storyboard 4 - Analysis of Police Disposal of Crimes Against Women

Visualisation 1: Sankey Plot
Storyboard4.1.png
  • Shows flow of cases involving violence against women from reporting to investigation outcome
  • Shows the efficiency and effectiveness of the justice system in dealing with such violent crimes against women
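The links of such a Sankey plot could be built by counting how many cases move between each pair of consecutive stages. The Python sketch below is illustrative only: the stage names and per-case paths are hypothetical, and the actual police disposal data may already be supplied as aggregate counts rather than individual cases.

```python
from collections import Counter


def sankey_links(paths):
    """Count case flows between consecutive stages.

    `paths` is an iterable of stage sequences, one per case. The result
    maps (source, target) pairs to the number of cases taking that step.
    """
    links = Counter()
    for path in paths:
        for source, target in zip(path, path[1:]):
            links[(source, target)] += 1
    return links


# Hypothetical case histories.
cases = [
    ("Reported", "Investigated", "Charge-sheeted"),
    ("Reported", "Investigated", "Final report submitted"),
    ("Reported", "Withdrawn"),
]
links = sankey_links(cases)
for (source, target), count in sorted(links.items()):
    print(source, "->", target, count)
```

Each (source, target, count) triple corresponds to one band in the Sankey plot, with the band width proportional to the count.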

TECHNOLOGIES


The tools we will use for this project are as follows:

Approach.png


CHALLENGES


Challenges Mitigation Plan

Lack of proficiency in using R and R Shiny

  • Complete DataCamp courses on the relevant technologies
  • Watch tutorial videos
  • Read the documentation

District Crime Rates are in separate files, with different data attributes.

  • Clean the data to ensure that the columns are consistent across files
  • Consolidate the data into one file for the years 2001-2015.
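The cleaning and consolidation steps above can be sketched as follows, assuming the yearly files are CSVs whose headers differ only in naming. The sketch is in Python for illustration (the project plans to use R). `STATE/UT`, `DISTRICT`, `Year` and `Rape` appear in the dataset description; the alternative headers, the sample rows and the helper names are hypothetical.

```python
import csv
import io

# Map each header variant to one standard column name.
# The lower half of this mapping is a hypothetical example of a
# differently labelled file.
COLUMN_ALIASES = {
    "STATE/UT": "state",
    "DISTRICT": "district",
    "Year": "year",
    "Rape": "rape",
    "State": "state",
    "District": "district",
    "YEAR": "year",
}


def read_standardised(handle, aliases):
    """Yield rows with known columns renamed and unknown columns dropped."""
    for row in csv.DictReader(handle):
        yield {aliases[key]: value.strip()
               for key, value in row.items() if key in aliases}


def consolidate(handles, aliases):
    """Combine rows from several files into one list with shared columns."""
    columns = list(dict.fromkeys(aliases.values()))
    combined = []
    for handle in handles:
        for row in read_standardised(handle, aliases):
            # Fill columns missing from a given file with an empty string.
            combined.append({col: row.get(col, "") for col in columns})
    return combined


# In-memory stand-ins for two differently structured files.
file_a = io.StringIO("STATE/UT,DISTRICT,Year,Rape\nX,D1,2012,5\n")
file_b = io.StringIO("State,District,YEAR,Rape,Extra\nX,D1,2013,7,1\n")
rows = consolidate([file_a, file_b], COLUMN_ALIASES)
print(rows)
```

Keeping the alias table separate from the reading logic means a newly discovered header variant only needs one extra mapping entry.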

Difficulty in understanding some of the data attributes due to their local context, such as the different acts for the protection of women found in some of our datasets.

  • Conduct more research on India and its history of crimes against women to get a better understanding of the data.

Difficulty in finding socioeconomic factors at the state level

  • Look to different data sources to find more socioeconomic factors for consideration.

The socioeconomic factors identified may not be indicative of the crime rate against women in India, as there may not be a relationship between the two.

  • Conduct exploratory data analysis (EDA) to check for correlation between each socioeconomic factor and the crime rate, then select the relevant factors from there


TIMELINE


Timeline.png


COMMENTS


Feel free to leave us any comments!

{| class="wikitable"
! No. !! Name !! Date !! Comments
|-
| 1. || Insert your name here || Insert date here || Insert comment here
|-
| 2. || Insert your name here || Insert date here || Insert comment here
|-
| 3. || Insert your name here || Insert date here || Insert comment here
|}