Difference between revisions of "Group11 Report"

From Visual Analytics and Applications

Revision as of 14:21, 13 August 2018

INTRODUCTION

Crime, an act punishable by law, has been committed throughout recorded history and is prevalent in every society. Criminal activity is monitored and recorded by law enforcement agencies for various purposes, one of which is crime analysis: the systematic identification and analysis of patterns and trends in crime and disorder. As crime data becomes increasingly available to the public, geo-spatial and temporal analysis of crime occurrence is maturing and providing better insights. This increased understanding can potentially contribute to enhanced law enforcement efforts and resource management.

In our research, we examine how geographic and date-time variables can be utilised along with crime type, vicinity and population data to better understand crime occurrences in the City of New York. Crime details were obtained from the New York Police Department (NYPD) complaint repository, and the population data for the City of New York were obtained from the NYC Open Data Government repository. The research is supported by an interactive application built on R Shiny that allows users to explore, analyse and visualize the data. We chose R to build the web application because of its rich library of packages for statistical analysis and data visualization. With the visualizations and user interface in this application, the user can easily filter, transform and visualize crime data to derive insights. As a free software environment for statistical computing and graphics, R is available to many users, which should further encourage the spread of such visual analytics studies and initiatives across other fields.

OBJECTIVES AND MOTIVATION

The overall yearly crime numbers in NYC have seen a declining trend. However, in January 2017, the Public Advocate for the City of New York published an open letter to the Commissioner of the New York Police Department (NYPD) addressing the issue of high crime rates in certain neighbourhoods of NYC. The letter points out that the precincts associated with these neighbourhoods have a shortage of detectives compared to precincts in neighbourhoods with fewer crimes, and describes an ongoing issue of poor resource allocation and deployment of police personnel relative to crime numbers. For our crime analysis, we have used the NYPD crime data, which spans crime records across the 5 boroughs of NYC, comprising 77 precincts. We aim to provide a visual tool to observe crime patterns with respect to time and location. With this, the rate of crime can be analyzed and forecasted so that law enforcement agencies can estimate the right number of resources to deploy. Our research aims to incorporate geo-spatial and temporal analytics for better insights on crime occurrence, modelled using the rich data of crime occurrences in New York City. The approach may be replicated as similar data becomes available in other cities across the world.

Through our analysis, we hope to address the following:
1. Provide a visual representation of crime statistics: Our visual dashboard provides a one-stop view for the exploratory analysis of crime statistics. The user will be able to view, compare and analyse crime statistics based on type of crime, location, vicinity of crimes and time of day.
2. Forecast crime numbers at Precinct level: We aim to forecast and visualize the number of crimes at a Precinct level.
3. Calculate crime rates: Using crime numbers and population data, we calculate the crime rate for each precinct.

PREVIOUS WORKS

Given the easy availability and extensive detail of these datasets, we expected apps and dashboards on crime analysis to exist for New York City and for other cities and countries. One such example for New York is available at https://minghao.shinyapps.io/crime_analysis/.
Although this app shows the overall crime levels, it does not visualize crime statistics at a precinct level.
For crimes at vicinity level, the app shows a scatter plot of the number of crimes against the number of facilities, with each color representing a type of crime.
The vicinity is shown at only two levels of detail, i.e. Public Facility and Residential Area. A breakdown by type of vicinity would give the user a better picture of the types of crime occurring at different vicinities.
While this app provides the necessary visualizations at an overall level, our app provides visuals at a more granular level, i.e. at Precinct level, with the aim of a more detailed analysis of crime locations, time periods and crime types. Our analysis also involves time series forecasting, which uses ARIMA techniques to give high-accuracy monthly forecasts for two years, based on 10 years of monthly crime data at a Precinct level.
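The forecasting step described above can be illustrated with a minimal sketch in base R. It is not the report's actual model: the monthly counts are simulated, the seasonal ARIMA orders are illustrative rather than tuned, and the names `counts`, `fit` and `forecast_table` are hypothetical.

```r
# Simulated monthly complaint counts for one hypothetical precinct
# (120 months = 10 years). These values are NOT NYPD data.
set.seed(42)
months <- 120
level  <- 500 + cumsum(rnorm(months, mean = -0.8, sd = 5))   # drifting level
season <- 40 * sin(2 * pi * (seq_len(months) - 1) / 12)      # yearly cycle
counts <- ts(round(level + season + rnorm(months, sd = 10)),
             start = c(2006, 1), frequency = 12)

# Seasonal ARIMA(0,1,1)(0,1,1)[12]; the orders are an assumption
# chosen for illustration only.
fit <- arima(counts, order = c(0, 1, 1),
             seasonal = list(order = c(0, 1, 1), period = 12))

# Forecast 24 months ahead with approximate 95% intervals.
fc <- predict(fit, n.ahead = 24)
forecast_table <- data.frame(
  month = as.numeric(time(fc$pred)),
  point = as.numeric(fc$pred),
  lower = as.numeric(fc$pred - 1.96 * fc$se),
  upper = as.numeric(fc$pred + 1.96 * fc$se)
)
head(forecast_table)
```

In practice, one model of this kind would be fitted per precinct, with the orders selected from the data rather than fixed in advance.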

DATASET AND DATA PREPARATION

The NYPD complaint data, the Precinct data and the NYC Population data were obtained from the NYC OpenData Government site (https://opendata.cityofnewyork.us/).

NYPD Complaint Data
The full dataset consists of 5.6 million rows of crime occurrences from 2006 to 2016, with 23 variables (the full list of variables and descriptions is available in the Appendix). Each row includes the date and time at which the crime occurred. For our analysis, we use only a few of these variables (the final list of variables is given in the Appendix). Time and date values had to be converted to a uniform format, and certain categorical variables such as Time of Day, Type of Offence and Vicinity of Crime had to be grouped into a smaller list of categories.
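The preparation steps described above can be sketched in base R as follows. The column names, the two date formats and the time-of-day bins are assumptions made for illustration; the report does not specify them.

```r
# Hypothetical raw records; column names and formats are assumptions.
raw <- data.frame(
  date = c("01/15/2016", "2016-03-02", "07/04/2016"),
  time = c("23:30:00", "08:05:00", "14:45:00"),
  stringsAsFactors = FALSE
)

# Bring dates that arrive in either of two formats into one Date column.
parse_date <- function(x) {
  d <- as.Date(x, format = "%m/%d/%Y")
  iso <- is.na(d)                      # rows that did not match the first format
  d[iso] <- as.Date(x[iso], format = "%Y-%m-%d")
  d
}
raw$date <- parse_date(raw$date)

# Group the hour of day into four illustrative categories:
# 0-5 Night, 6-11 Morning, 12-17 Afternoon, 18-23 Evening.
hour <- as.integer(substr(raw$time, 1, 2))
raw$time_of_day <- cut(hour,
                       breaks = c(-1, 5, 11, 17, 23),
                       labels = c("Night", "Morning", "Afternoon", "Evening"))
raw
```

The same `cut()`-style grouping could be applied to offence type or vicinity by mapping the detailed values to a shorter list of labels.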
Precinct Data
The Precinct level data contains information and shape files for each precinct in NYC. This data was ready for analysis without further preparation. (The list of variables is given in the Appendix.)
Crime statistics Data
The aim is to estimate the number of resources required by the NYPD based on the number of crimes (including forecasted figures). The datasets used for this are the Crime Rate dataset and the Num_Offc dataset. The two tables share the same format: each contains 77 records, one per precinct, with columns representing the years from 2010 to 2018. The tables contain year-wise aggregates of the crime rate and of the number of officers required for deployment, respectively. (The calculations and the data preparation can be found in the Appendix.)
NYC Population Data
The Population data is the Precinct level population data for the year 2010. For our crime rate analysis, we estimate the population for the other years from the 2010 figures using the percentage increase in population. (The list of variables is given in the Appendix.)
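As a rough sketch of this calculation, the R snippet below projects a 2010 precinct population forward and derives a crime rate. All numbers are made up, and the compounding of a constant annual growth rate and the "per 1,000 residents" scaling are assumptions; the report's own calculations are in its Appendix.

```r
# Hypothetical inputs for one precinct; none of these values come from the report.
pop_2010      <- 100000
annual_growth <- 0.005                 # assumed 0.5% population increase per year
years         <- 2010:2013
crimes        <- c(2500, 2400, 2450, 2300)

# Compound the 2010 population forward to each later year (assumption).
population <- pop_2010 * (1 + annual_growth)^(years - 2010)

# Crime rate expressed per 1,000 residents (assumed scaling).
crime_rate <- crimes / population * 1000

data.frame(year = years,
           population = round(population),
           crime_rate = round(crime_rate, 2))
```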

DESIGN FRAMEWORK AND VISUALIZATION METHODOLOGY

Following the best practices of visual analytics, our analysis of the NYPD Crime dataset is organised as follows:

1. A high-level view on the overall crime numbers and statistics.
2. ARIMA forecasting using the time-series data.
3. Crime Statistics and Deployment of Resources.

INSIGHTS AND IMPLICATIONS

While the overall crime numbers and rates have fallen over the years, certain crime types have seen an increase. Brooklyn has the highest overall crime numbers, but the Bronx has the highest crime rates for Dangerous Weapons and Drugs and Alcohol related crimes. Manhattan has the highest number of Sex related crimes compared to the other boroughs. The NYPD should deploy its specialized crime units in the affected boroughs based on these findings. Traffic offences have increased significantly over the years and tend to be higher during the evening, which suggests that the NYPD traffic unit should deploy extra officers and resources during these peak hours.

Crimes like assault and harassment mostly occur at places of residence and on the street, while crimes like larceny also tend to occur at stores and supermarkets. While most major crimes, such as dangerous weapons offences, assault, sex crimes and drugs and alcohol related crimes, occur at night, burglaries are most frequent during the afternoon and evening. The NYPD can plan its patrolling patterns and shifts using these insights.

From a seasonal point of view, overall crime numbers are low during the winter season, i.e. from November to February. Holidays like Christmas and Thanksgiving saw low crime numbers, but other significant holidays like Halloween, the 4th of July and St. Patrick's Day saw an increase in the number of crimes. Police officers and patrol units should be deployed accordingly during these days and seasons.

FUTURE WORK

Once Police Officer counts are available at a precinct level, an optimization model can be built on top of our crime analysis, crime rate and police officer estimation calculations to allocate the right number of police officers to each Precinct efficiently.
The NYPD crime data can also be used to create a location-based predictive model such as the Epidemic-Type Aftershock Sequence (ETAS) model for crime prediction. The ETAS model would use past crime data such as location, time of day and type of crime to predict hotspot locations for future crimes based on probability values. Visualization of the findings would assist law enforcement agencies in strategic or tactical action. The ETAS algorithm builds on reaction-diffusion models of crime.
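For orientation, ETAS-type models are commonly written as a self-exciting point process whose conditional intensity is a background rate plus contributions triggered by past events. The form below is the generic textbook expression, not a model fitted in this report; the symbols \mu (background rate), g (triggering kernel) and (x_i, y_i, t_i) (location and time of past crime i) are notation introduced here for illustration.

```latex
\lambda(x, y, t) = \mu(x, y) + \sum_{i:\, t_i < t} g(x - x_i,\; y - y_i,\; t - t_i)
```

Locations where \lambda is high at a given time would be flagged as candidate hotspots.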

ACKNOWLEDGEMENTS

The authors wish to thank Prof. Kam Tin Seong for his guidance on the various analytical techniques and R packages that may be used and feedback on visualisation techniques. We would also like to thank Tan Ying Xuan, Nurul Asyikeen Binte Azhar and Rachel Tong of Term 1 2017-18.