ANLY482 AY2017-18 T2 Group 31 Model Building and Analysis

From Analytics Practicum

Model Building

Modified L Test via Ripley's K Function
To determine:
1. Whether the notifications are clustered or randomly distributed in our area of interest
2. The minimum radius at which clustering becomes statistically significant

RKFunction.png
RKFunctionFormula.png

The number of observed notifications is compared with the number expected under Complete Spatial Randomness (CSR)
CSR assumes that points are distributed homogeneously over the study area
Null hypothesis: the spatial points are randomly distributed, tested at alpha = 0.01
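The comparison between observed and expected counts can be sketched in a few lines. This is a minimal illustration only: the function names, the toy coordinates, and the window area are hypothetical, and the estimator below applies no edge correction, which a full analysis would normally include.

```python
import math

def ripley_k(points, r, area):
    """Naive estimate of Ripley's K at radius r (assumption: no edge correction)."""
    n = len(points)
    pairs = 0
    for i, (xi, yi) in enumerate(points):
        for j, (xj, yj) in enumerate(points):
            # Count ordered pairs of distinct points no further apart than r
            if i != j and math.hypot(xi - xj, yi - yj) <= r:
                pairs += 1
    return area * pairs / (n * (n - 1))

def csr_k(r):
    """Theoretical K under CSR: the area of a disc of radius r."""
    return math.pi * r ** 2

# Hypothetical toy data: a tight cluster plus one distant point in a 10 x 10 window
points = [(1.0, 1.0), (1.2, 1.1), (0.9, 1.3), (1.1, 0.8), (9.0, 9.0)]
observed = ripley_k(points, r=1.0, area=100.0)
expected = csr_k(1.0)
print(observed, expected)
```

An observed value well above the CSR value at a given radius points towards clustering at that scale. Whether the difference is significant is judged against the simulation envelope.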

GraphicalOutputK.png

Bold line represents the observed values for a range of 𝑟
Red dotted line represents the expected theoretical value for a range of 𝑟
Grey area is the confidence envelope obtained from 100 Monte Carlo simulations under the CSR assumption
For each simulated point pattern, 𝐾(𝑟) is estimated over a range of 𝑟. The maximum and minimum of these simulated functions define the upper and lower bounds of the envelope
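The envelope construction described above can be sketched as follows. This is an illustrative sketch under stated assumptions: a rectangular study window, a naive K estimator without edge correction, and hypothetical names and parameter values (`csr_envelope`, `n_points=30`, the list of radii). It is not the procedure of any particular package.

```python
import math
import random

def ripley_k(points, r, area):
    """Naive K estimate (assumption: no edge correction)."""
    n = len(points)
    pairs = sum(
        1
        for i, (xi, yi) in enumerate(points)
        for j, (xj, yj) in enumerate(points)
        if i != j and math.hypot(xi - xj, yi - yj) <= r
    )
    return area * pairs / (n * (n - 1))

def csr_envelope(n_points, radii, width, height, n_sim=100, seed=0):
    """Pointwise min/max of K over n_sim uniform random (CSR) patterns."""
    rng = random.Random(seed)
    area = width * height
    lower = [math.inf] * len(radii)
    upper = [-math.inf] * len(radii)
    for _ in range(n_sim):
        # One simulated pattern: points placed uniformly in the window
        sim = [(rng.uniform(0, width), rng.uniform(0, height)) for _ in range(n_points)]
        for idx, r in enumerate(radii):
            k = ripley_k(sim, r, area)
            lower[idx] = min(lower[idx], k)
            upper[idx] = max(upper[idx], k)
    return lower, upper

radii = [0.5, 1.0, 2.0]
lower, upper = csr_envelope(n_points=30, radii=radii, width=10.0, height=10.0)
for r, lo, hi in zip(radii, lower, upper):
    print(r, lo, hi)
```

An observed 𝐾(𝑟) above the upper bound at some radius would be read as clustering at that radius, and one below the lower bound as dispersion.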

Converting the K function to the L function, and then to the Modified L function

Converting.png
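The formulas in the figure are not reproduced in the text, so the sketch below assumes the commonly used transformations L(r) = sqrt(K(r) / pi) and Modified L(r) = L(r) - r. Some texts use the opposite sign convention. The function names are hypothetical.

```python
import math

def l_function(k_value):
    """Assumed transform: L(r) = sqrt(K(r) / pi), which is linear in r under CSR."""
    return math.sqrt(k_value / math.pi)

def modified_l(k_value, r):
    """Assumed transform: L(r) - r, which is zero under CSR."""
    return l_function(k_value) - r

# Under CSR, K(r) = pi * r^2, so the modified L value is (numerically) zero
print(modified_l(math.pi * 2.0 ** 2, 2.0))

# A K value larger than the CSR value gives a positive modified L value
print(modified_l(60.0, 1.0))
```

With this convention the CSR reference becomes a horizontal line at zero, which makes departures from randomness easier to read off than on the K plot.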

Interpreting the Modified L Test graph

Interpret.png

Kernel Density Estimation
To determine:
1. Clusters of locations with a higher occurrence of indiscriminate parking

A function (kernel 𝑘) of a given radius (𝑟) “visits” each point in the study region. 𝑘 provides the weight of the area surrounding 𝑠 in proportion to its distance to 𝑠_𝑖

KDEformula.png

π‘˜ is calculated as a function of the distance between point 𝑠 and 𝑠_𝑖, over given radius π‘Ÿ
The density of the study region is obtained by summing π‘˜ of all points 𝑠_𝑖 within π‘Ÿ
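The summation above can be sketched for a single location 𝑠. This page notes that the quartic kernel type is used in QGIS, so the sketch uses a quartic shape. The unnormalised form (1 - (d/r)^2)^2 is an assumption here, since scaling constants differ between tools, and the function names and coordinates are hypothetical.

```python
import math

def quartic_kernel(distance, radius):
    """Unnormalised quartic weight (assumed form); zero at or beyond the radius."""
    if distance >= radius:
        return 0.0
    u = distance / radius
    return (1.0 - u * u) ** 2

def density_at(s, events, radius):
    """Sum the kernel weights of all events within the radius of location s."""
    return sum(
        quartic_kernel(math.hypot(s[0] - x, s[1] - y), radius)
        for x, y in events
    )

# Hypothetical events: two close together, one far away
events = [(0.0, 0.0), (1.0, 0.0), (5.0, 5.0)]
print(density_at((0.0, 0.0), events, radius=2.0))  # 1.0 + 0.5625 + 0.0 = 1.5625
```

The distant event contributes nothing because it lies outside the search radius, which is why the choice of radius matters so much for the resulting surface.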

LargeBW.png SmallBW.png

Kernel density estimates are sensitive to the choice of radius
A large radius gives a smoother surface, but local details are obscured
A small radius produces many small spikes that are very localised
The statistically significant radius obtained from the Modified L Test is used as the search radius around each event
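The sensitivity to radius can be illustrated with a one-dimensional simplification. Everything here is an assumption: the events lie along a hypothetical transect, the kernel is an unnormalised quartic, and the count of local peaks in the density profile is only a crude measure of "spikiness".

```python
def quartic_kernel(distance, radius):
    """Unnormalised quartic weight (assumed form); zero at or beyond the radius."""
    if distance >= radius:
        return 0.0
    u = distance / radius
    return (1.0 - u * u) ** 2

def density_profile(events, xs, radius):
    """Kernel density along a 1-D transect, evaluated at positions xs."""
    return [sum(quartic_kernel(abs(x - e), radius) for e in events) for x in xs]

def count_peaks(values):
    """Number of strict local maxima in the profile."""
    return sum(
        1
        for i in range(1, len(values) - 1)
        if values[i - 1] < values[i] > values[i + 1]
    )

# Hypothetical event positions along a transect
events = [10, 12, 40, 42, 70]
xs = list(range(0, 81))

small = count_peaks(density_profile(events, xs, radius=5))
large = count_peaks(density_profile(events, xs, radius=50))
print(small, large)  # 3 1
```

With the small radius each group of events forms its own spike. With the large radius the groups merge into a single broad peak and the separate groups can no longer be told apart.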

Interpolate.png

Interpolation is performed to smooth the resulting surface
Individual kernels are summed to produce a smooth surface
The quartic kernel type is used in QGIS
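As a rough sketch of how the individual kernels add up to a surface, the density can be evaluated at the centre of every cell of a raster grid. This is an illustration under assumptions, not the QGIS implementation: the grid layout, cell size, unnormalised quartic form, and all names are hypothetical.

```python
import math

def quartic_kernel(distance, radius):
    """Unnormalised quartic weight (assumed form); zero at or beyond the radius."""
    if distance >= radius:
        return 0.0
    u = distance / radius
    return (1.0 - u * u) ** 2

def density_grid(events, radius, x_cells, y_cells, cell_size):
    """Sum the kernels of all events at each cell centre of a regular grid."""
    grid = []
    for row in range(y_cells):
        cy = (row + 0.5) * cell_size
        line = []
        for col in range(x_cells):
            cx = (col + 0.5) * cell_size
            line.append(
                sum(quartic_kernel(math.hypot(cx - x, cy - y), radius) for x, y in events)
            )
        grid.append(line)
    return grid

# Hypothetical events: three near the lower-left corner, one isolated
events = [(1.5, 1.5), (1.5, 2.5), (2.5, 1.5), (7.5, 7.5)]
grid = density_grid(events, radius=2.0, x_cells=10, y_cells=10, cell_size=1.0)

# Locate the cell with the highest summed density
hottest = max(
    (value, row, col)
    for row, line in enumerate(grid)
    for col, value in enumerate(line)
)
print(hottest)  # (2.125, 1, 1)
```

The hottest cell sits where the kernels of the three neighbouring events overlap, while the isolated event only reaches a value of 1.0 in its own cell.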