Qui Vivra Verra - Geospatial Dashboard

From Analytics Practicum
Revision as of 16:59, 2 December 2016 by Cxpong.2013 (talk | contribs)

Review of Similar Work

Interactive Visualization of Geospatial Data

Building a dashboard for interactive visualization of geospatial data is not a novel idea; many organizations around the world, including the Singapore Government, have built one. OneMap.sg is the Singapore government’s service for visualizing the geospatial information provided by various government agencies, and it offers a variety of useful functions to aid users in visualizing data, which we review here.

Firstly, OneMap.sg provides a large choice of base layers and data layers, allowing users to visualize geospatial data about a wide variety of subjects. Users are also able to upload their own datasets onto the visualization service. The ability for users to upload additional data for visualization could be applied to the project, for which we will prepare various data layers that are most likely to have an impact on library patronage.

Next, OneMap.sg allows users to filter out unwanted data through a range of query functions like LandQuery, SchoolQuery and BizQuery, so that users see only the data they want. This is another useful function with relevant applications for the project, as it avoids cluttering the map. It will be implemented as selectable layers in the dashboard, allowing users to choose the layers they wish to visualize.

In addition to the large amount of data available for visualization on OneMap.sg and the utility functions, the dashboard also provides some basic analysis functions, like allowing users to measure distances and areas on the map.

Geospatial1.png

While OneMap.sg is useful for an introductory exploration of geospatial data, its lack of advanced analysis functions limits the insights that can be drawn from the data provided. However, it is sufficient to serve as a basic design for the project.

Dashboard Visualizations

The dashboard is built in R to leverage its statistical methods and wide range of libraries for creating the dashboard functions and interface. Furthermore, as R is open source, our sponsor can access the dashboard without purchasing commercial software.

List of R packages used:

  • maptools: To manipulate spatial data
  • rgdal: To read in the KML files of the subzones and planning areas using readOGR
  • leaflet: To create interactive web maps
  • geosphere: To get the distance between a library and a subzone centroid
  • classInt: To group data points into 5 classes using Jenks classification
  • plyr: To rename rows and columns of data frames
  • shinydashboard: To quickly create the look and feel of a dashboard

Markers and Layers

Geospatial2.png

The dashboard for our project is built using a combination of R Shiny, for its analytics functions like clustering and Huff’s model, and Leaflet.js, for its geospatial visualization methods. To build the layers, we first read the CSV and KML files containing information about the layers into R, using R’s read.csv method for the CSV files and rgdal’s readOGR method for the KML files.

Geospatial3.png

The data frames are then converted to layers of markers, and placed on a base map using leaflet’s addTiles and addMarkers methods.

Geospatial4.png
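
As a rough sketch of the steps above, the snippet below turns two data frames into marker layers on a base map. The data frames, their column names (`name`, `lng`, `lat`), the coordinates and the group labels are all made up for illustration; in the project the data frames would come from `read.csv`. The layer control at the end is one possible way to offer selectable layers and is an assumption, not necessarily how the dashboard does it.

```r
library(leaflet)

# Hypothetical stand-ins for the data frames returned by read.csv();
# the column names and coordinates are assumptions for illustration.
libraries <- data.frame(
  name = c("Library A", "Library B"),
  lng  = c(103.8545, 103.7436),
  lat  = c(1.2976, 1.3331)
)
stations <- data.frame(
  name = c("Station X", "Station Y", "Station Z"),
  lng  = c(103.8520, 103.8456, 103.7423),
  lat  = c(1.2931, 1.3009, 1.3330)
)

# Each data frame becomes one group of markers on top of the default base tiles.
map <- leaflet() %>%
  addTiles() %>%
  addMarkers(data = libraries, lng = ~lng, lat = ~lat,
             popup = ~name, group = "Libraries") %>%
  addMarkers(data = stations, lng = ~lng, lat = ~lat,
             popup = ~name, group = "Train stations") %>%
  # Checkboxes that let the user show or hide each layer.
  addLayersControl(overlayGroups = c("Libraries", "Train stations"),
                   options = layersControlOptions(collapsed = FALSE))

# Printing the widget renders the interactive map.
if (interactive()) print(map)
```

Giving every call to `addMarkers` its own `group` is what lets the layer control toggle each dataset independently.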

This basic visualization allows users to explore the data provided, and view the distribution of the locations of the facilities and libraries around the country.

Adjustable Buffer

Geospatial5.png

The dashboard provides some descriptive statistics for users through the adjustable buffer function. Using the controls on the right-hand side of the dashboard, users first select the library they wish to visualize from a drop-down list, then set the radius of the buffer area around that library with a slider.
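
The controls described above could be wired up roughly as follows. This is a minimal sketch that uses plain Shiny layout functions instead of shinydashboard; the input ids (`library`, `radius`), the slider range and the library data are assumptions made for illustration.

```r
library(shiny)
library(leaflet)

# Hypothetical library list; names and coordinates are assumptions.
libraries <- data.frame(
  name = c("Library A", "Library B"),
  lng  = c(103.8545, 103.7436),
  lat  = c(1.2976, 1.3331)
)

ui <- fluidPage(
  sidebarLayout(
    position = "right",  # controls sit on the right-hand side
    sidebarPanel(
      selectInput("library", "Library", choices = libraries$name),
      sliderInput("radius", "Buffer radius (m)",
                  min = 100, max = 5000, value = 1000, step = 100)
    ),
    mainPanel(leafletOutput("map"))
  )
)

server <- function(input, output, session) {
  output$map <- renderLeaflet({
    chosen <- libraries[libraries$name == input$library, ]
    leaflet() %>%
      addTiles() %>%
      addMarkers(data = chosen, lng = ~lng, lat = ~lat, popup = ~name) %>%
      # addCircles() takes its radius in metres, so the slider value maps directly.
      addCircles(data = chosen, lng = ~lng, lat = ~lat, radius = input$radius)
  })
}

app <- shinyApp(ui, server)
if (interactive()) runApp(app)
```

Because the map is built inside `renderLeaflet`, it is redrawn whenever the selected library or the slider value changes.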

The popup on the library displays the number of each type of facility that falls within the buffer. This lets the user tell at a glance how many facilities lie within a set distance of a library, and supports a number of evaluations, such as gauging the centrality of the library from the number of train stations in the vicinity, or the amount of traffic around the library from the number of shopping and tuition centres around it. Furthermore, the statistics provided by the buffer will be used to calculate the attractiveness index of a library for the Huff’s model discussed later in the project. The adjustable buffer is accomplished by the method below.

Geospatial6.png

The method takes in the coordinates of a library, a list of data frames containing the data of the facilities, and the radius of the buffer. It then computes the haversine distance from the library to each point in the data frames, and counts the number of points that lie within the buffer radius.
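
A minimal sketch of such a method is shown below, assuming the haversine distance is computed with `distHaversine` from the geosphere package. The function name `count_in_buffer`, the column names `lng` and `lat`, and the sample coordinates are hypothetical and are not taken from the project code.

```r
library(geosphere)

# Count, per facility type, how many points lie within `radius_m` metres of a
# library. `library_lnglat` is c(longitude, latitude); `facilities` is a named
# list of data frames with `lng` and `lat` columns (assumed column names).
count_in_buffer <- function(library_lnglat, facilities, radius_m) {
  sapply(facilities, function(df) {
    if (nrow(df) == 0) return(0L)
    # distHaversine() expects longitude first and returns metres.
    d <- distHaversine(as.matrix(df[, c("lng", "lat")]), library_lnglat)
    sum(d <= radius_m)
  })
}

# Made-up example data around a library at (103.8545, 1.2976).
facilities <- list(
  stations = data.frame(lng = c(103.8545, 103.8545), lat = c(1.3026, 1.3476)),
  malls    = data.frame(lng = 103.8600, lat = 1.2976)
)
counts <- count_in_buffer(c(103.8545, 1.2976), facilities, radius_m = 1000)
print(counts)
```

The returned named vector has one count per facility type, which is the shape needed to fill in a popup or to feed an attractiveness index.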

Choropleth Map

The dashboard uses a choropleth map to visualize the patron distribution for a selected library, with the colour intensity of the choropleth map representing the number of patrons in a planning area. A higher intensity represents a larger number of patrons. This visualization allows users to look for anomalies in the distribution of a library’s patrons across the country, such as an unusually low or high number of patrons from a planning area going to the library.

Geospatial7.png

To accomplish this visualization, the patron flow data for the selected library is extracted from the patron flow data for all libraries. This subset is then classified into 5 groups using the Jenks classification method, and a colour palette is created from the resulting classes.

Geospatial8.png
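
The classification step can be sketched as below, assuming `classIntervals` from classInt for the Jenks breaks and `colorBin` from leaflet for the palette. The patron counts and the palette name `"YlOrRd"` are made up for illustration.

```r
library(classInt)
library(leaflet)

# Made-up patron counts per planning area for one selected library.
patrons <- c(12, 15, 18, 40, 45, 52, 110, 120, 135, 300, 320, 900)

# Jenks natural breaks: 5 classes give 6 break points (min to max).
jenks <- classIntervals(patrons, n = 5, style = "jenks")
breaks <- jenks$brks

# Map each class to one colour of a sequential palette (palette name assumed).
# right = TRUE closes the bins on the right, matching how classInt treats
# Jenks intervals.
pal <- colorBin("YlOrRd", domain = patrons, bins = breaks, right = TRUE)
colours <- pal(patrons)
print(data.frame(patrons, colours))
```

Passing the Jenks breaks as `bins` means the colour steps follow the natural groupings in the data instead of equal-width intervals.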

Next, polygons are created for each planning area, and coloured according to the number of patrons in each planning area.

Geospatial9.png
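
As an illustration of the polygon step, the sketch below colours three made-up square "planning areas" by patron count. The `square` helper, the coordinates, the patron counts and the fixed bins are all hypothetical; in the project the boundaries would come from the KML file and the bins from the Jenks classification.

```r
library(sp)
library(leaflet)

# Helper that builds one square polygon; real planning-area boundaries would
# come from the KML file instead. All coordinates here are made up.
square <- function(id, lng, lat, size = 0.02) {
  coords <- cbind(c(lng, lng + size, lng + size, lng, lng),
                  c(lat, lat, lat + size, lat + size, lat))
  Polygons(list(Polygon(coords)), ID = id)
}

areas <- SpatialPolygonsDataFrame(
  SpatialPolygons(list(square("A", 103.80, 1.30),
                       square("B", 103.83, 1.30),
                       square("C", 103.86, 1.30))),
  data = data.frame(area = c("A", "B", "C"),
                    patrons = c(20, 150, 800),
                    row.names = c("A", "B", "C"))
)

# Fixed bins stand in for the Jenks breaks.
pal <- colorBin("YlOrRd", domain = areas$patrons, bins = c(0, 50, 200, 1000))

map <- leaflet(areas) %>%
  addTiles() %>%
  addPolygons(fillColor = ~pal(patrons), fillOpacity = 0.7,
              color = "white", weight = 1,
              popup = ~paste0(area, ": ", patrons, " patrons"))
if (interactive()) print(map)
```

The formula `~pal(patrons)` is evaluated against the attribute table of the polygons, so each area gets the colour of its own patron count.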

When a user adds a library to, or removes a library from, the existing list of libraries in the dashboard, the choice probabilities are recalculated and the visualizations are updated accordingly. This allows the user to compare the differences in patron flow when changes are made to the location of a library vis-à-vis the locations of all the libraries.
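
The recalculation can be sketched with the standard form of the Huff model, in which the probability that patrons from area i choose library j is proportional to the library's attractiveness divided by a power of the distance. The function name `huff_probabilities`, the exponent `beta = 2` and all input values are assumptions for illustration; the project's actual attractiveness index and parameters are not given here.

```r
# Huff choice probabilities:
#   P[i, j] = (A[j] / d[i, j]^beta) / sum_k (A[k] / d[i, k]^beta)
# `attractiveness` is a named vector (one value per library), `distance` a
# matrix with planning areas in rows and libraries in columns.
huff_probabilities <- function(attractiveness, distance, beta = 2) {
  stopifnot(ncol(distance) == length(attractiveness))
  utility <- sweep(1 / distance^beta, 2, attractiveness, `*`)
  utility / rowSums(utility)
}

# Made-up inputs: two planning areas, two libraries, distances in km.
attractiveness <- c(LibA = 10, LibB = 20)
distance <- matrix(c(1, 4,
                     3, 2),
                   nrow = 2, byrow = TRUE,
                   dimnames = list(c("Area1", "Area2"), c("LibA", "LibB")))
before <- huff_probabilities(attractiveness, distance)

# Adding a library means adding one attractiveness value and one distance
# column, then recomputing every probability.
after <- huff_probabilities(c(attractiveness, LibC = 15),
                            cbind(distance, LibC = c(2, 1)))
print(round(before, 3))
print(round(after, 3))
```

Each row sums to 1 both before and after the change, so adding a library can only take share away from the existing ones; comparing `before` and `after` shows how much each planning area shifts.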

Implementing Huff's Model


The Adapted RFM Analysis