REMGIS Proposal
Project Motivation
Project Objectives
1. Analyze the factors that make Singapore’s CBD successful
a. Identify and study the make-up of professional services within Singapore’s CBD
b. Examine the landmarks that contribute to the CBD’s success
c. Understand the factors that influence business locations
2. Evaluate how well Singapore’s CBD success factors can be replicated in the upcoming second business district, Jurong Lake District (JLD)
Data Preprocessing
Data Source
S/N | Description | Source |
---|---|---|
1 | Legal | http://www.yellowpages.com.sg/search/all/Legal |
2 | Bank | http://www.yellowpages.com.sg/category/banks |
3 | Accountancy | http://www.yellowpages.com.sg/search/all/Accountancy |
4 | Architectural | http://www.yellowpages.com.sg/search/all/Architectural |
5 | Management Consultancy | http://www.yellowpages.com.sg/search/all/Management+Consultancy |
Data Collection
To retrieve the records of all companies in these sectors, we used R to scrape the company listings from the Yellow Pages website pages listed above. Once all the records were retrieved successfully, they were saved to a CSV file.
Data Cleaning
One problem with the data was that some companies had incorrect postal codes. The incorrect postal codes followed two patterns: either the postal code had only 5 digits instead of 6, or its leading digits (the postal district) did not correspond to any district in Singapore. These records had to be found manually by looking through each company’s address for postal codes that exhibited one or both of these patterns.
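The two patterns described above can also be flagged automatically before the manual check. The following is a minimal sketch in Python, not the script used in the project (which worked in R). The function name, the sample records, and the assumption that valid postal sectors (the first two digits) run from 01 to 82 are illustrative and should be checked against an authoritative list.

```python
import re

# Assumption: valid Singapore postal sectors (first two digits) run from 01 to 82.
# Replace this range with an authoritative list if one is available.
VALID_SECTORS = range(1, 83)


def postal_code_problems(postal_code):
    """Return a list of problems found in a postal code (empty if none)."""
    code = postal_code.strip()
    problems = []
    # Pattern 1: the code does not have exactly six digits (e.g. only five).
    if not re.fullmatch(r"\d{6}", code):
        problems.append("not six digits")
    # Pattern 2: the leading two digits are not a known postal sector.
    if len(code) >= 2 and code[:2].isdigit() and int(code[:2]) not in VALID_SECTORS:
        problems.append("unknown postal sector")
    return problems


# Hypothetical records, for illustration only.
records = {"Firm A": "048616", "Firm B": "48616", "Firm C": "990001"}
checked = {name: postal_code_problems(code) for name, code in records.items()}
flagged = {name: issues for name, issues in checked.items() if issues}
print(flagged)
```

Records that end up in `flagged` would still need to be corrected by hand against the company’s address, as described above.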
After addressing the postal code issue, we realized that to map and analyze the data in R, we needed the latitude and longitude coordinates of each company. These were retrieved using the "geocode" function from the "ggmap" package.
Here is a snippet of what the finalized dataset looks like:
Related Works
London’s Central Business District
Goal of study: Examine the range of London’s CBD activities, dynamism and competition of the CBD, the agglomeration of the various types of industries as well as the CBD’s future opportunities.
Kernel density estimation was used to measure the density of professional services jobs per square kilometer. Although we were more interested in the makeup of firms in the CBD than in the number of jobs, we decided that we could build upon this idea and apply kernel density estimation to the various types of firms in Singapore’s CBD instead.
Guangzhou and Shenzhen
Goal of study: Analyze the cartographic definition and representation of the CBD by studying its urban development and functions.
A concentration index is presented to visualize the urban environment, using a density surface that is refined with network distances instead of Euclidean ones. One of the paper’s conclusions is that area-based methods of urban analysis, such as kernel density estimation, are widely used to generate a continuous surface with a density attribute. This encouraged us to carry on with our initial idea of using kernel density estimation to analyze the makeup of firms in the CBD and JLD.
Methodology
Location Quotient Analysis
One way to measure success is to determine whether a region’s local needs are being met. We therefore decided to analyse the location quotient (LQ) values of the five sectors within the CBD and JLD areas and base our conclusions on them.
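As a reminder of how a location quotient is usually computed: it is the sector’s share of firms in the region divided by the sector’s share of firms in a larger reference area. The sketch below is in Python for illustration only. The firm counts are hypothetical, and the choice of reference area is an assumption, since the proposal does not state which one is used.

```python
def location_quotient(sector_in_region, total_in_region,
                      sector_in_reference, total_in_reference):
    """LQ = (sector share in the region) / (sector share in the reference area)."""
    regional_share = sector_in_region / total_in_region
    reference_share = sector_in_reference / total_in_reference
    return regional_share / reference_share


# Hypothetical counts: 250 of 1000 firms in the region belong to the sector,
# versus 500 of 4000 firms in the reference area.
lq = location_quotient(250, 1000, 500, 4000)
print(lq)
```

An LQ above 1 means the sector is more concentrated in the region than in the reference area, while an LQ below 1 means it is under-represented.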
Kernel Density Estimation
To measure how densely each sector is concentrated in the CBD, we decided to apply kernel density estimation. This shows the density of each sector and reveals which areas of the CBD are more or less dense.
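To make the idea concrete, the sketch below evaluates a Gaussian kernel density estimate at a single location. It is written in Python for illustration and is not the project’s implementation. The coordinates are hypothetical, and the isotropic Gaussian kernel and the bandwidth values are assumptions.

```python
import math


def gaussian_kde_2d(points, x, y, bandwidth):
    """Estimate point intensity at (x, y) as a sum of isotropic Gaussian kernels."""
    norm = 1.0 / (2.0 * math.pi * bandwidth ** 2)
    total = 0.0
    for px, py in points:
        sq_dist = (x - px) ** 2 + (y - py) ** 2
        total += norm * math.exp(-sq_dist / (2.0 * bandwidth ** 2))
    return total  # expected number of points per unit area


# Hypothetical firm locations: a tight cluster plus one distant firm.
firms = [(0.0, 0.0), (0.5, 0.0), (0.0, 0.5), (0.5, 0.5), (10.0, 10.0)]

for bw in (0.5, 3.0):
    in_cluster = gaussian_kde_2d(firms, 0.25, 0.25, bw)
    in_gap = gaussian_kde_2d(firms, 5.0, 5.0, bw)
    print(f"bandwidth={bw}: cluster={in_cluster:.4f}, gap={in_gap:.6f}")
```

With the smaller bandwidth the estimate peaks sharply at the cluster and is close to zero in the gap. With the larger bandwidth the peak is lower and density spreads into the gap, which gives a smoother surface.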
Quadrat Analysis
To determine whether firms in the CBD are evenly spaced, randomly distributed or clustered, we decided to use quadrat analysis. We will test each of the five sector point patterns for Complete Spatial Randomness (CSR), based on quadrat counts, using the quadrat.test() function of the “spatstat” package.
For the analysis, we will run a Monte Carlo test with 1000 trials instead of the chi-squared test. This removes the requirement that every quadrat’s expected count be at least 5, which matters because a pre-run showed that the number of firms per quadrat in the CBD window can range from 0 upwards.
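The logic of a Monte Carlo quadrat test can be sketched as follows. This is a simplified Python illustration under stated assumptions, not the spatstat implementation: the window is the unit square rather than the CBD boundary, the test is one-sided (it only looks for clustering), and the point data are synthetic.

```python
import random


def quadrat_counts(points, nx, ny):
    """Count points in an nx-by-ny grid over the unit square."""
    counts = [0] * (nx * ny)
    for x, y in points:
        col = min(int(x * nx), nx - 1)
        row = min(int(y * ny), ny - 1)
        counts[row * nx + col] += 1
    return counts


def chi_square_stat(counts):
    """Pearson statistic against equal expected counts in every quadrat."""
    expected = sum(counts) / len(counts)
    return sum((c - expected) ** 2 / expected for c in counts)


def monte_carlo_quadrat_test(points, nx, ny, nsim=1000, seed=0):
    """Return a one-sided Monte Carlo p-value for clustering versus CSR."""
    rng = random.Random(seed)
    observed = chi_square_stat(quadrat_counts(points, nx, ny))
    n = len(points)
    at_least_as_extreme = 0
    for _ in range(nsim):
        # Simulate CSR: the same number of points, placed uniformly at random.
        simulated = [(rng.random(), rng.random()) for _ in range(n)]
        if chi_square_stat(quadrat_counts(simulated, nx, ny)) >= observed:
            at_least_as_extreme += 1
    return (at_least_as_extreme + 1) / (nsim + 1)


# Synthetic, strongly clustered pattern: 40 points packed into one corner.
rng = random.Random(1)
clustered = [(rng.random() * 0.2, rng.random() * 0.2) for _ in range(40)]
p_value = monte_carlo_quadrat_test(clustered, nx=5, ny=5, nsim=1000)
print(p_value)
```

Because the reference distribution of the statistic is built by simulation rather than taken from the chi-squared distribution, small expected counts per quadrat do not invalidate the p-value.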
Web Application Design
Inspiration
We gathered inspiration for the application’s UI by looking through and using multiple projects done by past groups. We combined the elements that we liked and then came up with a storyboard for our own application.
Storyboard
Application Architecture
Architecture Overview
Application Layers
Application Overview
Map Tab
Features
- Allows user to upload their own firms data
  - The firms data must be a CSV file in the same format as the data collected by our scraping script; a template of the data format can be downloaded.
- Allows user to select the analysis to perform
- Allows user to select the respective sector on the map
- Allows user to set the bandwidth of the kernel density plot
  - A larger kernel distance gives a smoother plot, while a smaller kernel distance gives a plot with more noise.
Spatial Point Analysis
KDE Analysis
Kernel Density Map Comparison Tab
Location Quotient Tab
Features
- Allows user to toggle the respective sector on the map
- Shows LQ figures for both CBD and JLD
- Shows the number of firms in the respective regions
- Overview of CBD LQ in Bar Chart
- Overview of JLD LQ in Bar Chart
Quadrat Analysis Tab
Features
- Allows user to toggle the respective sector on the map
- Allows user to select the number of rows and columns for the quadrat map
All Professional Firms
Selected Firms
Project Timeline
Project Challenges
S/N | Challenges | Solutions |
---|---|---|
1 | Data scraping API is not readily accessible from Yellow Pages | |
2 | Geocoding from postal codes to X and Y coordinates to perform thematic mapping | |
3 | Inexperienced with R Shiny package and R programming | |
4 | Unfamiliar with the implementation of spatial analysis methods such as: | |
Project Tools & Technologies