Difference between revisions of "Social Stratification Mappers Proposal"

From Visual Analytics and Applications
Line 188: Line 188:
 
|  
 
|  
 
<div style="border-left: #56A5EC solid 10px;font-family: Helvetica; padding: 0px 30px 0px 18px; ">
 
<div style="border-left: #56A5EC solid 10px;font-family: Helvetica; padding: 0px 30px 0px 18px; ">
===1. Social Segregation Index at Subzone Level===
+
<b>1. Social Segregation Index at Subzone Level</b>
  
 
While studies have been done on racial, age and housing mix, these were typically done at the aggregate level, measuring a subzone’s racial composition against the national average. While informative, this only tells us where different population groups live, not whether they mix with one another, and so provides few actionable insights that urban planners can turn into operations. What is more useful is knowing whether there are sufficient opportunities for citizens of different races, ages and housing types to mix with one another. In spatial demography, measuring the extent of segregation between population groups helps urban planners to understand social cohesion and integration in our society [4].
Line 198: Line 198:
 
|
 
|
 
<div style="border-left: #56A5EC solid 10px;font-family: Helvetica; padding: 0px 30px 0px 18px; ">
 
<div style="border-left: #56A5EC solid 10px;font-family: Helvetica; padding: 0px 30px 0px 18px; ">
===2. Spatial Point Pattern Analysis at Postal Code Level===
+
<b>2. Spatial Point Pattern Analysis at Postal Code Level</b>
  
 
There has been work done on the visualization of demographic data by subzone. By and large, however, these data are analysed at the subzone level rather than at the postal code level. We analyse our data at the postal code level because social inequality is typically defined at the household level, especially since Singapore is a small country where analyzing by subzone is equivalent to assuming different regions develop unequally. However, a point symbol map would not give a good visualization because of the sheer number of postal codes. Instead, we use kernel density estimation, a frequently used technique for determining hotspots of point locations, to easily identify spatial clusters by race, age and housing type.
Line 206: Line 206:
 
|
 
|
 
<div style="border-left: #56A5EC solid 10px;font-family: Helvetica; padding: 0px 30px 0px 18px; ">
 
<div style="border-left: #56A5EC solid 10px;font-family: Helvetica; padding: 0px 30px 0px 18px; ">
===3. Accessibility to Schools and Parks===
+
<b>3. Accessibility to Schools and Parks</b>
  
 
Using the Hansen Accessibility Model, we mapped each household’s ease of access to schools, both mainstream and elite. Accessibility refers to the ease of reaching a destination. In our application, the origin zones are the HDB blocks and the destination zones are the amenities. A higher Hansen Accessibility Index means that the HDB block has better access to that particular school type, elite or mainstream.
Line 216: Line 216:
 
|-
 
|-
 
|}
 
|}
<!-- END OF MOTIVATION & OBJECTIVES--->
+
<!-- END OF ANALYTICAL MAPPING--->
  
<div style="background: #56A5EC; padding: 20px; line-height: 0.3em; text-indent: 16px;letter-spacing:0.1em;font-size:26px"><font color=#fbfcfd face="Bebas Neue">TIMELINE & MILESTONES</font></div>
+
 
<div style="margin:0px; padding: 10px; background: #f2f4f4; font-family: Open Sans, Arial, sans-serif; border-radius: 7px; text-align:left">
+
<div style="padding: 17px">
 +
</div>
 +
<div style="background: #56A5EC; padding: 20px; line-height: 0.3em; text-indent: 16px;letter-spacing:0.1em;font-size:26px"><font color=#fbfcfd face="Bebas Neue"> DESIGN ARCHITECTURE</font></div>
 
{| class="wikitable" style="background-color:#FFFFFF;" width="100%"
 
{| class="wikitable" style="background-color:#FFFFFF;" width="100%"
The following table shows the timeline and milestones of our project. Details will be updated progressively. <br>
 
</div>
 
<div style="border-left: #56A5EC solid 10px;font-family: Helvetica; padding: 0px 30px 0px 18px; ">
 
</div>
 
 
|-
 
|-
 
|  
 
|  
<b>S/N</b>
+
<div style="border-left: #56A5EC solid 10px;font-family: Helvetica; padding: 0px 30px 0px 18px; ">
||
+
===1. Visual Design Framework===
<b>Task</b>
 
||
 
<b>Done by</b>
 
||
 
<b>Week</b>
 
||
 
<b>Status</b>
 
|-
 
|
 
1
 
|| 
 
Topic Brainstorming
 
||
 
ALL
 
||
 
2 & 3
 
||
 
Completed ✔
 
|-
 
|
 
2
 
|| 
 
Drafting and refinement of Project Proposal
 
||
 
ALL
 
||
 
2 & 3
 
||
 
Completed ✔
 
|-
 
|
 
3
 
|| 
 
Consultation with Prof Kam for Feedback on Proposal
 
||
 
ALL
 
||
 
4
 
||
 
Completed ✔
 
|-
 
|
 
4
 
|| 
 
Finalization of Project Topic and Focus
 
||
 
ALL
 
||
 
5
 
||
 
Completed ✔
 
|-
 
|
 
5
 
|| 
 
<b>Compilation and Cleaning of Datasets</b><br>  
 
1. Geocoding and Categorization of Schools<br>
 
2. Combine all Demographics datasets (Housing, Age and Ethnic) for analysis by Subzone<br>
 
3. Convert all spatial files to WGS84 for OSM compatibility <br>
 
4. Data Clean, Manipulate and Overlay the Amenities/Spaces with Demographics datasets <br>
 
||
 
Grace <br>
 
Yuing
 
||
 
6
 
||
 
Completed ✔
 
|-
 
|
 
6
 
|| 
 
<b>1st Wiki Content Update</b><br>
 
(Proposal, Methodology & Storyboard)
 
||
 
Yuqing <br>
 
Yuanjing
 
||
 
6
 
||
 
Completed ✔
 
|-
 
|
 
7
 
|| 
 
Independent learning of R and R Shiny on DataCamp
 
||
 
ALL
 
||
 
6 & 7
 
||
 
In Progress ✔
 
|-
 
|
 
8
 
|| 
 
Consultation with Prof Kam for Feedback on tools for Geospatial Analysis
 
||
 
ALL
 
||
 
7
 
||
 
Completed ✔
 
|-
 
|
 
9
 
|| 
 
2nd Wiki Content Update
 
||
 
Yuqing
 
||
 
7
 
||
 
Completed ✔
 
|-
 
|
 
10
 
|| 
 
<b>Map Development </b><br>
 
PART 1: Kernel Density Estimate: Grace <br>
 
PART 2: Hansen Potential Analysis: Grace <br>
 
PART 3: Common Spaces for Solutioning: Yuanjing
 
  
||
+
The 3Ps Visual Design Framework was designed during the course of our R application development. First, we started off with the Purpose, and then we determined the Processes required to achieve it, followed by the R Packages required.
Grace <br>
+
[[File:VA-a.png|center|1000px]]
Yuanjing
 
||
 
7 - 9
 
||
 
  
 
|-
 
|-
 
|
 
|
11
+
<div style="border-left: #56A5EC solid 10px;font-family: Helvetica; padding: 0px 30px 0px 18px; ">
|| 
+
===2. Data Preparation===
<b>Interface Development</b><br>
 
User Interface (Web Layout): Yuqing <br>
 
User Interface (Description) : Yuanjing
 
||
 
Yuqing <br>
 
Yuanjing
 
||
 
8 - 9
 
||
 
  
|-
+
First, aggregated data at the subzone level on race, housing type and age demographics were obtained from data.gov.sg. These data include both public and private housing types. Household-level data on housing type at the HDB postal code level were retrieved from the earlier Centroid-Amenities project by Siew et al. (2018). Besides demographics data, schools and parks data were also obtained from data.gov.sg.
|
 
12
 
||  
 
Consultation with Prof Kam for Feedback on progress/techniques
 
||
 
ALL
 
||
 
9
 
||
 
  
|-
+
Some of the spatial files were supplied in KML format. These were converted to SHP format using QGIS so that they could be imported into R with the sf and rgdal packages. Primary school data were geocoded using the OneMap API to obtain geometry attributes from the postal codes. The data were transformed to WGS84 for OpenStreetMap compatibility and, separately, to SVY21 for the distance-based measures used in the Hansen Accessibility Model. Data were also converted to ppp format for spatstat compatibility in the kernel density analysis.
|
 
13
 
|| 
 
Debugging and Analysis of Results
 
||
 
ALL
 
||
 
9
 
||
 
  
|-
+
For the demographics data, the percentage of each subzone’s total population belonging to each demographic type was calculated. Normalization was performed so that the values were comparable across age, gender and housing type. Data merging and transformation were also done so that the data could be analysed and visualized geospatially in R.
|
 
14
 
|| 
 
Consultation with Prof Kam for Feedback on
 
final product
 
||
 
ALL
 
||
 
10
 
||
 
  
 
|-
 
|-
 
|
 
|
15
+
<div style="border-left: #56A5EC solid 10px;font-family: Helvetica; padding: 0px 30px 0px 18px; ">
|| 
+
===3. Interactivity & Insight Generation ===
Creating and Submission of Poster
 
||
 
Yuqing
 
||
 
11
 
||
 
  
|-
+
For interactivity, the mapedit package was used so that urban planners can draw distance buffers that capture the HDB postal codes falling within a chosen distance of any amenity.
|
+
[[File:Mapedit.png|center|1000px]]
16
 
|| 
 
Uploading of App on Shinyapps.io (Artefact)
 
||
 
Yuanjing
 
||
 
12
 
||
 
  
|-
+
A data table is also displayed for details on demand.
|
+
[[File:DT.png|center|1000px]]
17
 
|| 
 
Finalizing Wiki Page & Research Paper (Deliverables)
 
||
 
ALL
 
||
 
13
 
||
 
 
 
|-
 
|
 
18
 
|| 
 
Poster Presentation / Conference
 
||
 
ALL
 
||
 
14
 
||
 
  
 +
The dashboard also has an interactive toolbar that allows users to switch between the three dimensions of inequality – social segregation, spatial point pattern analysis, and accessibility to amenities. It also allows users to filter by Planning Area, Subzone and Housing Type.
 +
[[File:AB.png|center|1000px]]
 
|-
 
|-
 
|}
 
|}
 +
<div style="border-left: #56A5EC solid 10px;font-family: Helvetica; padding: 0px 30px 0px 18px; ">
 +
===4. Architecture Design Framework  ===
 +
The application was developed using R Shiny, with the following application architecture.
 +
[[File:aa.png|center|1000px]]
 +
<!-- END OF DESIGN ARCHITECTURE--->
  
<!-- END OF MILSTONES--->
 
 
 
<div style="background: #56A5EC; padding: 20px; line-height: 0.3em; text-indent: 16px;letter-spacing:0.1em;font-size:26px"><font color=#fbfcfd face="Bebas Neue">TOOLS & PACKAGES</font></div>
 
<div style="margin:0px; padding: 10px; background: #f2f4f4; font-family: Open Sans, Arial, sans-serif; border-radius: 7px; text-align:left">
 
The following is a list of tools to be adopted for the project’s scope.
 
{| class="wikitable" style="background-color:#FFFFFF;" width="100%"
 
|-
 
|
 
<div style="border-left: #56A5EC solid 10px;font-family: Helvetica; padding: 0px 30px 0px 18px; ">
 
*QGIS
 
*Open Street Map
 
*OneMap Geocode
 
*R Studio
 
*R libraries
 
**shiny
 
**leaflet
 
**rgdal
 
**sf
 
**sp
 
**spatstat
 
**dplyr
 
**plyr
 
**maptools
 
**shinydashboard
 
**spatialsegregation
 
**REAT
 
**SpatialAcc
 
**ggmap
 
**SpatialPosition
 
**shinyBS
 
**shinyjs
 
  
|-
 
|}
 
<!-- END OF TOOLS-->
 
  
  

Revision as of 16:15, 7 August 2018

Exploring Inequality’s Geographic Dimension Across Neighbourhoods in Singapore: Its Driving Forces & Touch Points

INTRODUCTION

Amidst the recent debate over growing social inequality in Singapore, such as the distinct clustering of elite schools and varying access to resources, the danger of declining social mobility is a concern for a culturally diverse nation that has upheld its values of social cohesion and racial harmony. To bridge social divides, Singapore has put in place various community programmes to create shared experiences and promote mixing between communities.

Using geospatial techniques in R, the dashboard explores the geographic dimension of social inequality by mapping the extent of social segregation and accessibility to important spaces across neighbourhoods. This is done in three approaches. First, we analyse whether social segregation exists across subzones using the Entropy-Based Diversity Index, based on three dimensions of inequality: race, age and housing type. Second, using spatial point pattern analysis at the HDB postal code level, we visualise whether there are housing type clusters that could point towards social inequality, and whether certain towns have a disproportionate concentration of a specific housing type. Third, using the Hansen Accessibility Model, we map the available touch points that could facilitate social mixing, particularly the ease of access to primary schools. We also examine whether accessibility to elite and mainstream primary schools varies between housing types, because reasonably fair access to resources is an important aspect of social equality. Lastly, we move into solutioning and explore whether there are sufficient common spaces that allow for social mixing, such as parks, and identify underserved areas where urban planners can focus their attention in future space planning.

MOTIVATION & OBJECTIVES

1. Discover Drivers of Inequality – Race, Age, Housing Type

Our project was motivated by the ongoing debates on social inequality and by a general lack of “hard evidence” for it, especially on its geospatial aspects. Our dashboard aims to equip urban planners with a geospatial tool for visually discovering social inequality across neighborhoods, based on the three drivers of inequality: race, age and housing type.

2. Find Touch Points for Social Mixing – Amenities & Common Spaces

We also map the accessibility of each HDB postal code to the nearest school to see whether some households are more disadvantaged in their access to elite schools than to mainstream schools. More importantly, we go beyond highlighting “pain points” and dive into “solutioning”. We hope to equip planners with the ability to visualize how the spaces (e.g. parks) they have built can serve as important touch points to promote social mixing within and across neighbourhoods.


DATA SOURCES

Our team's datasets are retrieved from https://data.gov.sg

{| class="wikitable"
|-
! Type !! Format !! Data !! Source URL
|-
| Boundary (Polygon) || SHP || OSM Layer (Singapore) || OpenStreetMap
|-
| Boundary (Polygon) || SHP || Master Plan 2014 Subzone Boundary (No Sea) || https://data.gov.sg/dataset/master-plan-2014-subzone-boundary-no-sea
|-
| Demographics || CSV || Estimated Singapore Resident Population in HDB Flats || https://data.gov.sg/dataset/estimated-resident-population-living-in-hdb-flats
|-
| Demographics || CSV || Dwelling Units under HDB's Management, by Town and Flat Type || https://data.gov.sg/dataset/number-of-residential-units-under-hdb-s-management
|-
| Demographics || CSV || Residents by Age Group & Type of Dwelling, Annual || https://data.gov.sg/dataset/residents-by-age-group-type-of-dwelling-annual
|-
| Demographics || CSV || Land Area and Dwelling Units by Town || https://data.gov.sg/dataset/land-area-and-dwelling-units-by-town
|-
| Demographics || SHP || Singapore Residents by Subzone and Type of Dwelling, June 2016 || https://data.gov.sg/dataset/singapore-residents-by-subzone-and-type-of-dwelling-june-2016
|-
| Demographics || SHP || Singapore Residents by Subzone, Age Group and Sex, June 2016 (Gender) || https://data.gov.sg/dataset/singapore-residents-by-subzone-age-group-and-sex-june-2016-gender
|-
| Demographics || SHP || Resident Population of Other Ethnic Groups by Age Group, Ethnic Group and Sex, 2015 || https://data.gov.sg/dataset/resident-population-of-other-ethnic-groups-by-age-group-ethnic-group-and-sex-2015
|-
| Amenities || CSV || Primary Schools || https://data.gov.sg/dataset/school-directory-and-information
|-
| Spaces || SHP || Parks (including playgrounds) || https://data.gov.sg/dataset/parks
|}

ANALYTICAL MAPPING

1. Social Segregation Index at Subzone Level

While studies have been done on racial, age and housing mix, these were typically done at the aggregate level, measuring a subzone’s racial composition against the national average. While informative, this only tells us where different population groups live, not whether they mix with one another, and so provides few actionable insights that urban planners can turn into operations. What is more useful is knowing whether there are sufficient opportunities for citizens of different races, ages and housing types to mix with one another. In spatial demography, measuring the extent of segregation between population groups helps urban planners to understand social cohesion and integration in our society [4].

We thus selected the entropy-based diversity index as the social segregation measure for race, age and housing type. This diversity index goes beyond measuring dissimilarity, isolation and interaction between two population groups (as is common in the literature measuring Black-White segregation), and allows a multi-group measure of segregation [5]. Living away from one another could imply that population groups are segregated [6]. We used the index to measure whether citizens of the four national races experienced complete segregation or no segregation, and did the same for housing type (an indicator of income) and age.
Social Segregation Score
Formula for Entropy-Based Diversity Index
This was then visualized in a choropleth map. Each subzone has its own entropy (diversity) score, in which π_ri denotes the proportion of the population in subzone i that belongs to a particular racial, age or housing type group r.
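As a rough illustration of how such a score could be computed, the sketch below assumes the standard entropy form, E_i = Σ_r π_ri ln(1/π_ri), which may differ in detail from the formula used in the dashboard. The project itself was built in R; Python is used here only to keep the example compact and self-contained, and the function names and subzone counts are hypothetical.

```python
import math


def entropy_score(counts):
    """Entropy (diversity) score of one subzone from its group counts.

    Assumes the standard form E = sum_r p_r * ln(1 / p_r), where p_r is the
    share of group r in the subzone population.
    """
    total = sum(counts)
    if total == 0:
        return 0.0
    score = 0.0
    for count in counts:
        if count > 0:  # a group with zero share contributes nothing
            share = count / total
            score += share * math.log(1.0 / share)
    return score


def normalised_entropy(counts):
    """Entropy score rescaled to [0, 1] by its maximum, ln(number of groups)."""
    groups = len(counts)
    if groups < 2:
        return 0.0
    return entropy_score(counts) / math.log(groups)


# Hypothetical subzones with four groups each (illustrative numbers only).
evenly_mixed = [250, 250, 250, 250]
single_group = [1000, 0, 0, 0]

print(normalised_entropy(evenly_mixed))
print(normalised_entropy(single_group))
```

Under this form, a score of 0 means the subzone consists of a single group, while the maximum, ln(k) for k groups, means every group has an equal share.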

We also plotted a histogram of the diversity indices to show their distribution, which can be filtered by subzone and housing type. For details on demand, we complement the map with a parallel coordinates plot showing the composition of each demographic type across subzones.

2. Spatial Point Pattern Analysis at Postal Code Level

There has been work done on the visualization of demographic data by subzone. By and large, however, these data are analysed at the subzone level rather than at the postal code level. We analyse our data at the postal code level because social inequality is typically defined at the household level, especially since Singapore is a small country where analyzing by subzone is equivalent to assuming different regions develop unequally. However, a point symbol map would not give a good visualization because of the sheer number of postal codes. Instead, we use kernel density estimation, a frequently used technique for determining hotspots of point locations, to easily identify spatial clusters by race, age and housing type.
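The idea behind kernel density estimation can be sketched as follows. This is a minimal illustration that assumes an isotropic Gaussian kernel and a fixed bandwidth; it is not the spatstat routine used in the application, and the coordinates are made up. Python is used only to keep the example self-contained.

```python
import math


def kernel_density(points, x, y, bandwidth):
    """Gaussian kernel estimate of point intensity at location (x, y).

    Each point contributes a 2-D Gaussian bump with standard deviation
    `bandwidth`; the result is in points per unit area.
    """
    norm = 1.0 / (2.0 * math.pi * bandwidth ** 2)
    total = 0.0
    for px, py in points:
        squared_distance = (x - px) ** 2 + (y - py) ** 2
        total += math.exp(-squared_distance / (2.0 * bandwidth ** 2))
    return norm * total


# Hypothetical block locations in metres: a tight cluster plus one outlier.
blocks = [(0, 0), (50, 50), (100, 0), (100, 100), (3000, 3000)]

inside_cluster = kernel_density(blocks, 50, 50, bandwidth=200)
between_groups = kernel_density(blocks, 1500, 1500, bandwidth=200)
print(inside_cluster > between_groups)
```

Evaluating the function over a regular grid gives the surface that a kernel density map displays; a larger bandwidth produces a smoother surface with fewer, broader hotspots.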

This density-based measure is visualized in a kernel density map. To test the pattern statistically and give users a distance-based measure, we support the visualization with Ripley’s K function and a nearest neighbour histogram. The K function estimates the second-order properties of a point process by constructing a circle of radius h around each point event i and counting the number of other events j that fall inside this circle. Under complete spatial randomness (CSR), K(h) = πh², where h is the radius of the circle, so the expected number of other events within distance h of an event is λπh², where λ is the intensity of the process. K(h) > πh² if the point pattern is clustered. In the plot, if the line falls above the confidence band, there is clustering. If it falls within the band, the departure from CSR is not statistically significant.
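The counting procedure behind the K function can be sketched as below. This is a simplified estimator without edge correction, K(h) = A / (n(n − 1)) × (number of ordered pairs within distance h), written in Python purely for illustration; spatstat applies edge corrections and simulation envelopes that are not reproduced here, and the point coordinates are hypothetical.

```python
import math


def ripley_k(points, h, area):
    """Naive Ripley's K estimate at distance h, with no edge correction."""
    n = len(points)
    if n < 2:
        return 0.0
    pairs_within_h = 0
    for i, (xi, yi) in enumerate(points):
        for j, (xj, yj) in enumerate(points):
            if i != j and math.hypot(xi - xj, yi - yj) <= h:
                pairs_within_h += 1
    return area * pairs_within_h / (n * (n - 1))


def csr_k(h):
    """Theoretical K under complete spatial randomness: pi * h^2."""
    return math.pi * h * h


# Hypothetical patterns in a 100 x 100 study area.
clustered = [(10, 10), (11, 10), (10, 11), (11, 11)]
regular = [(25, 25), (75, 25), (25, 75), (75, 75)]

print(ripley_k(clustered, 5, 100 * 100) > csr_k(5))  # above pi*h^2: clustering
print(ripley_k(regular, 5, 100 * 100) < csr_k(5))    # below pi*h^2: dispersion
```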

3. Accessibility to Schools and Parks

Using the Hansen Accessibility Model, we mapped each household’s ease of access to schools, both mainstream and elite. Accessibility refers to the ease of reaching a destination. In our application, the origin zones are the HDB blocks and the destination zones are the amenities. A higher Hansen Accessibility Index means that the HDB block has better access to that particular school type, elite or mainstream.

We also assessed accessibility to parks (which include playgrounds), common spaces that we expect to be easy-to-reach, open platforms for promoting social mixing within neighbourhoods. We wanted to identify underserved areas where urban planners can focus their attention in future space planning.

The Hansen measure calculates a zone's accessibility by adding together the opportunities available in every other zone, each weighted by a function of the difficulty of reaching that zone.
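A minimal sketch of this weighted sum is given below. It assumes a power distance-decay function, A_i = Σ_j O_j × d_ij^(−β), with straight-line distances in a projected coordinate system; the decay form, the value of β, the minimum-distance guard and all coordinates are assumptions for illustration rather than the settings used in the application. Python is used only to keep the example self-contained.

```python
import math


def hansen_accessibility(origin, destinations, beta=2.0, min_distance=1.0):
    """Hansen-style accessibility of one origin to a set of destinations.

    `destinations` holds (x, y, opportunities) tuples. Each destination's
    opportunities are discounted by distance ** -beta. Distances below
    `min_distance` are clamped to avoid division by zero.
    """
    origin_x, origin_y = origin
    score = 0.0
    for dest_x, dest_y, opportunities in destinations:
        distance = math.hypot(origin_x - dest_x, origin_y - dest_y)
        distance = max(distance, min_distance)
        score += opportunities * distance ** (-beta)
    return score


# Hypothetical schools (x, y in metres, opportunities = 1 per school).
schools = [(0, 0, 1), (1000, 0, 1)]

near_block = hansen_accessibility((100, 0), schools)
far_block = hansen_accessibility((5000, 0), schools)
print(near_block > far_block)
```

Computing the score separately for elite and mainstream schools, as hypothetical destination sets, would give one index per school type for each HDB block.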


DESIGN ARCHITECTURE

1. Visual Design Framework

The 3Ps Visual Design Framework was designed during the course of our R application development. First, we started off with the Purpose, and then we determined the Processes required to achieve it, followed by the R Packages required.

VA-a.png

2. Data Preparation

First, aggregated data at the subzone level on race, housing type and age demographics were obtained from data.gov.sg. These data include both public and private housing types. Household-level data on housing type at the HDB postal code level were retrieved from the earlier Centroid-Amenities project by Siew et al. (2018). Besides demographics data, schools and parks data were also obtained from data.gov.sg.

Some of the spatial files were supplied in KML format. These were converted to SHP format using QGIS so that they could be imported into R with the sf and rgdal packages. Primary school data were geocoded using the OneMap API to obtain geometry attributes from the postal codes. The data were transformed to WGS84 for OpenStreetMap compatibility and, separately, to SVY21 for the distance-based measures used in the Hansen Accessibility Model. Data were also converted to ppp format for spatstat compatibility in the kernel density analysis.

For the demographics data, the percentage of each subzone’s total population belonging to each demographic type was calculated. Normalization was performed so that the values were comparable across age, gender and housing type. Data merging and transformation were also done so that the data could be analysed and visualized geospatially in R.

3. Interactivity & Insight Generation

For interactivity, the mapedit package was used so that urban planners can draw distance buffers that capture the HDB postal codes falling within a chosen distance of any amenity.

Mapedit.png
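The buffer query itself amounts to a distance filter. The sketch below is a simplified, hypothetical stand-in for what happens after a buffer is drawn: it assumes a circular buffer and coordinates already projected to metres (as with SVY21), and the postal codes and positions are invented. It is written in Python for illustration and does not use mapedit.

```python
import math


def blocks_within_buffer(amenity, blocks, radius_m):
    """Return postal codes of blocks lying within radius_m of an amenity.

    `amenity` is an (x, y) pair and `blocks` holds (postal_code, x, y)
    tuples, all in the same projected coordinate system (metres).
    """
    amenity_x, amenity_y = amenity
    return [
        postal_code
        for postal_code, x, y in blocks
        if math.hypot(x - amenity_x, y - amenity_y) <= radius_m
    ]


# Hypothetical HDB blocks around a park located at the origin.
hdb_blocks = [("A", 0, 300), ("B", 400, 0), ("C", 500, 500)]

print(blocks_within_buffer((0, 0), hdb_blocks, radius_m=400))
```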

A data table is also displayed for details on demand.

DT.png

The dashboard also has an interactive toolbar that allows users to switch between the three dimensions of inequality – social segregation, spatial point pattern analysis, and accessibility to amenities. It also allows users to filter by Planning Area, Subzone and Housing Type.

AB.png

4. Architecture Design Framework

The application was developed using R Shiny, with the following application architecture.

Aa.png



REFERENCES TO RELATED WORKS

1. We would like to credit the following referenced visualization works adopted in the design of our dashboard.

Centroid-Amenities: An Interactive Visual Analytical Tool for Exploring and Analysing Amenities in Singapore
Authors: Siew Xue Qian Jazreel, Tay Wei Xuan, Sean Koh Jia Ming
https://jazreelsiew.shinyapps.io/AppV2/

Map3.png


2. The following also lists the referenced literature considered in the design of our problem statement.

{| class="wikitable"
|-
! No !! Title !! Link !! Author
|-
| 1 || Commentary: Inequality has a geographic dimension - between and within neighbourhoods in Singapore || https://www.channelnewsasia.com/news/commentary/inequality-in-singapore-exists-across-within-neighbourhoods-10276898 || Leong Chan-Hoong
|-
| 2 || Lack of social mixing is a symptom of inequality, not a cause || https://www.straitstimes.com/opinion/lack-of-social-mixing-is-a-symptom-of-inequality-not-a-cause#main-content || Teo You Yenn
|-
| 3 || Class divide: Singapore in danger of becoming academic aristocracy || https://www.straitstimes.com/opinion/class-divide-singapore-in-danger-of-becoming-academic-aristocracy || Chua Mui Hoong
|-
| 4 || The Big Read: Social stratification — a poison seeping into S’pore’s housing estates and schools || https://www.todayonline.com/singapore/big-read-social-stratification-poison-seeping-spores-housing-estates-and-schools || Kelly Ng and Toh Ee Ming
|-
| 5 || COMMENT: Can Singapore's elite circle turn around growing social divide? || https://sg.news.yahoo.com/comment-can-singapores-elite-circle-turn-around-growing-divide-124724650.html || Nicholas Yong
|-
| 6 || This Is What Inequality Looks Like || Ethos Books (Publisher) || Teo You Yenn
|}