Difference between revisions of "SMT201 AY2018-19T1 EX1 Lim Jia Khee"

From Geospatial Analytics for Urban Planning

Revision as of 21:33, 15 September 2019

Part One: Thematic Mapping

Distribution of Public Education Institutions in Singapore

Q1p1.jpeg

My classification choice for this map is the 'Categorized' approach, using 'mainlevel_' as the classification variable. I differentiated the classes by colour, with a fixed symbol size. I did not manipulate the categories within the chosen variable, as the values in the column were already well sorted with no missing values. I included an OpenStreetMap background because it gives a macro understanding of the distribution of the various institution types at a quick glance.

Hierarchy of Road Network System in Singapore

Q1p2.jpeg

In the raw 'Road Section Line' dataset downloaded off data.gov, the road names were not categorised. Using an article (link in reference [5]), I sorted Singapore's road names into five major categories. I then used the 'Categorized' approach, plotting with my newly created variable. I differentiated the categories through colour and line width, with Expressway roads being the thickest and 'Others' roads being the thinnest. I also assigned brighter colours to the Expressway and Major Road types so that they are more obvious to the reader at a glance.
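The suffix-based sorting described above can be sketched in Python. This is only an illustration under stated assumptions: the write-up names just the 'Expressway', 'Major Road' and 'Others' categories, so the other two category names and the whole suffix table below are hypothetical, as is the function name `classify_road`.

```python
# Hypothetical suffix-to-category table. Only "Expressway", "Major Road"
# and "Others" are category names mentioned in the write-up; "Minor Road"
# and "Local Access" are placeholders for the two unnamed categories.
SUFFIX_CATEGORIES = {
    "expressway": "Expressway",
    "highway": "Major Road",
    "avenue": "Major Road",
    "boulevard": "Major Road",
    "road": "Minor Road",
    "street": "Minor Road",
    "drive": "Local Access",
    "lane": "Local Access",
    "close": "Local Access",
}


def classify_road(name: str) -> str:
    """Return a category for a road name based on the last known suffix word."""
    words = name.strip().lower().split()
    # Scan from the end so names with a trailing number
    # (e.g. "Jurong West Avenue 1") still match on "avenue".
    for word in reversed(words):
        if word in SUFFIX_CATEGORIES:
            return SUFFIX_CATEGORIES[word]
    # Anything without a recognised suffix falls into the catch-all class.
    return "Others"
```

The returned label could then be stored in a new attribute column and used as the classification variable for the map.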

2014 Master Plan Landuse

Q1p3.jpeg

In plotting this polygon qualitative thematic map, the variable I used for classification is “Lu_Desc”. I did not reduce the number of categories when plotting for three reasons. Firstly, “Lu_Desc” is a categorical variable, not a numerical one, so unlike with numerical variables there is no need to consider its distribution. Secondly, there are no missing values within the “Lu_Desc” variable. Lastly, without understanding the context behind each category, it is very difficult to merge or condense categories without losing the meaning of each individual category.

Part Two: Choropleth Mapping

Singapore's Aged Population (65+) in 2010 and 2017

Q2p1.jpeg
Q2p3.jpeg

The 2010 and 2017 choropleth maps appear to have very similar spatial patterns, with the highest numbers of elderly residents (65+) found in the Tampines, Bedok, Serangoon and Bishan new towns. One alarming trend observed from both maps is that the absolute number of elderly residents has increased across all new towns. Residents in Singapore appear to be living to an older age.


Proportion of Singapore's Aged Population (65+) in 2010 and 2017

Q2p2.jpeg
Q2p4.jpeg




Percentage Change of Aged Population From 2010 to 2017

Q2p5.jpeg
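The page does not state the formulas behind the proportion and percentage-change maps. A minimal sketch, assuming the usual definitions (aged residents as a share of all residents, and change relative to the 2010 count), could look like this; the function names are hypothetical.

```python
def aged_proportion(aged, total):
    """Aged residents as a percentage of all residents in a subzone."""
    if not total:
        # Subzones with no residents have no meaningful proportion.
        return None
    return aged / total * 100.0


def percent_change(old, new):
    """Percentage change from the old count to the new count."""
    if not old:
        # Change relative to zero is undefined, so report it as missing.
        return None
    return (new - old) / old * 100.0
```

Returning `None` for a zero denominator keeps such subzones out of the numeric classes instead of producing a division error.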




Reasoning behind classification choices, new derived variables, missing values and assumptions

I decided to work with Singapore's Aged Population (65+) for the years 2010 and 2017, as I could not find any dataset on data.gov covering Singapore's residents by age group and gender for 2018. The next best alternative is the dataset of Singapore's residents by age group and gender for 2017. The dataset was also only available in .kml format, as no .shp or .csv files were provided. The first assumption I made in this analysis is that the 2017 population data is largely similar to the 2018 population data, which keeps my analysis relevant.

When handling the 2017 residents by age group and gender .kml file, all age-related columns had to be changed from string type to double type, which I did with the Refactor fields tool. However, there were 9 rows that I could not convert to double type. The error was due to "ring self-intersection" of the polygons. Lacking the technical expertise to repair them, I decided to omit these 9 subzones from the analysis, so the 2017 map has 314 subzones to work with instead of 323. I had no such problems with the 2010 residents by age group and gender data set, as the fields required for aggregation were already of double type.
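The string-to-double step, and the bookkeeping of omitted rows, can be sketched outside QGIS in plain Python. This is an illustration only: it does not reproduce the Refactor fields tool or the "ring self-intersection" geometry error, and the field name used in the usage note is hypothetical.

```python
def to_double(value):
    """Return value as a float, or None if it cannot be parsed."""
    if value is None:
        return None
    # Tolerate surrounding spaces and thousands separators such as "1,200".
    text = str(value).strip().replace(",", "")
    if text == "":
        return None
    try:
        return float(text)
    except ValueError:
        return None


def convert_rows(rows, fields):
    """Convert the given fields to float; return (converted, failed) rows."""
    converted, failed = [], []
    for row in rows:
        new_row = dict(row)  # copy so the input row is left unchanged
        ok = True
        for field in fields:
            number = to_double(row.get(field))
            if number is None:
                ok = False
                break
            new_row[field] = number
        if ok:
            converted.append(new_row)
        else:
            # Keep the original row so omitted subzones can be reported.
            failed.append(row)
    return converted, failed
```

Calling `convert_rows(rows, ["AGE_65_69"])` on a list of attribute dictionaries returns the usable rows together with the rows that had to be left out, which makes a count such as 314 out of 323 easy to verify.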

References and data sources

1. https://www.data.gov.sg/search?q=school+information
2. https://www.data.gov.sg/dataset/master-plan-2014-land-use
3. https://www.data.gov.sg/dataset/master-plan-2014-subzone-boundary-no-sea
4. https://www.data.gov.sg/dataset/master-plan-2008-subzone-boundary-no-sea
5. https://www.remembersingapore.org/2018/08/15/singapore-street-suffixes/
6. SMT201 elearn - Week 4 Hands on Ex 4 - Coastal Outline Shapefiles from SLA
7. https://www.mytransport.sg/content/mytransport/home/dataMall/search_datasets.html?searchText=road - Road Section Line
8. https://www.data.gov.sg/dataset/singapore-residents-by-subzone-age-group-and-sex-jun-2017-gender
9. https://www.data.gov.sg/dataset/singapore-residents-by-subzone-age-group-and-sex-june-2010-gender