Difference between revisions of "Sunny Singapore"

From Visual Analytics for Business Intelligence
 
(40 intermediate revisions by 2 users not shown)
Line 1: Line 1:
[[File:Sunny_singapore2.jpg|300px|frameless|center]]
+
[[File:Sunny Singapore logo.png|300px|frameless|center]]
  
 
<!--Header-->
 
<!--Header-->
Line 27: Line 27:
 
==<div style="background: #FCF4A3; padding: 15px; line-height: 0.3em; text-indent: 15px; font-size:18px; font-family:Helvetica"><font color= #000000>Introduction</font></div>==
 
==<div style="background: #FCF4A3; padding: 15px; line-height: 0.3em; text-indent: 15px; font-size:18px; font-family:Helvetica"><font color= #000000>Introduction</font></div>==
 
<div style="font-family:Helvetica;font-size:16px">
 
<div style="font-family:Helvetica;font-size:16px">
Singapore is a leading economy in its region, but an astonishing number of its citizens fall below the ''first-world poverty line''.
 
  
First-world poverty is a new concept to many, as it represents a group of citizens who are earning less than sufficient to cover the cost of living of their country of residence. For the fifth consecutive year, Singapore has held to its number one position as the most expensive city to live in. Although welfare is extensive in Singapore, it is definitely not exhaustive. Thus, this has become our main source of motivation for this project.
+
As the jewel of Southeast Asia, Singapore is a chart topper for many global, accredited rankings. However, these prominent awards narrowly focused on the nation’s economic development, technological infrastructure and overall prosperity. In fact, much less emphasis was placed on Singapore’s real and present problems – a struggling middle class, isolated social class and undefined first-world poverty. This passion project seeks to unearth the realities by designing an intuitive application that provides straightforward visualisations of key trends and statistics of Singapore.
 
 
We seek to develop a tool that is easy to use, analyse and act on because we strongly believe that helping our communities should not be limited to the efforts of the government. We aim to design a platform where users can recognise the less-privileged areas and understand intuitively the type of support required. As such, any citizen, committee or organisation can utilise this resource to lend a helping hand immediately and effectively.
 
  
 
==<div style="background: #FCF4A3; padding: 15px; line-height: 0.3em; text-indent: 15px; font-size:18px; font-family:Helvetica"><font color= #000000>Problem and Motivation</font></div>==
 
==<div style="background: #FCF4A3; padding: 15px; line-height: 0.3em; text-indent: 15px; font-size:18px; font-family:Helvetica"><font color= #000000>Problem and Motivation</font></div>==
 
<div style="font-family:Helvetica;font-size:16px">
 
<div style="font-family:Helvetica;font-size:16px">
To build a dashboard that allows for:
+
Although Singapore has its own government statistical office and multiple websites such as SingStat and data.gov.sg, most of the data available is published as Excel spreadsheets, which are hard for the general public to understand and draw insights from. Hence, we are motivated to build a more user-friendly visualisation tool that allows everyone to quickly identify patterns and insights about Singapore's socioeconomic situation.
 +
 
  
* Profiling of neighbourhoods in Singapore by attributes: income, job specification, transportation, housing and qualifications
 
* Other sub-attributes to provide for analytical context: age, race/religion, expenditure, marital status, political views
 
* Infographic on First-World Poverty
 
* General guidelines on the support type for various helpgroups
 
  
 
==<div style="background: #FCF4A3; padding: 15px; line-height: 0.3em; text-indent: 15px; font-size:18px; font-family:Helvetica"><font color= #000000>Objectives</font></div>==
 
==<div style="background: #FCF4A3; padding: 15px; line-height: 0.3em; text-indent: 15px; font-size:18px; font-family:Helvetica"><font color= #000000>Objectives</font></div>==
  
 
<div style="font-family:Helvetica;font-size:16px">
 
<div style="font-family:Helvetica;font-size:16px">
This project aims to provide insights into the following:
+
In this project, we are creating a visualisation dashboard that lets users explore different aspects of Singapore:
 +
* Economic situation and demographics of different planning areas
 +
* Income inequality and wealth distribution
 +
* The standard of living of Singapore residents through:
 +
** Highest qualification achieved
 +
** Marital status
 +
** Choice of transportation
 +
** Accommodation situation
  
# Income data by geography
+
==<div style="background: #FCF4A3; padding: 15px; line-height: 0.3em; text-indent: 15px; font-size:18px; font-family:Helvetica"><font color= #000000>Datasets</font></div>==
# Socioeconomic status by income
+
<p>
# Support types categorised by socioeconomic situation
+
These are the datasets we plan to use:
# Scalable system to incorporate future data
+
</p>
 +
{| class="wikitable" style="background-color:#FFFFFF;" width="100%"
 +
|-
 +
! style="font-weight: bold;background: #FFFFFF;color:#000000;width: 25%;" | Dataset
 +
! style="font-weight: bold;background: #FFFFFF;color:#000000;" | Rationale
 +
|-
 +
| <center> Map of Planning Areas in Singapore </center> ||
 +
* A dataset containing SHP files of the administrative boundaries of Singapore
 +
* Used as a reference to digitize Singapore planning areas
 +
* https://data.gov.sg/dataset/master-plan-2014-planning-area-boundary-no-sea?resource_id=c8185fd3-3c78-48c8-94bc-a957699b4e92
 +
|-
 +
| <center> Population distribution in Singapore by age, sex and planning areas </center> ||
 +
* A dataset containing the total number of residents by age and sex in each planning area in Singapore
 +
* Used to visualize the demographic at each planning area
 +
* https://data.gov.sg/dataset/singapore-residents-by-planning-area-subzone-age-group-and-sex
 +
|-
 +
| <center> Resident households in Singapore by household size and planning areas </center> ||
 +
* A dataset containing the total number of households in each planning area by household size
 +
* Used to visualize the population demographic and living standard in Singapore
 +
* https://data.gov.sg/dataset/resident-households-by-planning-area-and-household-size-2015
 +
|-
 +
| <center> Population in Singapore by sex, economy status and planning areas  </center> ||
 +
* A dataset containing the total number of people by economic status and sex at each planning area
 +
* Used to visualize the population demographic and living standard in Singapore
 +
* https://data.gov.sg/dataset/resident-population-aged-15-years-and-over-by-planning-area-economic-status-and-sex-2015
 +
|-
 +
| <center> Population in Singapore by sex, marital status and planning areas  </center> ||
 +
* A dataset containing the total number of people by marital status and sex at each planning area
 +
* Used to visualize the population demographic and living standard in Singapore
 +
* https://data.gov.sg/dataset/resident-population-aged-15-years-and-over-by-planning-area-marital-status-and-sex-2015
 +
|-
 +
| <center> Working residents in Singapore by industry and planning areas  </center> ||
 +
* A dataset containing the total number of people working in each industry in each planning area
 +
* Used to visualize the population demographic
 +
* https://data.gov.sg/dataset/resident-working-persons-aged-15-years-and-over-by-planning-area-and-industry-2015
 +
|-
 +
| <center> Working residents in Singapore by monthly income and planning areas  </center> ||
 +
* A dataset containing the total number of people within each income range at each planning area
 +
* Used to visualize the population wealth distribution
 +
* https://data.gov.sg/dataset/resident-working-persons-aged-15-years-over-by-planning-area-gross-monthly-income-from-work-2015
 +
|-
 +
| <center> Working residents in Singapore by occupation and planning areas  </center> ||
 +
* A dataset containing the total number of people working in some occupation group at each planning area
 +
* Used to visualize the population socioeconomic situation
 +
* https://data.gov.sg/dataset/resident-working-persons-aged-15-years-and-over-by-planning-area-and-occupation-2015
 +
|-
 +
| <center> Highest qualification achieved by Singapore resident by planning area  </center> ||
 +
* A dataset containing the total number of people at each different qualification level at each planning area
 +
* Used to visualize the population socioeconomic situation
 +
* https://data.gov.sg/dataset/resident-population-aged-15-years-and-over-by-planning-area-and-highest-qualification-attained-2015
 +
|-
 +
| <center> Resident Households by Planning Area and Type of Dwelling  </center> ||
 +
* A dataset containing the total number of households in each type of dwelling in each planning area
 +
* Used to visualize the population socioeconomic situation
 +
* https://data.gov.sg/dataset/resident-households-by-planning-area-n-type-of-dwelling-2015
 +
|-
 +
| <center> Resident Working Persons by Planning Area and Usual Mode of Transport to Work  </center> ||
 +
* A dataset containing the total number of people with different transportation choice at each planning area
 +
* Used to visualize the population socioeconomic situation
 +
* https://data.gov.sg/dataset/resident-working-persons-aged-15-yrs-n-over-by-planning-area-n-usual-mode-of-transport-to-work-2015
 +
|-
 +
|}
 +
 
 +
==<div style="background: #FCF4A3; padding: 15px; line-height: 0.3em; text-indent: 15px; font-size:18px; font-family:Helvetica"><font color= #000000>Proposed Storyboard</font></div>==
  
==<div style="background: #FCF4A3; padding: 15px; line-height: 0.3em; text-indent: 15px; font-size:18px; font-family:Helvetica"><font color= #000000>Background Survey of Related Works</font></div>==
+
===#1: Introduction page===
''to be updated''
+
To provide background story, problem and motivation of this project
 +
 
 +
===#2: Economic Overview===
 +
* Economic Health
 +
** Economic status of different planning areas
 +
** Breakdown of economic status in each planning area by gender
 +
** Dependency ratio of different planning areas
 +
* Economic Sector
 +
** Distribution of working residents across industries
 +
** Distribution of working residents across sectors of the service industry
 +
** Breakdown of employment by industry in each planning area
 +
** Breakdown of service-industry employment by sector in each planning area
 +
** Distribution of working residents across occupation groups
 +
** Breakdown of occupation groups by planning area
  
==<div style="background: #FCF4A3; padding: 15px; line-height: 0.3em; text-indent: 15px; font-size:18px; font-family:Helvetica"><font color= #000000>Proposed Storyboard</font></div>==
+
===#3: Socioeconomic Overview===
 +
* Income Statistics
 +
** Wealth distribution of each planning area across income ranges
 +
** Breakdown of monthly income in each planning area
 +
* Education Statistics
 +
** Highest qualification achieved by residents in each planning area
  
===#1: Title Screen===
+
===#4: Quality of life Overview===
The title screen indicates the project objectives that the data visualisation tool seeks to achieve on the analysis of IFC Taiwan. As the project focuses on Taiwan branches, an image of Taipei 101 was used as a landing page.
+
* Housing distribution
<br>The screens are implemented in a form of single-page website design, where each screen occupies the full screen and is navigated through scrolling action.
+
** Type of dwelling at different planning areas
 +
** Breakdown of housing type at each planning area
 +
** Distribution of different household size around Singapore
 +
** Percentage of each household size at different planning areas
 +
*Transport Trends
 +
** Breakdown of transportation choice at each planning area
 +
* Relationship
 +
** Breakdown of marital status by gender and planning area
  
===#2: Geographical overview===
+
==<div style="background: #FCF4A3; padding: 15px; line-height: 0.3em; text-indent: 15px; font-size:18px; font-family:Helvetica"><font color= #000000>Background Survey of Related Works</font></div>==
The overview will allow the user to see all respective branches in the map. There will be an option for modes of view e.g (relative sales performance), which builds a thematic map.  Hovering or clicking on any branch will allow for a tooltip that displays the information corresponding to the mode.
 
  
===#3: Sales Overview===
+
There are multiple visualizations around the world that aim to uncover the poverty situation in different countries. Although there are not many visualizations about the situation in Singapore, we were able to find a few visualizations, from the US and elsewhere, to draw inspiration from:
This storyboard will provide visualizations for us to quickly identify top branches with high monthly sales. Upon selecting a branch, the monthly sales performance change across the years could be displayed using line graphs. It shows the overall monthly and yearly sales performance of all outlets using bar charts.
 
  
===#4: Key findings and conclusion===
+
* Median Age of US Counties in 2018 (https://www.census.gov/library/visualizations/2019/comm/median-age.html)
The key findings and conclusion page display the insights that have been gathered from the visualisation tool, which aligns with the objectives of the project. The background of the page signifies the importance of tourist attractions in the selection of new outlets, which plays a big role in maximising the yield for an outlet.
+
* List of popular graph/ visualizations (https://datavizproject.com/)
 +
* Visualizing Singapore (https://www.vslashr.com/2013/10/visualizing-singapore/)
 +
* Visualization on data.gov.sg (https://data.gov.sg/)
 +
* Singapore: Distribution by age and gender (https://viz.sg/viz/map_age_gender/)
 +
* Singapore Demographic Visualization(https://rstudio-pubs-static.s3.amazonaws.com/281811_58c03b274d7946f99f43c616726fa243.html)
  
 
==<div style="background: #FCF4A3; padding: 15px; line-height: 0.3em; text-indent: 15px; font-size:18px; font-family:Helvetica"><font color= #000000>Tools and Libraries</font></div>==
 
==<div style="background: #FCF4A3; padding: 15px; line-height: 0.3em; text-indent: 15px; font-size:18px; font-family:Helvetica"><font color= #000000>Tools and Libraries</font></div>==
 
<div style="font-family:Helvetica;font-size:16px">
 
<div style="font-family:Helvetica;font-size:16px">
 
*Microsoft Excel
 
*Microsoft Excel
*R Studio
+
*R Markdown
*Tableau
+
*R Shiny
 
*Google Drive
 
*Google Drive
  
 
</div>
 
</div>
  
==<div style="background: #FCF4A3; padding: 15px; line-height: 0.3em; text-indent: 15px; font-size:18px; font-family:Helvetica"><font color= #000000>Datasets</font></div>==
+
==<div style="background: #FCF4A3; padding: 15px; line-height: 0.3em; text-indent: 15px; font-size:18px; font-family:Helvetica"><font color= #000000>Foreseen Technical Challenges</font></div>==
<p>
+
We encountered the following technical challenges over the course of the project. The table below shows how we overcame them.
These are the datasets we plan to use:
 
</p>
 
{| class="wikitable" style="background-color:#FFFFFF;" width="100%"
 
|-
 
! style="font-weight: bold;background: #000000;color:#fbfcfd;width: 50%;" | Dataset
 
! style="font-weight: bold;background: #000000;color:#fbfcfd;" | Rationale
 
|-
 
| <center> Administrative Boundaries, Taiwan </center> ||
 
* A dataset containing SHP files of the administrative boundaries of taiwan (county, town, village)
 
* Used as a reference to digitize IFC branch trade areas
 
|-
 
| <center> Branch location of IFC, Taiwan </center> ||
 
* A dataset containing the geographical information of each individual branch.
 
* Used as the main target of our project
 
|-
 
| <center> Point of Interests , Taiwan </center> ||
 
* A dataset containing each individual Point-Of-Interests in Taiwan (e.g. ATMs, Amusement Parks, Banks)
 
* Used as features for analysis with regards to each branch
 
|-
 
| <center> Outlets Monthly Sales Data </center> ||
 
* A dataset containing the monthly sales information of each individual branch
 
* Used to study the sales data along with the profile of each branch to generate yielding patterns (e.g. top and bottom performer)
 
|-
 
|}
 
  
==<div style="background: #FCF4A3; padding: 15px; line-height: 0.3em; text-indent: 15px; font-size:18px; font-family:Helvetica"><font color= #000000>Foreseen Technical Challenges</font></div>==
+
{| class="wikitable"
We encountered the following technical challenges throughout the course of the project. We have indicated our proposed solutions, and the outcomes of the solutions.
 
{| class="wikitable" style="background-color:#FFFFFF;" width="100%"
 
 
|-
 
|-
! style="font-weight: bold;background: #000000;color:#fbfcfd;width: 33%;" | Key Technical Challenges
+
! Technical problem !! Solution
! style="font-weight: bold;background: #000000;color:#fbfcfd;width: 33%;" | Proposed Solution
 
! style="font-weight: bold;background: #000000;color:#fbfcfd;width: 33%;" | Outcome
 
 
|-
 
|-
| <center> Data is already pre-aggregated to display monthly sales  </center>
+
| Do not know how to create pie chart/ box plot using ggplot
 
||  
 
||  
*The dataset is given directly to us from IFC, and we are unable to change it. Thus, We shall utilize and do our best with the available data.
+
* Search on Google for how to make the plot
||
+
* Create the basic plot using R markdown to check
NA
+
* Visualize the improvement we want to make for each graph
 +
* Search on Google on how to do that
 +
* Test the graph in R Markdown before applying it to R Shiny
 
|-
 
|-
| <center> Unfamiliarity in R Shiny </center>
+
| Do not know how to use reactive function to create reactive dataframe
||
 
* Watching video tutorials about R Shiny
 
* Independent learning on the design and syntax
 
* Peer learning and sharing
 
* Using Datacamp as our mentor
 
 
||
 
||
We managed to start using the packages quickly and suit our own project needs.
+
* Find multiple examples on Google to understand the logic and how different people approach it
Each of us work on different parts such as setting up, designing, logic and deployment.
+
* Write down how we want the code to behave based on that logic
This speeds up our project progress.
+
* Modify the code accordingly
 
 
 
|-
 
|-
| <center> Data Cleaning & Transformation Proposed Solution </center>
+
| Do not know how to design the User Interface on R shiny
 
||  
 
||  
*Having a systematic process while working together in order to maximise efficiency e.g. taking turns to clean, transform and perform checks on the data to ensure accuracy
+
* Visualize what we want the app to look like
||
+
* Search on Google on how to make it work
The adopted process was having clear instructions issued to each member in the team, along with maintaining constant communication with each other. In the event that the dataset is deemed too dirty to be usable, it was dropped along with sourcing for new data that would be a suitable replacement.
+
* Test different versions and iterations until we are all satisfied with the look
 
|-
 
|-
| <center> Lack of geospatial knowledge to understand the dataset initially </center>
+
| Do not know how to deploy the app
||
 
*Attend SMT201 class to learn more, as well as reading up on resources given by Prof Kam to gain further contextual knowledge
 
 
||
 
||
NA
+
* Try to follow the instructions on shinyapps.io
|-
+
* Ask our classmates whether they had the same problem and how they dealt with it
| <center> Digitising of trade areas from powerpoint slide to QGIS </center>
+
* Create a new app and copy all the code/ data/ picture over
||
 
*The process is manual and we had to put in a lot of effort to convert the drawn polygon to data points in QGIS.
 
||
 
The data points can better allow us to generate insights on the profile of each outlet via its trade area.
 
|-
 
| <center> Integrating Relevant Data from Multiple Sources Proposed Solution </center>
 
||
 
*Working together to decide on what data to extract or eliminate
 
||
 
NA
 
|-
 
| <center> Determining the Most Effective Ways in Visualising the Data </center>
 
||
 
*Gain exposure to various forms of data visualisations - revisit course materials, assess existing libraries to gain inspirations.
 
||
 
NA
 
 
|}
 
|}
 +
 +
==<div style="background: #FCF4A3; padding: 15px; line-height: 0.3em; text-indent: 15px; font-size:18px; font-family:Helvetica"><font color= #000000>Data Analysis and Transformation</font></div>==
 +
 +
Most of the datasets need to follow the same data preparation steps:
 +
* Rename the columns so that they make more sense
 +
* Remove redundant columns
 +
* Remove aggregated rows and columns
 +
* Filter or create additional attributes (if needed)
 +
* Capitalize the planning area names and join with the Singapore map (for map visualization)
 +
* Aggregate the data based on the chosen categories
 +
* Find the percentage
 +
* Spread or gather the dataset depending on the choice of visualization
 +
* Create a reactive dataset (if needed)
 +
 +
All of these steps are developed and tested in R Markdown before being used in R Shiny.
 +
 +
==<div style="background: #FCF4A3; padding: 15px; line-height: 0.3em; text-indent: 15px; font-size:18px; font-family:Helvetica"><font color= #000000>Proposed Visualizations and Storyboard</font></div>==
 +
 +
* Choropleth map with a filter to see the distribution of different attributes over Singapore
 +
* Stacked percentage bar chart to show the percentage of each attribute in each planning area
 +
* Pie chart to show the proportion of different attributes in each planning area
 +
* Boxplot to show the distribution of residents
 +
* Data table of each dataset in case the user wants to find out more
 +
* Statistical summary generated by R to provide additional insight about each dataset
  
 
==<div style="background: #FCF4A3; padding: 15px; line-height: 0.3em; text-indent: 15px; font-size:18px; font-family:Helvetica"><font color= #000000>Project Timeline</font></div>==
 
==<div style="background: #FCF4A3; padding: 15px; line-height: 0.3em; text-indent: 15px; font-size:18px; font-family:Helvetica"><font color= #000000>Project Timeline</font></div>==
 
'''Week 8''': Complete detailed project proposal and gather datasets ''supervised by Alexia''<br/>
 
'''Week 8''': Complete detailed project proposal and gather datasets ''supervised by Alexia''<br/>
'''Week 9''': Clean datasets ''supervised by Pham''<br/>
+
'''Week 9''': Clean datasets ''supervised by Chau''<br/>
'''Week 10''': Create data visualisation & consult on quality of work ''supervised by Parth'' <br/>
+
'''Week 10''': Create data visualisation & consult on quality of work ''supervised by Chau and Parth'' <br/>
 
'''Week 11''': Finalise storyboard ''teamwork with help of professor!''<br/>
 
'''Week 11''': Finalise storyboard ''teamwork with help of professor!''<br/>
'''Week 12''': Get ready for deadlines whoop! <br/>
+
'''Week 12''': Get ready for deadlines whoop! Beautify the dashboard ''supervised by Alexia and Chau''<br/>
 +
'''Week 13''': Finalize the user guide ''supervised by Parth''<br/>
 +
'''Week 14''': Finalize the research paper and proposal ''supervised by Alexia and Parth''<br/>
  
 
==<div style="background: #FCF4A3; padding: 15px; line-height: 0.3em; text-indent: 15px; font-size:18px; font-family:Helvetica"><font color= #000000>References</font></div>==
 
==<div style="background: #FCF4A3; padding: 15px; line-height: 0.3em; text-indent: 15px; font-size:18px; font-family:Helvetica"><font color= #000000>References</font></div>==
*Tableau: https://www.tableau.com/learn/training
+
* https://shiny.rstudio.com/tutorial/
*R Shiny: https://shiny.rstudio.com/tutorial/
+
* https://shiny.rstudio.com/tutorial/written-tutorial/lesson2/
 
+
* http://t-redactyl.io/blog/2016/01/creating-plots-in-r-using-ggplot2-part-4-stacked-bar-plots.html
==<div style="background: #FCF4A3; padding: 15px; line-height: 0.3em; text-indent: 15px; font-size:18px; font-family:Helvetica"><font color= #000000>Ideation Drafts</font></div>==
+
* https://plot.ly/r/pie-charts/
''To be updated''
+
* http://www.sthda.com/english/wiki/ggplot2-pie-chart-quick-start-guide-r-software-and-data-visualization
 +
* https://geocompr.robinlovelace.net/adv-map.html
 +
* https://shiny.rstudio.com/articles/html-tags.html
 +
* https://stackoverflow.com/questions/15282580/how-to-generate-a-number-of-most-distinctive-colors-in-r
 +
* https://stackoverflow.com/questions/14718203/removing-particular-character-in-a-column-in-r
 +
* https://stackoverflow.com/questions/36325154/how-to-choose-variable-to-display-in-tooltip-when-using-ggplotly
 +
* https://stackoverflow.com/questions/16184188/ggplot-facet-piechart-placing-text-in-the-middle-of-pie-chart-slices/22804400#22804400
 +
* https://stackoverflow.com/questions/21996887/embedding-image-in-shiny-app
 +
* https://shiny.rstudio.com/tutorial/written-tutorial/lesson6/
 +
* http://www.sthda.com/english/wiki/ggplot2-box-plot-quick-start-guide-r-software-and-data-visualization
  
 
==<div style="background: #FCF4A3; padding: 15px; line-height: 0.3em; text-indent: 15px; font-size:18px; font-family:Helvetica"><font color= #000000>Comments</font></div>==
 
==<div style="background: #FCF4A3; padding: 15px; line-height: 0.3em; text-indent: 15px; font-size:18px; font-family:Helvetica"><font color= #000000>Comments</font></div>==

Latest revision as of 17:22, 24 November 2019


Introduction

As the jewel of Southeast Asia, Singapore is a chart topper for many global, accredited rankings. However, these prominent awards narrowly focused on the nation’s economic development, technological infrastructure and overall prosperity. In fact, much less emphasis was placed on Singapore’s real and present problems – a struggling middle class, isolated social class and undefined first-world poverty. This passion project seeks to unearth the realities by designing an intuitive application that provides straightforward visualisations of key trends and statistics of Singapore.

Problem and Motivation

Although Singapore has its own government statistical office and multiple websites such as SingStat and data.gov.sg, most of the data available is published as Excel spreadsheets, which are hard for the general public to understand and draw insights from. Hence, we are motivated to build a more user-friendly visualisation tool that allows everyone to quickly identify patterns and insights about Singapore's socioeconomic situation.


Objectives

In this project, we are creating a visualisation dashboard that lets users explore different aspects of Singapore:

  • Economic situation and demographics of different planning areas
  • Income inequality and wealth distribution
  • The standard of living of Singapore residents through:
    • Highest qualification achieved
    • Marital status
    • Choice of transportation
    • Accommodation situation

Datasets

These are the datasets we plan to use:

Dataset Rationale
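Map of Planning Areas in Singapore
  • A dataset containing SHP files of the administrative boundaries of Singapore
  • Used as a reference to digitize Singapore planning areas
  • https://data.gov.sg/dataset/master-plan-2014-planning-area-boundary-no-sea?resource_id=c8185fd3-3c78-48c8-94bc-a957699b4e92
Population distribution in Singapore by age, sex and planning areas
  • A dataset containing the total number of residents by age and sex in each planning area in Singapore
  • Used to visualize the demographics of each planning area
  • https://data.gov.sg/dataset/singapore-residents-by-planning-area-subzone-age-group-and-sex
Resident households in Singapore by household size and planning areas
  • A dataset containing the total number of households in each planning area by household size
  • Used to visualize the population demographics and living standard in Singapore
  • https://data.gov.sg/dataset/resident-households-by-planning-area-and-household-size-2015
Population in Singapore by sex, economic status and planning areas
  • A dataset containing the total number of people by economic status and sex in each planning area
  • Used to visualize the population demographics and living standard in Singapore
  • https://data.gov.sg/dataset/resident-population-aged-15-years-and-over-by-planning-area-economic-status-and-sex-2015
Population in Singapore by sex, marital status and planning areas
  • A dataset containing the total number of people by marital status and sex in each planning area
  • Used to visualize the population demographics and living standard in Singapore
  • https://data.gov.sg/dataset/resident-population-aged-15-years-and-over-by-planning-area-marital-status-and-sex-2015
Working residents in Singapore by industry and planning areas
  • A dataset containing the total number of people working in each industry in each planning area
  • Used to visualize the population demographics
  • https://data.gov.sg/dataset/resident-working-persons-aged-15-years-and-over-by-planning-area-and-industry-2015
Working residents in Singapore by monthly income and planning areas
  • A dataset containing the total number of people within each income range in each planning area
  • Used to visualize the population's wealth distribution
  • https://data.gov.sg/dataset/resident-working-persons-aged-15-years-over-by-planning-area-gross-monthly-income-from-work-2015
Working residents in Singapore by occupation and planning areas
  • A dataset containing the total number of people working in each occupation group in each planning area
  • Used to visualize the population's socioeconomic situation
  • https://data.gov.sg/dataset/resident-working-persons-aged-15-years-and-over-by-planning-area-and-occupation-2015
Highest qualification achieved by Singapore residents by planning area
  • A dataset containing the total number of people at each qualification level in each planning area
  • Used to visualize the population's socioeconomic situation
  • https://data.gov.sg/dataset/resident-population-aged-15-years-and-over-by-planning-area-and-highest-qualification-attained-2015
Resident Households by Planning Area and Type of Dwelling
  • A dataset containing the total number of households in each type of dwelling in each planning area
  • Used to visualize the population's socioeconomic situation
  • https://data.gov.sg/dataset/resident-households-by-planning-area-n-type-of-dwelling-2015
Resident Working Persons by Planning Area and Usual Mode of Transport to Work
  • A dataset containing the total number of people using each mode of transport in each planning area
  • Used to visualize the population's socioeconomic situation
  • https://data.gov.sg/dataset/resident-working-persons-aged-15-yrs-n-over-by-planning-area-n-usual-mode-of-transport-to-work-2015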

Proposed Storyboard

#1: Introduction page

To provide background story, problem and motivation of this project

#2: Economic Overview

  • Economic Health
    • Economic status of different planning areas
    • Breakdown of economic status in each planning area by gender
    • Dependency ratio of different planning areas
  • Economic Sector
    • Distribution of working residents across industries
    • Distribution of working residents across sectors of the service industry
    • Breakdown of employment by industry in each planning area
    • Breakdown of service-industry employment by sector in each planning area
    • Distribution of working residents across occupation groups
    • Breakdown of occupation groups by planning area
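
The proposal does not define the dependency ratio listed above. The following is a minimal sketch, assuming the common definition (residents aged under 15 plus residents aged 65 and over, per 100 residents aged 15 to 64). It is written in Python purely for illustration, since the project itself is built in R. The age-band labels, area names and counts are made up.

```python
def dependency_ratio(age_counts):
    """Return the number of dependants per 100 working-age residents.

    age_counts maps an age-band label to a resident count. The three
    labels used here are hypothetical; finer age bands would first need
    to be summed into these groups.
    """
    young = age_counts["0-14"]
    old = age_counts["65+"]
    working = age_counts["15-64"]
    if working == 0:
        raise ValueError("no working-age residents in this area")
    return 100 * (young + old) / working


# Made-up counts for two planning areas, for illustration only.
areas = {
    "AREA A": {"0-14": 200, "15-64": 1000, "65+": 300},
    "AREA B": {"0-14": 150, "15-64": 600, "65+": 90},
}

ratios = {name: dependency_ratio(counts) for name, counts in areas.items()}
```

A higher value means that each working-age resident supports more dependants, which is why the ratio suits a choropleth comparison across planning areas.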

#3: Socioeconomic Overview

  • Income Statistics
    • Wealth distribution of each planning area across income ranges
    • Breakdown of monthly income in each planning area
  • Education Statistics
    • Highest qualification achieved by residents in each planning area

#4: Quality of life Overview

  • Housing distribution
    • Type of dwelling at different planning areas
    • Breakdown of housing type at each planning area
    • Distribution of different household size around Singapore
    • Percentage of each household size at different planning areas
  • Transport Trends
    • Breakdown of transportation choice at each planning area
  • Relationship
    • Breakdown of marital status by gender and planning area

Background Survey of Related Works

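There are multiple visualizations around the world that aim to uncover the poverty situation in different countries. Although there are not many visualizations about the situation in Singapore, we were able to find a few visualizations, from the US and elsewhere, to draw inspiration from:

  • Median Age of US Counties in 2018 (https://www.census.gov/library/visualizations/2019/comm/median-age.html)
  • List of popular graphs and visualizations (https://datavizproject.com/)
  • Visualizing Singapore (https://www.vslashr.com/2013/10/visualizing-singapore/)
  • Visualization on data.gov.sg (https://data.gov.sg/)
  • Singapore: Distribution by age and gender (https://viz.sg/viz/map_age_gender/)
  • Singapore Demographic Visualization (https://rstudio-pubs-static.s3.amazonaws.com/281811_58c03b274d7946f99f43c616726fa243.html)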

Tools and Libraries

  • Microsoft Excel
  • R Markdown
  • R Shiny
  • Google Drive

Foreseen Technical Challenges

We encountered the following technical challenges over the course of the project. The table below shows how we overcame them.

Technical problem Solution
Do not know how to create a pie chart or box plot using ggplot
  • Search on Google for how to make the plot
  • Create the basic plot in R Markdown to check it
  • Visualize the improvement we want to make for each graph
  • Search on Google for how to do that
  • Test the graph in R Markdown before applying it to R Shiny
Do not know how to use reactive functions to create a reactive data frame
  • Find multiple examples on Google to understand the logic and how different people approach it
  • Write down how we want the code to behave based on that logic
  • Modify the code accordingly
Do not know how to design the user interface in R Shiny
  • Visualize what we want the app to look like
  • Search on Google for how to make it work
  • Test different versions and iterations until we are all satisfied with the look
Do not know how to deploy the app
  • Try to follow the instructions on shinyapps.io
  • Ask our classmates whether they had the same problem and how they dealt with it
  • Create a new app and copy all the code, data and pictures over

Data Analysis and Transformation

Most of the datasets need to follow the same data preparation steps:

  • Rename the columns so that they make more sense
  • Remove redundant columns
  • Remove aggregated rows and columns
  • Filter or create additional attributes (if needed)
  • Capitalize the planning area names and join with the Singapore map (for map visualization)
  • Aggregate the data based on the chosen categories
  • Find the percentage
  • Spread or gather the dataset depending on the choice of visualization
  • Create a reactive dataset (if needed)

All of these steps are developed and tested in R Markdown before being used in R Shiny.
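
The preparation steps above can be sketched roughly as follows. This is an illustration in Python using only the standard library, not the project's R code. The column names, the "Total" label for aggregated rows, the area names and the counts are all assumptions made up for the example. The upper-casing step assumes that the map layer stores planning area names in capitals.

```python
from collections import defaultdict

# Made-up rows in the shape of a raw export (hypothetical columns and values).
raw_rows = [
    {"Planning Area": "Area A", "Level": "Total", "Count": 100},
    {"Planning Area": "Area A", "Level": "Below Secondary", "Count": 40},
    {"Planning Area": "Area A", "Level": "University", "Count": 60},
    {"Planning Area": "Area B", "Level": "Total", "Count": 50},
    {"Planning Area": "Area B", "Level": "Below Secondary", "Count": 10},
    {"Planning Area": "Area B", "Level": "University", "Count": 40},
]

# Step 1: friendlier column names.
RENAMES = {
    "Planning Area": "planning_area",
    "Level": "qualification",
    "Count": "residents",
}


def prepare(rows):
    """Rename columns, drop aggregated rows, upper-case names, add percentages."""
    cleaned = []
    for row in rows:
        renamed = {RENAMES[key]: value for key, value in row.items()}
        if renamed["qualification"] == "Total":
            continue  # pre-aggregated rows would double count residents
        renamed["planning_area"] = renamed["planning_area"].upper()
        cleaned.append(renamed)

    # Aggregate per planning area, then express each row as a share of it.
    totals = defaultdict(int)
    for row in cleaned:
        totals[row["planning_area"]] += row["residents"]
    for row in cleaned:
        row["percent"] = 100 * row["residents"] / totals[row["planning_area"]]
    return cleaned


def spread(rows):
    """Pivot long rows into one wide record per planning area."""
    wide = defaultdict(dict)
    for row in rows:
        wide[row["planning_area"]][row["qualification"]] = row["percent"]
    return dict(wide)


long_rows = prepare(raw_rows)
wide_rows = spread(long_rows)
```

The long form (`long_rows`) suits stacked bar charts, where each row becomes one bar segment, while the wide form (`wide_rows`) has one record per planning area and is easier to join to a map layer.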

Proposed Visualizations and Storyboard

  • Choropleth map with a filter to see the distribution of different attributes over Singapore
  • Stacked percentage bar chart to show the percentage of each attribute in each planning area
  • Pie chart to show the proportion of different attributes in each planning area
  • Boxplot to show the distribution of residents
  • Data table of each dataset in case the user wants to find out more
  • Statistical summary generated by R to provide additional insight about each dataset
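
Both the boxplot and the statistical summary rest on the same five numbers: minimum, lower quartile, median, upper quartile and maximum. The following is a minimal sketch of that calculation, written in Python for illustration rather than taken from the project's R code. The counts are made up, and the "inclusive" quantile method is chosen on the assumption that linear interpolation between data points is wanted.

```python
import statistics


def five_number_summary(values):
    """Return the min, lower quartile, median, upper quartile and max."""
    q1, median, q3 = statistics.quantiles(values, n=4, method="inclusive")
    return {
        "min": min(values),
        "q1": q1,
        "median": median,
        "q3": q3,
        "max": max(values),
    }


# Made-up resident counts across seven planning areas.
counts = [10, 20, 30, 40, 50, 60, 70]
summary = five_number_summary(counts)
```

In a boxplot, the box spans `q1` to `q3` and the line inside it marks the median, so a wide box for one attribute indicates that planning areas differ a lot on that attribute.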

Project Timeline

Week 8: Complete detailed project proposal and gather datasets (supervised by Alexia)
Week 9: Clean datasets (supervised by Chau)
Week 10: Create data visualisation & consult on quality of work (supervised by Chau and Parth)
Week 11: Finalise storyboard (teamwork with help of professor!)
Week 12: Get ready for deadlines whoop! Beautify the dashboard (supervised by Alexia and Chau)
Week 13: Finalize the user guide (supervised by Parth)
Week 14: Finalize the research paper and proposal (supervised by Alexia and Parth)

References

Comments

Feel free to leave comments / suggestions!