NeighbourhoodWatchDocs Proposal

From Geospatial Analytics and Applications


PROJECT DESCRIPTION
Our project aims to use geospatial intelligence to explore the potential of allocating nearby doctors within estates to residents, in particular the elderly, to address the challenges of an ageing population.

Between 2000 and 2018, Singapore’s population grew from 4.028 million to 5.791 million. Even as overall population growth slows, the number of citizens aged 65 and above is increasing rapidly: this group more than doubled from 220,000 in 2000 to 547,900 in 2018, and is expected to grow further by 2030. As the government looks for new ways to provide better social welfare, we aim to explore the possibility of allocating neighbourhood clinic doctors to nearby HDB blocks so that those with mobility issues or disabilities receive adequate healthcare. This is similar to the home care approach that Japan has adopted for some time [1], which has been found to reduce the country's overall healthcare costs in the long run.

The budget for healthcare infrastructure has increased, with new hospitals such as Sengkang General and Community Hospitals in 2018, Outram Community Hospital in 2020 and Woodlands Health Campus in 2024. Against this backdrop, we aim to assess the feasibility of our project and to provide better analysis and insight into how well the distribution of general practitioners (GPs) matches that of the ageing population. As GPs are the first line of healthcare for the elderly, a clearer picture of this match will help us understand the impact of current and future healthcare policies and better support decision making.


PROJECT MOTIVATION
Currently, no application combines spatial accessibility and spatial location-allocation models, both of which are essential to urban planning. Nor does any application apply a location-allocation model that takes into account the resource constraints at supply and demand points. Our application addresses both gaps through an easy-to-understand, user-friendly GUI. Lastly, our app is more convenient than existing commercial applications, as it requires no installation and is easily accessible via the Internet.


PROJECT OBJECTIVES
Our goals are:
  • To build a GIS tool (an R Shiny app)
  • To analyse the demand for and supply of clinics based on the proximity of each clinic to residential zones
  • To evaluate the results of the analysis and provide recommendations to further improve social welfare for residents who require special assistance


DATA COLLECTION
To achieve our project objectives, we need to obtain datasets that are available online. The following table lists the datasets we require and how we can obtain them:

| No. | Dataset | Description | Source(s) |
| --- | --- | --- | --- |
| 1. | OSM Layer (Singapore) | This dataset is necessary for us to be able to plot the Singapore map. | OpenStreetMap |
| 2. | Singapore Planning Subzone (MP14_SUBZONE_WEB_PL) | This dataset is necessary for us to be able to plot the Singapore map at the planning subzone level. | https://data.gov.sg/dataset/master-plan-2014-subzone-boundary-no-sea |
| 3. | Singapore Residents by Subzone, Age Group and Sex, June 2017 (Gender) | This dataset is necessary for us to find out the total elderly population. | From Take-home_Ex01 |
| 4. | Total Number of Clinics in Singapore | This dataset is necessary for us to locate all the neighbourhood clinics in Singapore and find out their respective addresses, so as to aid in our proximity analysis with the dwellings. As this data is not readily available to us, we will need to perform web scraping to "scrape" the data off the Singapore Healthcare Institutions Directory (HCI) website. A Python script can help us achieve this; more details can be found below. | http://hcidirectory.sg/hcidirectory/ |
| 5. | Total Number of TCM Clinics in Singapore | This dataset is necessary for us to locate all the TCM Medical Centres in Singapore and find out their respective addresses, so as to aid in our proximity analysis with the dwellings. As this data is not readily available to us, we will need to perform web scraping to "scrape" the data off the Singapore Traditional Chinese Medicine Practitioners Board (TCM Board) website. Code tweaks to the Python script used for the HCI website data will help us collect this dataset. | https://prs.moh.gov.sg/prs/internet/profSearch/main.action?hpe=TCM |
| 6. | Number of HDB blocks per planning subzone | This dataset is necessary for us to find out the total number of HDB blocks per subzone. | https://data.gov.sg/dataset/hdb-property-information |
| 7. | Residents by Age Group & Type of Dwelling, Annual | This dataset is necessary for us to find the number of residents aged 65 and above staying in the different types of dwelling. | https://data.gov.sg/dataset/residents-by-age-group-type-of-dwelling-annual |

Code Snippet 1 Code Snippet 2
NeighbourhoodWatchDocs Scraper1.png
NeighbourhoodWatchDocs Scraper2.png
The code snippets above show how we can perform web scraping using a Python script. In a nutshell, the script does the following:

1. Uses a set of pre-defined parameters and settings to visit each page of the HCI's search results in a loop.
2. Uses XPath expressions to retrieve the content (healthcare facility name, address, postal code) located at specific HTML elements.
3. Once all pages have been visited, parses the retrieved information and stores it in a .CSV file with the respective columns.
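
The three steps above can be sketched as follows. This is a minimal illustration and not the script shown in the snippets: the page markup, the class names (`result`, `name`, `address`, `postal`) and the `fetch_page` helper are all hypothetical, and no network request is made. It uses the standard library's ElementTree, which supports only a subset of XPath and only well-formed markup, so a real scraper would probably need an HTML-tolerant parser.

```python
import csv
import io
import xml.etree.ElementTree as ET

# Hypothetical, well-formed stand-ins for two pages of search results.
SAMPLE_PAGES = {
    1: "<html><body>"
       "<div class='result'><span class='name'>Clinic A</span>"
       "<span class='address'>1 Example Road</span>"
       "<span class='postal'>111111</span></div>"
       "<div class='result'><span class='name'>Clinic B</span>"
       "<span class='address'>2 Example Road</span>"
       "<span class='postal'>222222</span></div>"
       "</body></html>",
    2: "<html><body>"
       "<div class='result'><span class='name'>Clinic C</span>"
       "<span class='address'>3 Example Road</span>"
       "<span class='postal'>333333</span></div>"
       "</body></html>",
}


def fetch_page(page_no):
    """Stand-in for an HTTP request; returns None past the last page."""
    return SAMPLE_PAGES.get(page_no)


def parse_page(markup):
    """Step 2: pull name, address and postal code out of one results page."""
    root = ET.fromstring(markup)
    rows = []
    for node in root.findall(".//div[@class='result']"):
        rows.append({
            "name": node.findtext("span[@class='name']", default="").strip(),
            "address": node.findtext("span[@class='address']", default="").strip(),
            "postal_code": node.findtext("span[@class='postal']", default="").strip(),
        })
    return rows


def scrape_all(fetch=fetch_page):
    """Step 1: visit each results page in a loop until none are left."""
    rows, page_no = [], 1
    while True:
        markup = fetch(page_no)
        if markup is None:
            break
        rows.extend(parse_page(markup))
        page_no += 1
    return rows


def to_csv(rows):
    """Step 3: serialise the collected rows as CSV text."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=["name", "address", "postal_code"])
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()
```

Calling `to_csv(scrape_all())` returns the CSV text, which could then be written to a file.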


DATA TRANSFORMATION

To assess whether existing healthcare or elderly care amenities are meeting the needs of Singaporeans aged 65 and above living in public housing provided by the Housing & Development Board (HDB), we require 3 sets of data:
1. Data on supply, as defined by the location of GP clinics.
2. Data on demand, as defined by census data, which gives us an understanding of the number of potential beneficiaries of GP services living in public housing.
3. Data on the location of the source of demand, which we define as the location of the public housing flats where beneficiaries live.

| No. | Dataset | Data to filter |
| --- | --- | --- |
| 1. | OSM Layer (Singapore) | - |
| 2. | Singapore Planning Subzone (MP14_SUBZONE_WEB_PL) | - |
| 3. | Singapore Residents by Subzone, Age Group and Sex, June 2017 (Gender) | Financial Year 2017 and Age Group >= 65 |
| 4. | Total Number of Clinics in Singapore | - |
| 5. | Total Number of TCM Clinics in Singapore | - |
| 6. | Number of HDB blocks per planning subzone | - |
| 7. | Residents by Age Group & Type of Dwelling, Annual | Financial Year 2017 |
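
The year and age-group filters listed above could be applied along these lines. This is a minimal Python sketch under stated assumptions: the column names (`year`, `subzone`, `age_group`, `resident_count`), the age-group labels and the sample rows are made up for illustration and may not match the actual files.

```python
import csv
import io

# Made-up extract; real column names, labels and counts will differ.
SAMPLE = """year,subzone,age_group,resident_count
2017,Subzone A,60_to_64,100
2017,Subzone A,65_to_69,80
2017,Subzone A,85_and_over,20
2016,Subzone A,65_to_69,75
2017,Subzone B,70_to_74,60
"""


def lower_age(age_group):
    """Return the lower bound of a label such as '65_to_69' or '85_and_over'."""
    return int(age_group.split("_")[0])


def elderly_by_subzone(csv_text, year=2017, min_age=65):
    """Sum residents per subzone for the given year and minimum age."""
    totals = {}
    for row in csv.DictReader(io.StringIO(csv_text)):
        if int(row["year"]) == year and lower_age(row["age_group"]) >= min_age:
            subzone = row["subzone"]
            totals[subzone] = totals.get(subzone, 0) + int(row["resident_count"])
    return totals
```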


LITERATURE REVIEW & PROJECT APPROACH

Use GIS techniques to narrow down scope of where to build potential clinics
Pattern Visualization

Firstly, we will establish the point locations of the residential areas as well as the clinics islandwide. This gives us spatial information at a glance about the elderly population within a given area. We will be looking at values at the subzone level. The elderly population figures within the residential population data need to be cleaned and calculated first for a more accurate analysis.

Buffer Analysis

A buffer zone is any area that serves the purpose of keeping real world features distant from one another. For our project, we could define the buffer in terms of minutes of walking from the elderly residents' blocks to the nearest clinics, instead of distance. With this, we can determine the catchment areas of existing clinics by making each clinic the centre of a circle and calculating the catchment area within a given radius. Users interacting with our map will select, from a list, the maximum time that the elderly are willing to commute to the nearest clinic, and this will be translated into a distance for the buffer analysis.
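
The translation from walking time to a buffer radius can be sketched as below. The walking speed of 60 m per minute is an assumption chosen for illustration, not a value from our sources, and the coordinates are assumed to be in metres in a projected coordinate system. A circular buffer uses straight-line distance, so it will tend to overstate what is reachable along actual footpaths.

```python
import math

# Assumed walking pace in metres per minute (about 1 m/s); illustrative only.
WALKING_SPEED_M_PER_MIN = 60.0


def minutes_to_metres(minutes, speed=WALKING_SPEED_M_PER_MIN):
    """Convert a maximum walking time into a buffer radius in metres."""
    return minutes * speed


def blocks_in_buffer(clinic, blocks, max_minutes):
    """Return the names of blocks inside the clinic's circular catchment.

    clinic: (x, y) in metres; blocks: {name: (x, y)} in the same projected CRS.
    """
    radius = minutes_to_metres(max_minutes)
    cx, cy = clinic
    return [name for name, (bx, by) in blocks.items()
            if math.hypot(bx - cx, by - cy) <= radius]
```

For example, a 10-minute limit gives a 600 m radius under this assumed pace, so a block 300 m from the clinic falls inside the catchment while one 700 m away does not.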

Kernel Density Estimation

We will determine the distribution of the elderly within subzones and compare the output with the number of nearby clinics.
One potential method is Network KDE (NKDE), which estimates density along the network rather than over planar space. It can be paired with the network K-function: the auto K-function deals with a set of points of a single kind (e.g. just clinics) and considers the shortest paths among these points, while the cross K-function deals with two sets of points of different kinds (e.g. clinics and residential properties) and considers the shortest paths between the two sets.
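
To make the idea of kernel density estimation concrete, here is a minimal sketch of a planar Gaussian KDE. It is a simplification: it uses straight-line distance, whereas network KDE measures distance along the road network. The function name, the choice of a Gaussian kernel and the use of elderly counts as weights are assumptions for illustration.

```python
import math


def gaussian_kde_2d(points, weights, query, bandwidth):
    """Planar Gaussian kernel density at `query`.

    points: list of (x, y) in metres; weights: e.g. elderly count per block;
    bandwidth: kernel bandwidth in metres.
    """
    qx, qy = query
    total = 0.0
    for (x, y), w in zip(points, weights):
        squared_distance = (x - qx) ** 2 + (y - qy) ** 2
        total += w * math.exp(-squared_distance / (2 * bandwidth ** 2))
    # Normalising constant of a two-dimensional Gaussian kernel.
    return total / (2 * math.pi * bandwidth ** 2)
```

Evaluating this function over a grid of query locations, with the bandwidth set to a few hundred metres, would give a density surface that can be compared against clinic locations.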

The Monte Carlo simulation method is often used to test the distribution pattern of point events. Whether the points are uniformly and independently distributed over a network is judged by comparing the K-function values against the bounds obtained under complete spatial randomness (CSR). If K(l) is above the upper CSR bound, the point set P is clustered. If K(l) is below the lower CSR bound, the point set P is dispersed. KDE should use a bandwidth of 100 – 300m for the study of urban economic activities.
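
The comparison against the CSR bounds can be sketched as follows. This is a simplified planar version, not the network K-function: distances are straight-line, the study area is assumed to be a rectangle, and no edge correction is applied. The function names and the choice of 99 simulations are assumptions for illustration.

```python
import math
import random


def k_function(points, r, area):
    """Naive planar K estimate at distance r (no edge correction)."""
    n = len(points)
    pairs = 0
    for i in range(n):
        for j in range(n):
            if i != j and math.dist(points[i], points[j]) <= r:
                pairs += 1
    return area * pairs / (n * (n - 1))


def csr_envelope(n, r, width, height, sims=99, seed=0):
    """Lower and upper K bounds from simulated uniform random patterns."""
    rng = random.Random(seed)
    values = []
    for _ in range(sims):
        pts = [(rng.uniform(0, width), rng.uniform(0, height)) for _ in range(n)]
        values.append(k_function(pts, r, width * height))
    return min(values), max(values)


def classify(points, r, width, height, sims=99, seed=0):
    """Compare the observed K value with the simulated CSR envelope."""
    observed = k_function(points, r, width * height)
    lower, upper = csr_envelope(len(points), r, width, height, sims, seed)
    if observed > upper:
        return "clustered"
    if observed < lower:
        return "dispersed"
    return "consistent with CSR"
```

In practice the test is repeated over a range of distances rather than a single value of r.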

NeighbourhoodWatchDocs bufferandKDE.jpg
[3]

Furthermore, we can use the SpatialAcc package in R to calculate the accessibility of clinics to residents. The example below, from a similar study, uses darker shades of circles to show better accessibility between 2 different types of facilities[4].

NeighbourhoodWatchDocs spatialacc.jpg
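
As an illustration of what an accessibility score measures, the sketch below implements the two-step floating catchment area (2SFCA) method in Python. It does not call SpatialAcc, and whether the project ends up using 2SFCA rather than another measure is an assumption. The data layout, the straight-line distances and the catchment threshold are hypothetical.

```python
import math


def two_step_fca(demand, supply, catchment):
    """Two-step floating catchment area accessibility.

    demand: {id: (x, y, population)}; supply: {id: (x, y, capacity)};
    catchment: distance threshold in the same units as the coordinates.
    """
    def within(a, b):
        return math.hypot(a[0] - b[0], a[1] - b[1]) <= catchment

    # Step 1: capacity-to-population ratio for each clinic's catchment.
    ratio = {}
    for sid, s in supply.items():
        served = sum(d[2] for d in demand.values() if within(s, d))
        ratio[sid] = s[2] / served if served else 0.0

    # Step 2: for each block, sum the ratios of all clinics within reach.
    return {did: sum(ratio[sid] for sid, s in supply.items() if within(s, d))
            for did, d in demand.items()}
```

A higher score means more clinic capacity per resident within reach; a score of zero marks a block with no clinic inside the catchment.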


Overlay analysis

From the elderly residential population, we divide the density into 10 categories represented by the values 1 – 10. We then overlay the existing land use raster map, the existing clinics raster map and the elderly population density raster map.

We then use grid analysis to identify candidate locations for clinics.

NeighbourhoodWatchDocs reclassificationmodel.jpg

Based on reclassification, the greater the value obtained from the calculated results, the higher the suitability of the area as a candidate. Final calculated values greater than 0 indicate that the candidate points meet the above requirements[5].
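
A minimal sketch of the reclassification and overlay steps is shown below. The equal-interval classes and the rule for combining the three rasters (density class multiplied by a land use mask, set to zero where a clinic already exists) are assumptions for illustration; the cited study may weight the layers differently. Rasters are represented as nested lists of equal size.

```python
def reclassify(grid, classes=10):
    """Map raw density values to equal-interval classes 1..classes."""
    flat = [v for row in grid for v in row]
    lo, hi = min(flat), max(flat)
    if hi == lo:
        return [[1 for _ in row] for row in grid]
    return [[min(classes, 1 + int((v - lo) / (hi - lo) * classes)) for v in row]
            for row in grid]


def suitability(density, land_ok, has_clinic):
    """Overlay the three rasters cell by cell (assumed combination rule).

    land_ok: 1 where land use permits a clinic, else 0.
    has_clinic: 1 where a clinic already exists, else 0.
    """
    classes = reclassify(density)
    return [[classes[r][c] * land_ok[r][c] * (0 if has_clinic[r][c] else 1)
             for c in range(len(density[0]))]
            for r in range(len(density))]


def candidates(scores):
    """Cells with a score greater than 0 are candidate locations."""
    return [(r, c) for r, row in enumerate(scores)
            for c, v in enumerate(row) if v > 0]
```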

Location allocation model
P-median model

The P-median model aims to determine the locations of P facilities such that the total travel distance from each demand site to its closest facility is minimized[6]. Variants of the model focus either on an objective function with maximum coverage or on an assignment strategy with a gravity effect. For instance, studies have applied the model to private facilities to determine the optimal locations of warehouses. In our project, we will use this model to determine the optimal locations, if any, at which to recommend building clinics. The model assumes that a facility located at a node can respond to all demand originating at that node.
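
The objective described above can be illustrated with a brute-force sketch that tries every combination of P candidate sites. This is only practical for small inputs and is not how dedicated packages solve the problem (they typically use heuristics); the function name, the data layout and the use of straight-line distance are assumptions for illustration.

```python
import itertools
import math


def p_median(demand, candidates, p):
    """Pick the p candidate sites minimising total weighted distance.

    demand: list of (x, y, weight); candidates: list of (x, y).
    Returns (chosen_sites, total_cost).
    """
    best_sites, best_cost = None, math.inf
    for sites in itertools.combinations(candidates, p):
        # Each demand point is served by its closest chosen site.
        cost = sum(w * min(math.hypot(x - sx, y - sy) for sx, sy in sites)
                   for x, y, w in demand)
        if cost < best_cost:
            best_sites, best_cost = sites, cost
    return best_sites, best_cost
```

Here the weight of a demand point could be the number of elderly residents in a block, so that sites closer to larger populations are favoured.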

Backup Coverage model

The Backup Coverage Model (BCM) is a spinoff of the Double Standard Model, which aims to allocate facilities among potential sites so as to provide full coverage within a longer distance standard while maximizing coverage within a shorter distance standard. The BCM variants (BACOP1 and BACOP2) maximize the population coverage with more than two facilities while forcing all demand points to be covered once[7].


References
[1] Japan tries to keep the elderly out of hospital. (2019). The Economist. Retrieved 7 April 2019, from https://www.economist.com/asia/2019/01/12/japan-tries-to-keep-the-elderly-out-of-hospital

[2] Ni, J., Qian, T., Xi, C., Rui, Y., & Wang, J. (2016). Spatial Distribution Characteristics of Healthcare Facilities in Nanjing: Network Point Pattern Analysis and Correlation Analysis. International Journal Of Environmental Research And Public Health, 13(8), 833. doi:10.3390/ijerph13080833

[3] Gu, T., Li, L., & Li, D. (2018). A two-stage spatial allocation model for elderly healthcare facilities in large-scale affordable housing communities: a case study in Nanjing City. International Journal For Equity In Health, 17(1). doi:10.1186/s12939-018-0898-6

[4] (2019). Int-arch-photogramm-remote-sens-spatial-inf-sci.net. Retrieved 10 March 2019, from https://www.int-arch-photogramm-remote-sens-spatial-inf-sci.net/XLII-4-W2/91/2017/isprs-archives-XLII-4-W2-91-2017.pdf

[5] Challenges and Solutions for Location of Healthcare Facilities. (2019). Omicsonline.org. Retrieved 10 March 2019, from https://www.omicsonline.org/open-access/challenges-and-solutions-for-location-of-healthcare-facilities-2169-0316.1000127.pdf

[6] Jia, T., Tao, H., Qin, K., Wang, Y., Liu, C., & Gao, Q. (2014). Selecting the optimal healthcare centers with a modified P-median model: a visual analytic perspective. International Journal Of Health Geographics, 13(1), 42. doi:10.1186/1476-072x-13-42

[7] Polo, G., Acosta, C., Ferreira, F., & Dias, R. (2015). Location-Allocation and Accessibility Models for Improving the Spatial Planning of Public Health Services. PLOS ONE, 10(3), e0119190. doi:10.1371/journal.pone.0119190


PROJECT TIMELINE
NeighbourhoodWatchDocs Timeline.jpg


STORYBOARD

Our team aims to create a dashboard that displays all the clinics (medical and TCM) across Singapore. It will also display the dwellings in each subzone and the estimated number of elderly residents aged 65 and above staying there. Together, these show the supply of and demand for healthcare at clinics. Further analysis will calculate how well supply meets demand and identify potential gaps, e.g. overdemand or oversupply of healthcare facilities in a particular subzone.

The supply and demand data for selected subzones will then be displayed in a data table for better visualization.

NeighbourhoodWatchDocs storyboard.jpg


TOOLS & TECHNOLOGY

These are the tools and technologies our team aims to explore and use over the course of the project to achieve our objectives. The list will be updated as the project progresses.

Neighbourhood WatchDocs Technologies.JPG


PROJECT CHALLENGES

Key Technical Challenge 1: Unfamiliarity with R packages and R Shiny

Description: Our team may need to use additional R resources that were not taught in class.

Proposed Solution:
- Independent learning on R packages and R Shiny
- Browsing the official RDocumentation website for support and reference
- Researching online tutorials that have a specific use case for certain R packages

Outcome: We managed to overcome this challenge with the following resources:
- DataCamp tutorial on R Shiny App.
- Research on case studies utilizing R Shiny for geovisualization.

Key Technical Challenge 2: Data Cleaning and Transformation

Description: As we need to collect the data from various sources, the datasets may differ in attributes such as the Coordinate Reference System (CRS) and units of measurement.

Proposed Solution: Adopt a standardized process for cleaning the data, focusing only on what we need. Most of the datasets used for our project can be found in our Hands-On or Take-Home exercises, and we can rely on that existing data. We need to make certain assumptions based on the context and purpose of our project, such as the average number of doctors in a particular clinic, which cannot be derived from our datasets.

Outcome: We managed to overcome this challenge with the following:
- Referring to case studies and past literature reviews.
- Estimating with proportionality functions when parts of the required dataset are unavailable.

Key Technical Challenge 3: Limitations & Constraints in existing R packages (tbart)

Description: The current allocations() method does not take into consideration the population within each block or the capacity of each clinic, so clinics are allocated based only on the nearest distance from each block.

Proposed Solution: Work together as a team to figure out a reasonable and valid algorithm, supported by adequate online research and consultation with Prof. Kam.

Outcome: We managed to overcome this challenge with the following:
- We built on and modified the allocation algorithm to suit the needs of our analysis.
- For the above algorithm to work, we generated individual demand points based on the elderly count per block. For instance, if there are 30 elderly residents within Blk 5 Beach Road, 30 points would be populated within that block area. If the resource capacity of a clinic is 80, only 80 points would be assigned to that clinic.
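
The idea of expanding each block into individual demand points and capping each clinic at its capacity can be sketched as follows. This is a simplified greedy illustration in Python, not our modified tbart-based algorithm: all points of a block are placed at the block's coordinates, distance is straight-line, and blocks are processed in the order given. The field names are hypothetical.

```python
import math


def allocate(blocks, clinics):
    """Assign one demand point per elderly resident to the nearest clinic
    that still has spare capacity.

    blocks: list of dicts with keys x, y, elderly.
    clinics: list of dicts with keys id, x, y, capacity.
    Returns (assigned_count_per_clinic, unassigned_count).
    """
    remaining = {c["id"]: c["capacity"] for c in clinics}
    assigned = {c["id"]: 0 for c in clinics}
    unassigned = 0

    # Expand each block into one demand point per elderly resident.
    points = [(b["x"], b["y"]) for b in blocks for _ in range(b["elderly"])]

    for px, py in points:
        open_clinics = [c for c in clinics if remaining[c["id"]] > 0]
        if not open_clinics:
            unassigned += 1
            continue
        nearest = min(open_clinics,
                      key=lambda c: math.hypot(c["x"] - px, c["y"] - py))
        remaining[nearest["id"]] -= 1
        assigned[nearest["id"]] += 1
    return assigned, unassigned
```

With a clinic of capacity 80 and two nearby blocks holding 30 and 60 elderly residents, the first 80 points go to that clinic and the remaining 10 spill over to the next nearest clinic with spare capacity.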