GeoEstate PROPOSAL

From Geospatial Analytics and Applications
GeoEstate

Project Description

Our project aims to provide an easy way for end users to predict the resale prices of apartments, condominiums and executive condominiums from inputs such as the postal code, floor area and type of apartment. To achieve this, we use three regression models: the geographically weighted regression model, the spatial autocorrelation regression model and the multiple linear regression model. Users can read about the methodology behind each model and select the one they feel best fits their situation, or simply compare the R-squared values to see which model fits the data points best.
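As an illustration of comparing models by R-squared, here is a minimal sketch in Python (the project itself is built in R). The prices, the predictions, the model names and the helper `r_squared` are all made up for the example and do not come from the project's data.

```python
def r_squared(observed, predicted):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_observed = sum(observed) / len(observed)
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_tot = sum((o - mean_observed) ** 2 for o in observed)
    return 1.0 - ss_res / ss_tot


# Made-up resale prices and the predictions of two hypothetical models.
observed = [900_000, 1_200_000, 750_000, 1_500_000]
predictions = {
    "model_a": [950_000, 1_150_000, 800_000, 1_400_000],
    "model_b": [900_000, 1_000_000, 900_000, 1_300_000],
}

# Score every model and pick the one with the highest R-squared.
scores = {name: r_squared(observed, pred) for name, pred in predictions.items()}
best_model = max(scores, key=scores.get)
```

A higher R-squared only says that a model explains more of the variance in the sample it was scored on, so it is one guide to model choice rather than the final word.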

Furthermore, we aim to support extensive exploratory data analysis, allowing users to see how variables in the property market, such as the age and floor area (in square feet) of a property, correlate with resale prices. Through this, we aim to educate interested consumers and real estate agents alike on what truly matters when determining the resale price of a property.
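A correlation of this kind can be sketched with the Pearson coefficient. The Python snippet below is illustrative only (the project uses R): the ages and prices are invented, and `pearson` is a hypothetical helper name.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


# Made-up data: property age in years and resale price per square foot.
ages = [5, 10, 20, 30]
price_per_sqft = [1500, 1400, 1100, 900]

# A value close to -1 means older properties tend to sell for less per square foot.
age_price_r = pearson(ages, price_per_sqft)
```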


Project Motivation

How do you know if you are getting a reasonable price for your apartment? Real estate agents have vested interests, so for anyone who wants to be an educated consumer, taking an agent's word for the price of a property may not be enough. Today, websites like PropertyGuru appear to give some sense of which prices are competitive. However, this can be misleading, as a listing is only a snapshot in time.

What if you were able to predict the price of the property you want to sell, or conversely, the dream property you wish to purchase, using masses of data accumulated over past years?

Through our application, we aim to educate consumers on the value of property using rigorous statistical methods.


Storyboard

StoryBoard1.png StoryBoard2.png StoryBoard3.png StoryBoard4.png



Data sources
Data | Source | Data Type/Method
2014 Master Plan Planning Subzone (Web) | Data.gov.sg | SHP
URA Private Residential Property Transactions | Ura.gov.sg | CSV. Data was geocoded using Google Geocoding API. Postal code was geocoded using OneMap API
Pre-School Locations | Data.gov.sg | KML. Converted to Shapefile
Primary/Secondary School Locations | Data.gov.sg | CSV. Data was geocoded using OneMap API
MRT/LRT Station Locations | LTA Datamall (Direct Download) | SHP
Supermarket Locations | Data.gov.sg | KML. Converted to Shapefile
Shopping Mall Locations | Wikipedia | Text. Data was converted to Shapefile after geocoding using OneMap API
Park Locations | Data.gov.sg | KML. Converted to Shapefile
Sports Facilities Locations | Data.gov.sg | KML. Converted to Shapefile
Hawker Centre Locations | Public Food Centres: 1. Data.gov.sg. Private Food Centres: 2. Kopitiam, 3. Koufu, 4. Food Junction, 5. Food Republic | 1: KML, converted to Shapefile. 2 - 5: Text, data scraped from sites and geocoded using OneMap API

Data Transformation



Literature Review

1. A Spatial Analysis of House Prices in the Kingdom of Fife, Scotland

(By: Julia Zmölnig, Melanie N Tomintz, Stewart A Fotheringham)

GeoEstate interpolation.jpg

Aim of Study: to analyse the spatial variations in house price adjustments due to economic conditions, and to quantify and describe patterns in the variations of house prices in the study area of Fife, Scotland

Methodology:

Spatial Interpolation Technique - using points with known values to estimate values at other, unknown points. Three main methods were used:

  • Diffusion Interpolation with Boundaries
  • Inverse-distance weighting
  • Deterministic ordinary Kriging (Most accurate)
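To make the idea of interpolation concrete, the sketch below shows inverse-distance weighting, the simplest of the three methods, in Python. It is not the paper's implementation: the function name `idw_estimate`, the power parameter of 2 and the sample points are assumptions chosen for illustration.

```python
import math


def idw_estimate(known_points, target, power=2.0):
    """Estimate the value at `target` from ((x, y), value) pairs.

    Each known point is weighted by 1 / distance**power, so nearby
    points influence the estimate more than distant ones.
    """
    numerator = 0.0
    denominator = 0.0
    for (x, y), value in known_points:
        distance = math.hypot(x - target[0], y - target[1])
        if distance == 0.0:
            return value  # target coincides with a known point
        weight = 1.0 / distance ** power
        numerator += weight * value
        denominator += weight
    return numerator / denominator


# Made-up house prices (in thousands) at two known locations.
known = [((0.0, 0.0), 100.0), ((2.0, 0.0), 200.0)]
midpoint_price = idw_estimate(known, (1.0, 0.0))  # equidistant from both points
```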

Learning Points:

  • House price hot spots migrate from year to year, so multiple models are required if the study period spans multiple years
  • In this study, the economic downturn actually led to an increase in property prices, despite more supply from unemployed people

Areas for Improvement:

  • The data lacked information such as the size and type of each property; although this could be approximated via interpolation, the gap still hurts the accuracy of the model
  • A different model, such as Geographically Weighted Regression (GWR), could be used to identify spatial patterns apparent in the study area


2. Statistical analysis of the relationship between public transport accessibility and flat prices in Riga

(By: Dmitry Pavlyuk)

GeoEstate public transport accessibility.JPG

Aim of Study: to examine the relationship between public transport accessibility and residential land value in Riga, Latvia

Methodology:

  • Geographically Weighted Regression (GWR)
  • Global Regression Model

Learning Points:

  • Within the city centre, accessibility has no significant relationship with flat prices, as the city centre is already rich in transport routes and new routes have a diminishing impact
  • For the higher-income population, higher public transport accessibility may lead to lower property prices
  • Overall, GWR performed significantly better than global regression
    • A variable that has no significant relationship in one model might be significant in another. For example, the influence of the first floor on the price was insignificant in the global regression model, but showed a local dependency in GWR.

Areas for Improvement:

  • Transport, the main focus of the study, had a limited overall impact
  • Manhattan distance could be used to better approximate the actual distance travelled, rather than straight-line distance
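The difference between the two distance measures can be seen in a short Python sketch. It assumes projected planar coordinates in metres; the coordinates and function names are hypothetical.

```python
import math


def euclidean_distance(a, b):
    """Straight-line distance between two (x, y) points."""
    return math.hypot(a[0] - b[0], a[1] - b[1])


def manhattan_distance(a, b):
    """Distance when travel is restricted to a grid of perpendicular streets."""
    return abs(a[0] - b[0]) + abs(a[1] - b[1])


# Made-up positions of a flat and a transport stop, in metres.
flat = (0.0, 0.0)
stop = (300.0, 400.0)

straight_line = euclidean_distance(flat, stop)  # 500 m as the crow flies
grid_walk = manhattan_distance(flat, stop)      # 700 m along a street grid
```

Manhattan distance is never shorter than the straight-line distance, so it can be a closer proxy for walking distance in cities with a grid-like street layout.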


Approach
  • Geographically Weighted Regression (GWR)
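The core idea of GWR is to fit a separate weighted least-squares regression at each location, giving nearby observations more weight. The Python sketch below illustrates this with one predictor, a Gaussian kernel and a fixed bandwidth. It is a simplified illustration and not the project's R code: the function names, the bandwidth value and the data are all assumptions.

```python
import math


def gaussian_weight(distance, bandwidth):
    """Gaussian kernel: the weight decays smoothly with distance."""
    return math.exp(-0.5 * (distance / bandwidth) ** 2)


def local_fit(samples, location, bandwidth):
    """Weighted least squares of price on one predictor at `location`.

    `samples` is a list of ((x, y), predictor, price) tuples.
    Returns the local (intercept, slope).
    """
    weights = [
        gaussian_weight(math.hypot(px - location[0], py - location[1]), bandwidth)
        for (px, py), _, _ in samples
    ]
    total = sum(weights)
    mean_x = sum(w * s[1] for w, s in zip(weights, samples)) / total
    mean_y = sum(w * s[2] for w, s in zip(weights, samples)) / total
    sxx = sum(w * (s[1] - mean_x) ** 2 for w, s in zip(weights, samples))
    sxy = sum(w * (s[1] - mean_x) * (s[2] - mean_y) for w, s in zip(weights, samples))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


# Made-up data: price rises by 1000 per unit of area in the west cluster
# and by 2000 per unit of area in the east cluster.
samples = [
    ((0.0, 0.0), 50.0, 50_000.0), ((1.0, 1.0), 100.0, 100_000.0), ((2.0, 0.0), 150.0, 150_000.0),
    ((100.0, 0.0), 50.0, 100_000.0), ((101.0, 1.0), 100.0, 200_000.0), ((99.0, 0.0), 150.0, 300_000.0),
]
_, west_slope = local_fit(samples, (0.0, 0.0), bandwidth=10.0)
_, east_slope = local_fit(samples, (100.0, 0.0), bandwidth=10.0)
```

A single global regression over these samples would report one averaged slope, whereas the local fits recover a different price-per-area relationship in each cluster.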


Project Prototype



Tools & Technology
GeoEstate tech stack.png
Project Timeline
GeoEstate timeline.jpg
Challenges


No. | Key Challenges | Mitigation
1. | Unfamiliarity with R, its packages and R Shiny | 1. Self-directed learning with online resources such as DataCamp. 2. Browsing community forums (Stack Overflow / discuss.onemap) for help. 3. Consulting the official documentation for the various packages
2. | Limited number of OneMap API calls for a standard account | 1. Creation of an R script that catches timeouts and waits. 2. Querying OneMap only for distinct records, to reduce the number of duplicate requests
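The two mitigations for the API limit can be sketched as follows. The project's script is written in R; this Python version only illustrates the pattern. The `geocode` callable, the `RateLimitError` exception and the retry parameters are hypothetical and do not reflect the actual OneMap client.

```python
import time


class RateLimitError(Exception):
    """Hypothetical error raised by a geocoder on a timeout or quota limit."""


def geocode_unique(addresses, geocode, max_retries=3, wait_seconds=1.0, sleep=time.sleep):
    """Geocode each distinct address once, waiting and retrying on failure.

    Returns one result per input address, in the original order.
    Addresses that still fail after `max_retries` attempts map to None.
    """
    results = {}
    for address in dict.fromkeys(addresses):  # distinct addresses, order preserved
        for attempt in range(max_retries):
            try:
                results[address] = geocode(address)
                break
            except RateLimitError:
                sleep(wait_seconds * (attempt + 1))  # wait longer after each failure
        else:
            results[address] = None  # gave up after max_retries attempts
    return [results[address] for address in addresses]
```

Passing the geocoder in as an argument keeps the retry logic independent of any particular API, and the `sleep` parameter lets tests skip the real waiting time.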