GeoEstate PROPOSAL
Project Description |
Our project aims to provide an easy way for end users to predict the resale prices of apartments, condominiums and executive condominiums, using inputs such as the postal code, floor area and type of apartment. To achieve this, we use three regression models: the geographically weighted regression model, the spatial autocorrelation regression model and the multiple linear regression model. Users can read about the methodology behind each model and select the one that they feel best fits their situation, or simply check which model fits the data points best by comparing the R-squared values.
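As a rough illustration of comparing models by R-squared, the sketch below computes the coefficient of determination for two sets of fitted values. The prices and fitted values are made-up placeholders, not results from our models, and the names `r_squared`, `fitted` and `best_model` are hypothetical.

```python
def r_squared(actual, predicted):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_actual = sum(actual) / len(actual)
    ss_res = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    ss_tot = sum((a - mean_actual) ** 2 for a in actual)
    return 1.0 - ss_res / ss_tot


# Hypothetical resale prices (S$ thousands) and hypothetical fitted values.
actual = [950.0, 1200.0, 780.0, 1500.0, 1100.0]
fitted = {
    "MLR": [1000.0, 1150.0, 850.0, 1400.0, 1130.0],
    "GWR": [960.0, 1190.0, 800.0, 1480.0, 1100.0],
}

# Score each model and pick the one with the highest R-squared.
scores = {name: r_squared(actual, preds) for name, preds in fitted.items()}
best_model = max(scores, key=scores.get)
print(scores, best_model)
```

A higher R-squared only means a closer fit to the data already observed, so it is best read alongside each model's methodology rather than on its own.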
Furthermore, we aim to support extensive exploratory data analysis, allowing users to see how variables in the property market, such as the age and floor area (in square feet) of a property, correlate with resale prices. Through this, we aim to educate interested consumers and real estate agents alike on what truly matters when determining the resale price of a property.
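To give a sense of what such a correlation check might look like, here is a minimal sketch that computes the Pearson correlation coefficient between a variable and the resale price. All figures are invented for illustration, and the names `pearson`, `age_years`, `area_sqft` and `price_k` are hypothetical.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (spread_x * spread_y)


# Hypothetical sample of five properties (not real transaction data).
age_years = [2, 5, 10, 15, 25]
area_sqft = [700, 1100, 900, 1300, 1000]
price_k = [1300, 1500, 1100, 1400, 900]  # resale price in S$ thousands

print("age vs price:", pearson(age_years, price_k))
print("area vs price:", pearson(area_sqft, price_k))
```

A value near +1 or -1 suggests a strong linear relationship, while a value near 0 suggests little linear association; it does not by itself show that one variable drives the other.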
Project Motivation |
How do you know if you are getting a reasonable price for your apartment? Real estate agents have vested interests, so for people who want to be educated consumers, taking an agent's word for the price of a property may not be enough. Today, websites like PropertyGuru appear to give some indication of which prices are competitive. However, this can be misleading, as listings are only a snapshot in time.
What if you were able to predict the price of the property you want to sell, or conversely, the dream property you wish to purchase, using masses of data accumulated over past years?
Through our application, we aim to educate consumers on the value of property using rigorous statistical methods.
Storyboard |
Data sources |
Data | Source | Data Type/Method |
---|---|---|
2014 Master Plan Planning Subzone (Web) | Data.gov.sg | SHP |
URA Private Residential Property Transactions | Ura.gov.sg | CSV |
Pre-School Locations | Data.gov.sg | KML Converted to Shapefile |
Primary/Secondary School Locations | Data.gov.sg | CSV Data was geocoded using OneMap API |
MRT/LRT Station Locations | LTA Datamall (Direct Download) | SHP |
Supermarket Locations | Data.gov.sg | KML Converted to Shapefile |
Shopping Mall Locations | Wikipedia | Text Data was converted to Shapefile after geocoding using OneMap API |
Park Locations | Data.gov.sg | KML Converted to Shapefile |
Sports Facilities Locations | Data.gov.sg | KML Converted to Shapefile |
Hawker Centre Locations | Public Food Centres | KML Converted to Shapefile |
Data Transformation |
Literature Review |
1. A Spatial Analysis of House Prices in the Kingdom of Fife, Scotland
(By: Julia Zmölnig, Melanie N Tomintz, A Stewart Fotheringham)
Aim of Study: to analyse the spatial variations in house price adjustments due to economic conditions, and to quantify and describe patterns in the variations of house prices in the study area of Fife, Scotland
Methodology:
Spatial Interpolation Technique: using points with known values to estimate values at other, unknown points. Three main methods were used:
- Diffusion Interpolation with Boundaries
- Inverse-distance weighting
- Deterministic ordinary Kriging (Most accurate)
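As an illustration of the interpolation idea, the following is a minimal sketch of inverse-distance weighting, the simplest of the three methods listed. It is not the implementation used in the paper; the function name `idw_estimate`, the `power` parameter default and the sample points are assumptions chosen for illustration.

```python
import math


def idw_estimate(known_points, target, power=2.0):
    """Estimate the value at `target` from (location, value) pairs.

    Each known point is weighted by 1 / distance**power, so nearer
    points influence the estimate more than distant ones.
    """
    weighted_sum = 0.0
    weight_total = 0.0
    for (x, y), value in known_points:
        distance = math.hypot(x - target[0], y - target[1])
        if distance == 0.0:
            # The target coincides with a known point: return its value.
            return value
        weight = 1.0 / distance ** power
        weighted_sum += weight * value
        weight_total += weight
    return weighted_sum / weight_total


# Hypothetical house prices at two known locations (arbitrary units).
known = [((0.0, 0.0), 100.0), ((2.0, 0.0), 200.0)]
print(idw_estimate(known, (1.0, 0.0)))  # midpoint between the two points
print(idw_estimate(known, (0.5, 0.0)))  # closer to the first point
```

A larger `power` makes the estimate depend more heavily on the nearest known points.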
Learning Points:
- House price hot spots migrate from year to year, so multiple models are required if the study spans multiple years
- In this study, the economic downturn led to an increase in property prices, despite more supply from unemployed people
Areas for Improvement:
- The data lacked information such as the size and type of each property; although this could be approximated via interpolation, it still hurts the accuracy of the model
- A different model, such as Geographically Weighted Regression (GWR), could be used to identify spatial patterns in the study area
2. Statistical analysis of the relationship between public transport accessibility and flat prices in Riga
(By: Dmitry Pavlyuk)
Aim of Study: to examine the relationship between public transport accessibility and residential land value in Riga, Latvia
Methodology:
- Geographically Weighted Regression (GWR)
- Global Regression Model
Learning Points:
- Within the city centre, accessibility has no significant relationship with flat prices, as the city centre is already rich in transport routes and new routes have a diminishing impact
- For the higher-income population, higher public transport accessibility may lead to lower property prices
- Overall, GWR performed significantly better than global regression
- Variables that have no significant relationship in one model might be significant in another. For example, the influence of the first floor on the price was insignificant in the global regression model, but appeared as a local dependency in GWR.
Areas for Improvement:
- Transport accessibility, the main focus of the study, had a limited overall impact on prices
- Manhattan distance could be used to approximate the actual distance travelled, rather than straight-line distance
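The difference between the two distance measures can be sketched as follows. The coordinates are hypothetical and assumed to be in a projected system measured in metres; the function names are illustrative only.

```python
import math


def euclidean_distance(a, b):
    """Straight-line distance between two (x, y) points."""
    return math.hypot(a[0] - b[0], a[1] - b[1])


def manhattan_distance(a, b):
    """Distance travelled along axis-aligned streets between two points."""
    return abs(a[0] - b[0]) + abs(a[1] - b[1])


# Hypothetical flat and station locations, in metres.
flat = (0.0, 0.0)
station = (300.0, 400.0)

print(euclidean_distance(flat, station))  # 500.0
print(manhattan_distance(flat, station))  # 700.0
```

Manhattan distance is never shorter than the straight-line distance, and it is itself only an approximation of real travel distance, which depends on the actual street network.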
Approach |
- Geographically Weighted Regression (GWR)
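To illustrate the core idea behind GWR, the sketch below fits a separate weighted regression at each location of interest, giving nearby observations more weight through a Gaussian kernel. It is a simplified sketch with a single predictor and a fixed bandwidth, not our planned implementation; the data, the `bandwidth` value and the function names are assumptions for illustration.

```python
import math


def gaussian_weight(distance, bandwidth):
    """Gaussian kernel: weight decays smoothly with distance."""
    return math.exp(-0.5 * (distance / bandwidth) ** 2)


def local_fit(observations, location, bandwidth):
    """Weighted least squares of price on area, centred on `location`.

    observations: list of ((east, north), area, price).
    Returns the local (intercept, slope).
    """
    weights = [
        gaussian_weight(math.hypot(e - location[0], n - location[1]), bandwidth)
        for (e, n), _, _ in observations
    ]
    total = sum(weights)
    area_mean = sum(w * a for w, (_, a, _) in zip(weights, observations)) / total
    price_mean = sum(w * p for w, (_, _, p) in zip(weights, observations)) / total
    sxy = sum(
        w * (a - area_mean) * (p - price_mean)
        for w, (_, a, p) in zip(weights, observations)
    )
    sxx = sum(w * (a - area_mean) ** 2 for w, (_, a, _) in zip(weights, observations))
    slope = sxy / sxx
    return price_mean - slope * area_mean, slope


# Hypothetical data: price rises by 1 per unit area in the west cluster
# and by 2 per unit area in the east cluster (coordinates in metres).
observations = [
    ((0.0, 0.0), 60.0, 160.0),
    ((100.0, 0.0), 80.0, 180.0),
    ((0.0, 100.0), 100.0, 200.0),
    ((10000.0, 0.0), 60.0, 320.0),
    ((10100.0, 0.0), 80.0, 360.0),
    ((10000.0, 100.0), 100.0, 400.0),
]

west = local_fit(observations, (0.0, 0.0), bandwidth=1000.0)
east = local_fit(observations, (10000.0, 0.0), bandwidth=1000.0)
print("west (intercept, slope):", west)
print("east (intercept, slope):", east)
```

A global regression would report one slope for the whole dataset, whereas the local fits recover a different slope in each cluster. In practice, the bandwidth would be chosen from the data and several predictors would be included.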
Project Prototype |
Tools & Technology |
Project Timeline |
Challenges |
No. | Key Challenges | Mitigation |
---|---|---|
1. | Unfamiliarity with R, its packages and R Shiny | |
2. | Limited number of OneMap API calls for a standard account | |