Group21 Research Paper

From Visual Analytics and Applications

Revision as of 23:20, 13 August 2018

                 MOVE TO WHAT MOVES YOU!




Data Cleaning, Preparation and Modelling

The data for 12 months (June 2017 to March 2018), covering the resale prices, nature and characteristics of HDB flats, is taken for analysis.
Checking for Missing Values: Excel was used to check the dataset for missing-value patterns. The check found no missing values in the dataset.
Adding New Columns: The year, month and quarter fields were extracted from the “Month” column of the dataset. Using data from the Singapore government website, the Planning Region of each area was added to the dataset.
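The two preparation steps above can also be sketched in R. This is only an illustration, not the method the team used (the missing-value check was done in Excel); the toy data frame, its column names and the "YYYY-MM" format of the Month column are assumptions.

```r
# Toy stand-in for the resale dataset. The "YYYY-MM" format of the
# Month column and the resale_price column name are assumptions.
resale <- data.frame(
  Month        = c("2017-06", "2017-11", "2018-03"),
  resale_price = c(350000, 420000, 388000),
  stringsAsFactors = FALSE
)

# Count missing values per column (an R counterpart of the Excel check).
missing_per_column <- colSums(is.na(resale))

# Split "YYYY-MM" into numeric year and month, then derive the quarter.
resale$year    <- as.integer(substr(resale$Month, 1, 4))
resale$month   <- as.integer(substr(resale$Month, 6, 7))
resale$quarter <- (resale$month - 1L) %/% 3L + 1L
```

A column total of zero in missing_per_column corresponds to the "no missing values" result reported above.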
Preparing the Geolocation Data:
1. The CONCATENATE function in Excel was used to derive the address of each block in the dataset. The block number and street name were concatenated, and “Singapore” was appended to form the full address of each block.
2. Since the postal code was not available in the dataset, the coordinates were derived from the address of each block. The geocode function in the “ggmap” package in R was used to derive the latitude and longitude of each block from its address. The API could only process 2500 entries at a time, so the dataset was split into monthly subsets and the code was run on each subset.

Geocode.JPG








3. After running the code, several rows had “NA” values in the latitude and longitude columns. These rows were extracted into a separate Excel sheet and the code was rerun. After multiple runs, some “NA” values remained; the latitude and longitude for these blocks were derived manually.
4. For a few blocks, the generated latitude and longitude values were reversed. These values were swapped manually to obtain the correct coordinates for each block.
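The geolocation steps above could be sketched in R roughly as follows. The block numbers, street names and coordinates are made up, the helper fix_reversed is hypothetical, and the batching and swap logic is one possible automation of what the team did in Excel and by hand. The actual geocoding call is left commented out because it needs the ggmap package and API access.

```r
# Illustrative blocks only; block numbers and street names are made up.
blocks <- data.frame(
  block       = c("101", "22", "5"),
  street_name = c("ANG MO KIO AVE 3", "BEDOK NORTH RD", "MARINE TER"),
  stringsAsFactors = FALSE
)

# Step 1: full address = block number + street name + "Singapore".
blocks$address <- paste(blocks$block, blocks$street_name, "Singapore")

# Step 2: split the addresses into batches of at most 2500 entries.
batch_size <- 2500
batch_id   <- ceiling(seq_along(blocks$address) / batch_size)
batches    <- split(blocks$address, batch_id)

# Each batch would then be geocoded, for example:
# coords <- ggmap::geocode(batches[[1]], output = "latlon")

# Step 4 (hypothetical helper): a latitude can never exceed 90 degrees,
# so a larger value signals that latitude and longitude were reversed.
fix_reversed <- function(df) {
  swapped <- !is.na(df$lat) & !is.na(df$lon) & abs(df$lat) > 90
  tmp <- df$lat[swapped]
  df$lat[swapped] <- df$lon[swapped]
  df$lon[swapped] <- tmp
  df
}

# Made-up geocoding output: row 2 is reversed, row 3 failed (NA).
coords <- data.frame(lat = c(1.37, 103.93, NA), lon = c(103.85, 1.33, NA))
fixed  <- fix_reversed(coords)

# Step 3: rows still missing coordinates need a rerun or manual lookup.
needs_manual <- which(is.na(fixed$lat) | is.na(fixed$lon))
```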

Visualizations

The dashboard is laid out in 3 separate tabs.

Dash.JPG


The dashboard presents the visualizations at the following three levels:
Level 1: Planning Region
Level 2: Towns
Level 3: Streets

The dashboard shows a market overview, summary statistics at the Planning Region, Town and Street levels, and a detailed analysis of HDB transactions.
1. Geofacet Plot: The Geofacet plot shows the resale prices of HDB flats by town, following the new town planning. There are 26 towns in total, and each town occupies one facet. Each facet is placed according to the town's location on the Singapore map and shows the month-wise trend of resale prices. From this plot, the user can examine the fluctuations in resale prices within a town and compare them with those of neighboring towns. The geofacet and ggplot2 packages in R were used to create this plot. The visualization lets the user observe resale price trends while preserving the original geographic topology as closely as possible. From the Geofacet plot, we observe that Bukit Timah, Central Area and Marine Parade show the highest volatility in prices.

Geofacet.JPG
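A minimal sketch of how such a plot might be assembled is shown below. The transactions, the two-town grid and its row/column positions are invented for illustration (a real grid would place all 26 towns), and the use of the median as the monthly summary is an assumption. The plotting part only runs when ggplot2 and geofacet are installed.

```r
# Illustrative transactions; prices are made up for the sketch.
resale <- data.frame(
  town  = c("BEDOK", "BEDOK", "BEDOK", "WOODLANDS", "WOODLANDS", "WOODLANDS"),
  month = c("2017-06", "2017-06", "2017-07", "2017-06", "2017-07", "2017-07"),
  resale_price = c(400000, 420000, 410000, 350000, 360000, 340000),
  stringsAsFactors = FALSE
)

# Median resale price per town and month: the series drawn in each facet.
trend <- aggregate(resale_price ~ town + month, data = resale, FUN = median)

# Hypothetical custom grid: one row per town, with the row/col cell that
# mimics the town's position on the map.
sg_grid <- data.frame(
  name = c("WOODLANDS", "BEDOK"),
  code = c("WOODLANDS", "BEDOK"),
  row  = c(1, 2),
  col  = c(1, 2),
  stringsAsFactors = FALSE
)

# Build the plot object only if both packages are available.
if (requireNamespace("ggplot2", quietly = TRUE) &&
    requireNamespace("geofacet", quietly = TRUE)) {
  p <- ggplot2::ggplot(trend,
                       ggplot2::aes(x = month, y = resale_price, group = 1)) +
    ggplot2::geom_line() +
    geofacet::facet_geo(~ town, grid = sg_grid)
}
```

Printing p would render one panel per town, laid out according to sg_grid.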












2. Interactive Tree Map: The interactive treemap allows the user to view resale prices and floor area across different levels, and so to examine resale price trends in depth. The user starts at the Planning Region level and can then drill down to the Town level and the Street level. The visualization also has a data table that summarizes the values of important parameters. The size of each treemap tile is based on the number of transactions, and its color is based on the resale price. This visualization is designed to let users drill down and compare resale price trends within a level, i.e., compare towns with other towns, or streets with other streets within a town. The visualization was built using the d3treeR package in R. Other packages such as DT and dplyr were used to render the data table and add interactivity to the treemap.


Treemapz.JPG
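The aggregation behind such a treemap might look like the sketch below. The rows and column names are invented, base R is used for the summary in place of dplyr, and the median is an assumed price summary. The treemap step only runs when the treemap and d3treeR packages are installed.

```r
# Illustrative transactions; regions, towns, streets and prices are made up.
resale <- data.frame(
  region = c("NORTH", "NORTH", "NORTH", "EAST", "EAST"),
  town   = c("WOODLANDS", "WOODLANDS", "YISHUN", "BEDOK", "BEDOK"),
  street = c("WOODLANDS ST 13", "WOODLANDS ST 13", "YISHUN AVE 5",
             "BEDOK NORTH RD", "BEDOK STH AVE 1"),
  resale_price = c(350000, 370000, 390000, 410000, 430000),
  stringsAsFactors = FALSE
)

# One row per street: median price (tile color) and transaction count (tile size).
keys        <- resale[c("region", "town", "street")]
summary_tbl <- aggregate(resale["resale_price"], by = keys, FUN = median)
names(summary_tbl)[4] <- "median_price"
counts      <- aggregate(resale["resale_price"], by = keys, FUN = length)
summary_tbl$transactions <- counts$resale_price

# Build the drill-down treemap only if both packages are available.
if (requireNamespace("treemap", quietly = TRUE) &&
    requireNamespace("d3treeR", quietly = TRUE)) {
  tm <- treemap::treemap(summary_tbl,
                         index  = c("region", "town", "street"),
                         vSize  = "transactions",
                         vColor = "median_price",
                         type   = "value")
  widget <- d3treeR::d3tree2(tm, rootname = "Singapore")
}
```

The order of the index vector sets the drill-down path: region first, then town, then street.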










3. Detailed Analysis of HDB Blocks: Based on the user's input, this tab renders a box plot and a leaflet map. The user can select a Region, Town, flat model and flat type. The box plot shows Resale Price by Flat Type and supports basic statistical analysis: the user can read off the minimum and maximum resale prices and identify outliers in the dataset. The leaflet map shows the locations of the HDB blocks that match the user's input, and clicking a block shows its transaction details. The visualization was created using the leaflet, ggplot2, DT, plotly, dplyr, viridisLite and other packages in R. It gives the user a transaction-level summary of HDB blocks for the selected inputs, which the user can then use to judge which HDB flat could be a good investment based on the price trends.

Detailedanalysise.JPG
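The filtering and summary behind this tab could be sketched as follows. The transactions, coordinates and column names are invented, and the box-plot statistics are computed with base R rather than the ggplot2 and plotly code of the application. The map step only runs when the leaflet package is installed.

```r
# Illustrative transactions; prices and coordinates are made up.
resale <- data.frame(
  town       = c(rep("WOODLANDS", 6), "BEDOK"),
  flat_model = c(rep("Maisonette", 6), "Improved"),
  flat_type  = c(rep("EXECUTIVE", 6), "3 ROOM"),
  resale_price = c(600000, 610000, 620000, 630000, 640000, 900000, 300000),
  lat = c(1.431, 1.432, 1.433, 1.434, 1.435, 1.436, 1.324),
  lon = c(103.781, 103.782, 103.783, 103.784, 103.785, 103.786, 103.930),
  stringsAsFactors = FALSE
)

# Apply the user's selections (hard-coded here in place of Shiny inputs).
sel <- resale[resale$town == "WOODLANDS" &
              resale$flat_model == "Maisonette" &
              resale$flat_type == "EXECUTIVE", ]

# The numbers a box plot displays: whisker ends, hinges, median, outliers.
stats       <- boxplot.stats(sel$resale_price)
price_range <- range(sel$resale_price)
outliers    <- stats$out

# Build the map only if leaflet is available; popups show the price on click.
if (requireNamespace("leaflet", quietly = TRUE)) {
  m <- leaflet::leaflet(data = sel)
  m <- leaflet::addTiles(m)
  m <- leaflet::addMarkers(m, lng = ~lon, lat = ~lat,
                           popup = ~paste("Resale price:", resale_price))
}
```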










Future Work

Given time constraints, the current application only shows the locations of HDB flats based on the user input and the resale prices of the flats at different levels. However, we believe the dashboard can be enhanced to include the following:
1. Price Estimator: Using regression models and forecasting methods, the dashboard could offer a price estimator that shows users future price trends and the estimated future price of an HDB flat based on the current market trend.
2. User Interface with More Options: The data could be extended to include amenities and other key locations such as MRT stations, schools and shopping malls, so that users can select their desired HDB flat based on proximity to these locations.
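As a rough illustration of the price estimator idea, a linear regression could be fitted as below. This is not part of the current application: the data is made up, and the choice of predictors (floor area and a running month index) is an assumption about what such a model might use.

```r
# Made-up history: floor area, a running month index, and resale price.
history <- data.frame(
  floor_area_sqm = c(65, 90, 110, 70, 95, 120),
  month_index    = c(1, 2, 3, 4, 5, 6),
  resale_price   = c(300000, 400000, 480000, 330000, 430000, 530000)
)

# Ordinary least squares: price as a linear function of area and time.
fit <- lm(resale_price ~ floor_area_sqm + month_index, data = history)

# Estimate the price of a hypothetical 100 sqm flat in month 8.
new_flat  <- data.frame(floor_area_sqm = 100, month_index = 8)
estimated <- predict(fit, newdata = new_flat)
```

A real estimator would need far more data and validation, and probably additional predictors such as town, flat type and remaining lease.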

Demonstration: Sample Test Cases

This section provides demonstrative examples of how the application is used. Some important use cases are as follows:
1. Detailed Analysis of HDB Resale Transaction
The first use case demonstrates the resale price trend in Woodlands town for the Maisonette flat model and Executive flat type. The user can see the minimum and maximum resale prices and the locations of the HDB blocks with matching transactions. The second use case demonstrates the resale price trend in Woodlands town for the Improved flat model and 3-room flat type.


Detailedanalysis.JPG

Detailedanalysis2.JPG


Installation Guide
1. The GRIT application for exploring HDB resale price trends is available online.
Click here to explore the live application.


2. Installation Process - System Requirements
Because of the packages used in this application, the minimum required version of RStudio is 'Version 1.0.143'.
Click here to download the latest version of RStudio.


3. Deployment Process
You can host the R Shiny application on your own server. A free server to host this application is provided by shinyapps.io. Steps to deploy the application on shinyapps.io are as follows:

  • Visit Shiny Apps and sign up for a free account, which allows you to host up to 5 applications.
  • After you sign up, Shiny Apps provides a personal token and secret number, which are used to deploy the application to your server.
  • In RStudio, execute the following code snippet:

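# Install devtools, then the deployment packages from GitHub
install.packages('devtools')
devtools::install_github('rstudio/shinyapps')
devtools::install_github('rstudio/rsconnect')

# Load the deployment packages
library(shinyapps)
library(rsconnect)

# Register your Shiny Apps account using the name, token and secret from sign-up
rsconnect::setAccountInfo(name="Your account name", token="Your Personal Token", secret="Your Secret Number")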


Click the Publish icon in RStudio to upload and deploy the application to the Shiny Apps server.
After you enter an appropriate name for your application, the GRIT application will be deployed and hosted on your Shiny Apps server.
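As an alternative to the Publish icon, deployment can also be triggered from the R console. The wrapper below is a hypothetical helper, not part of the GRIT code; the default application name and the example path are assumptions. It relies on rsconnect::deployApp and requires that the account has already been registered with setAccountInfo.

```r
# Hypothetical helper: deploy the app in `app_dir` under the name `app_name`.
deploy_grit <- function(app_dir, app_name = "GRIT") {
  if (!requireNamespace("rsconnect", quietly = TRUE)) {
    stop("Package 'rsconnect' is required for deployment.")
  }
  # Uploads the application files and starts the app on the registered account.
  rsconnect::deployApp(appDir = app_dir, appName = app_name)
}

# Example call (not run here, since it needs a registered account):
# deploy_grit("path/to/grit-app")
```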