Group02 proposal v2

From Visual Analytics for Business Intelligence
Revision as of 16:06, 5 March 2020 by Mingyu.chua.2017
Rain & Shine.png


PROBLEM & MOTIVATION

Problem
When purchasing or renting a property, buyers weigh many factors before making a final decision. The primary concern is the price of the property [1]. However, our group identified secondary concerns, such as the weather and the available amenities, that also influence the final decision. Few tools help property buyers identify the areas that best suit their needs and preferences. The tools currently available each address only one category of concern and fall short when several categories need to be visualized together.

For example, different people prefer different weather. Some prefer sunny weather, while others would rather have it rain all the time. Currently, there are tools for finding weather information and tools for finding property price information, but none that offer both at the same time.

Motivation
Through our visualization, the group hopes to help users easily evaluate their dream home on aspects beyond price alone. Our visualization will combine property price, weather and amenities data to show the accessibility of an area and help users judge the real value of a property. With our visualization dashboard, users will be able to identify the properties they truly desire.

OBJECTIVES

We aim to provide an interactive visualization dashboard that helps property seekers identify the housing area that best suits their needs, with information such as:

  1. Insights on the weather (rainfall, temperature and wind speed) of each postal area.
  2. Insights on the available amenities within the proximity of the selected geo boundary.
  3. Insights on the distribution of commercial and residential prices for each postal area over the past few years.

Target Group:

  • Property buyers with weather preferences
  • Commercial buyers who want to open a shop

DATASET

The datasets we will use for our analysis and application are listed below:

Data/Source | Variables/Description | Rationale of Usage

Temperature and Rainfall Data
(Jan 2012 - Dec 2019)

(http://www.weather.gov.sg/climate-historical-daily/)

  • Stations
  • Date
  • Daily Rainfall
  • Highest 30-min/60-min/120-min Rainfall (mm)
  • Mean/Minimum/Maximum Temperature (°C)
  • Mean/Max Wind Speed (km/h)

This dataset covers a good time series of Singapore's weather from 2012 to 2019 across different weather categories. Our team wishes to spot the trends or patterns in Singapore's climate in every town, if possible.

Amenities Location Data

(https://api.data.gov.sg/v1/environment/rainfall) (https://api.data.gov.sg/v1/environment/air-temperature)

  • Station ID
  • Station Name
  • Latitude
  • Longitude

The dataset will be used to anchor the amenities available within a specified range of the selected property.

Commercial Data and Residential Data
(Jan 2012 - Dec 2019)

(https://spring-ura-gov-sg.libproxy.smu.edu.sg/lad/ore/property_market/index.cfm)

  • Project Name
  • Address
  • No. of Units
  • Area (sqm)
  • Type of Area
  • Transacted Price ($)
  • Nett Price ($)
  • Unit Price ($ psm)
  • Unit Price ($ psf)
  • Sale Date
  • Property Type
  • Tenure
  • Completion Date
  • Type of Sale
  • Purchaser Address Indicator
  • Postal District
  • Postal Sector
  • Postal Code
  • Planning Region
  • Planning Area

This dataset covers a good time series from 2012 to 2019, with a breakdown by subzone/planning area and postal code that can be visualized on a map. The transaction data will be used together with the weather data to explore potential relationships. We will also use the amenities data to gauge the accessibility of a property and help users judge whether the property is worth its price.
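The "specified range" lookup described for the amenities data could be sketched as a simple distance filter. This is only an illustration under stated assumptions: the function names, the `(name, latitude, longitude)` tuple layout and the use of the haversine great-circle formula are ours, not something the proposal specifies.

```python
import math

EARTH_RADIUS_KM = 6371.0  # mean Earth radius, a common approximation


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometres between two (lat, lon) points in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def amenities_within(property_latlon, amenities, radius_km):
    """Return the names of amenities within radius_km of the property.

    amenities is an iterable of (name, latitude, longitude) tuples
    (a hypothetical layout chosen for this sketch).
    """
    lat, lon = property_latlon
    return [
        name
        for name, a_lat, a_lon in amenities
        if haversine_km(lat, lon, a_lat, a_lon) <= radius_km
    ]
```

For instance, calling `amenities_within((1.3000, 103.8000), points, 1.0)` with made-up coordinates would keep only the points that lie within one kilometre of the property.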

BACKGROUND SURVEY OF RELATED WORK

Below are a few visualizations and charts we considered for our project.

Visual Considerations | Insights / Comments

Title: Qualitative Thematic Map
ThematicMap.png

Source: https://mapdesign.icaci.org/2014/12/mapcarte-353365-life-in-los-angeles-by-eugene-turner-1977/

One of the items that we looked at is this qualitative thematic map that was covered in class.

From our initial brainstorming, we intend to look at the various factors a buyer considers, giving the buyer a high-level overview of the different areas and whether each fits the criteria he or she chooses. We will adapt ideas from this map by letting users select multiple factors. Then, based on which criteria the properties in each subzone meet, we can choose different shapes and colours to represent the zone.



Title: Whisker plot of temperature
Temperature whisker.png

Source: https://www.ck12.org/statistics/box-and-whisker-plots/rwa/The-Ways-of-Weather/

We are able to see the temperature for the selected area over the course of a year. The whisker plots show the upper and lower bounds of the temperature, and we can observe that the temperature gradually rises to a peak from January to August before decreasing until December.

We hope to apply this chart to display the rainfall for a selected area over the course of a year. This lets buyers understand the rainfall pattern in the area and judge whether the area suits their preferences.


Title: Heatmap of rainfall
Heatmap of rainfall.png

Source: https://www.shanelynn.ie/analysis-of-weather-data-using-pandas-python-and-seaborn/

This is a heatmap of daily rainfall. Darker colours of red represent heavier rainfall.

This is another way to visualize rainfall patterns. It lets us quickly see how many days in a year it rains in a selected subzone. A potential buyer interested in a property that sees more sunlight will favour a subzone where the graph looks brighter. Conversely, a buyer interested in a property that is always rainy will favour an area where the graph looks darker.


Title: Spatial Interpolation
Property heatmap.png

Source: https://www.srx.com.sg/heat-map/

This graph shows the property prices in Singapore for different property types. The user can choose to select different property types, and the graph will update to show only the selected property type. A variety of colours are chosen here to display different levels of prices.

Our problem has two key aspects: prices and weather. Our group could use this kind of visualization to display either prices or weather across all subzones in Singapore. By charting prices or the various weather measures over a map of Singapore, users can quickly see how each criterion they choose varies across Singapore.
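To make the spatial interpolation idea concrete, here is a minimal sketch of one common method, inverse distance weighting (IDW), which estimates a value at an unmeasured location from nearby station readings. IDW is our choice for illustration; the proposal does not say which interpolation method will be used. The function name, the planar `(x, y, value)` station layout and the default `power` are assumptions.

```python
def idw_estimate(target, stations, power=2.0):
    """Inverse-distance-weighted estimate at target = (x, y).

    stations is a list of (x, y, value) tuples in the same planar
    coordinate system as target (an assumption of this sketch).
    """
    if not stations:
        raise ValueError("at least one station is required")
    weighted_sum = 0.0
    weight_total = 0.0
    for x, y, value in stations:
        dist_sq = (x - target[0]) ** 2 + (y - target[1]) ** 2
        if dist_sq == 0.0:
            return value  # target sits exactly on a station
        weight = 1.0 / dist_sq ** (power / 2.0)  # equals 1 / distance**power
        weighted_sum += weight * value
        weight_total += weight
    return weighted_sum / weight_total
```

Closer stations dominate the estimate, and a larger `power` makes that effect stronger. Evaluating the function at the centre of each subzone would give one value per subzone to colour on the map.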



KEY TECHNICAL CHALLENGES & MITIGATION

No. | Challenge | Description | Mitigation Plan
1.
Software Challenge

Unfamiliarity with visualisation tools such as R, R Shiny and Tableau.

  • GitHub learning
  • Stack Overflow research
  • Self-directed and peer learning
  • Watch video tutorials on YouTube
  • Hands-on practice using training platforms such as DataCamp
2.
Programming Challenge

Inexperience with data cleaning and transformation using R.

  • Trial and error
  • Read online articles and forums for guidance
  • Watch video tutorials on how to fully utilise functions and packages such as lapply, tidyr and dplyr
3.
Workload Constraint

Time and workload constraints.

  • Design a reasonable project timeline based on everyone's ability and capacity.
  • Set milestones and adjust the timeline based on the team's progress.
4.
Dataset Complexity

We have data from multiple sources in multiple formats, so we foresee a huge challenge in standardizing the data.

  • Note: Our current dataset covers 49 areas over 8 years, with 12 months of data per year. This gives a total of 4,704 CSV files (49 × 8 × 12) to consolidate and clean for the weather data alone.
  • Make use of data preparation tools such as Tableau Prep
  • Make use of our database management skills to normalize all data tables into third normal form
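The file count in the note above is simply areas × years × months, and the consolidation step could be sketched roughly as below. This is an illustration only: the function names are hypothetical, and the sketch assumes that every monthly file shares the same header row and is readable as UTF-8, which may not hold for the real downloads.

```python
import csv


def expected_file_count(n_areas, n_years, months_per_year=12):
    """Number of monthly CSV files: one per area, per year, per month."""
    return n_areas * n_years * months_per_year


def consolidate_csvs(paths, out_path, encoding="utf-8"):
    """Append rows from many same-schema CSV files into one output file.

    Writes the header once and returns the number of data rows written.
    Raises ValueError if a file's header differs from the first header seen.
    """
    header = None
    rows_written = 0
    with open(out_path, "w", newline="", encoding=encoding) as out:
        writer = csv.writer(out)
        for path in paths:
            with open(path, newline="", encoding=encoding) as src:
                reader = csv.reader(src)
                file_header = next(reader, None)
                if file_header is None:
                    continue  # skip empty files
                if header is None:
                    header = file_header
                    writer.writerow(header)
                elif file_header != header:
                    raise ValueError(f"Unexpected header in {path}")
                for row in reader:
                    writer.writerow(row)
                    rows_written += 1
    return rows_written
```

Failing loudly on a mismatched header is a deliberate choice here: with thousands of files, a silent schema drift would be hard to spot after merging.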

STORYBOARD

Dashboards | Description

Dashboard 1: Qualitative Thematic Map of Singapore property
VA1.jpg

Our group plans to build an interactive Qualitative Thematic Map that shows different faces based on the year, price, weather and amenities filters adjusted by the user. This chart presents the data at a high level so that users can identify at a glance which areas meet their needs. Users can also view the map by postal area or zone.

Filters used include:

  • Sliders
  1. Year
  2. Transacted Price
  3. Amenities
  • Single Dropdown List
  1. Weather
  2. Map level of detail
  • Multiple Dropdown List
  1. Property Type


Based on the user’s filter settings, the map will reflect three different data types:

  • Weather
When the postal area/zone’s selected weather measure is above Singapore’s median for the selected year, the chart will show a circular face. If it is equal to or below Singapore’s median for the selected year, the chart will show a circular face with devil horns.
  • Pricing
When the postal area/zone has houses whose prices over the selected year meet the user’s filter requirements, the chart will show a smiling face. If the prices are lower than the user’s requirements, the chart will show a blank face. If the prices are higher than the user’s requirements, the chart will show a sad face.
  • Amenities
When the number of amenities in the postal area/zone in the selected year meets the user’s filter requirements, the faces will be shown in green. Otherwise, the faces will be shown in blue.


With this visualisation, users will be able to identify and shortlist the area(s) that best meet their requirements.
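The three face rules above can be read as a small decision function, sketched below. The sketch makes assumptions the storyboard leaves open: each area's pricing is summarised as a single number, the price filter is a `[price_min, price_max]` range, and the amenities requirement is a minimum count. All names and labels are hypothetical.

```python
def face_glyph(weather_value, weather_median, price, price_min, price_max,
               amenity_count, amenity_min):
    """Map one area's data to the face shape, expression and colour described above."""
    # Weather: above the Singapore median -> plain circle; at or below -> horns.
    shape = "circle" if weather_value > weather_median else "circle_with_horns"

    # Pricing: within the user's range -> smile; below -> blank; above -> sad.
    if price < price_min:
        expression = "blank"
    elif price > price_max:
        expression = "sad"
    else:
        expression = "smile"

    # Amenities: requirement met -> green; otherwise -> blue.
    colour = "green" if amenity_count >= amenity_min else "blue"

    return {"shape": shape, "expression": expression, "colour": colour}
```

Keeping the three encodings independent means each visual channel (shape, expression, colour) answers exactly one question for the user.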


Dashboard 2: Property prices in Postal Areas (based on the user’s selected areas in Dashboard 1)
VA2.jpg

The purpose of this chart is to show a detailed breakdown of the properties that meet the user’s requirements, based on his/her filters and the area(s) shortlisted in the chart from Dashboard 1.


The chart shows every property, together with its price, that falls within the filters’ range and the shortlisted area(s).


Filters used include:

  • Sliders
  1. Year
  2. Transaction Price
  3. Amenities
  • Single Dropdown List
  1. Property Type
  2. Sorted By


The chart will help users better understand property prices in their shortlisted area(s) and decide which area’s property to purchase.


X-Axis: Transacted Properties’ Name
Y-Axis: Transacted Pricing


This chart will be shown together with the chart in Dashboard 3 to help the buyer make the best-informed decisions.


Dashboard 3: Distribution of Rain Precipitation Amount/Temperature/Wind Speed in Postal Areas (based on the user’s selected areas in Dashboard 1)
VA3.jpg

The purpose of this chart is to show a detailed breakdown of the weather (Rain Precipitation Amount/Temperature/Wind Speed) for the user’s shortlisted area(s) from the chart in Dashboard 1.


Filters used include:

  • Sliders
  1. Year
  2. Month
  • Single Dropdown List
  1. Level of Detail


The user can adjust the filters to identify weather patterns or trends in the shortlisted areas. This chart helps users identify which area best suits their preferences and needs.


X-Axis: Level of Detail (Sub-zone/Postal Area/Zone)
Y-Axis: Rain Precipitation Amount/ Temperature/ Wind Speed


This chart will be shown together with the chart on Dashboard 2 to help the buyer make the best-informed decisions.


Dashboard 4: Comparing Rainfall (selected weather) and the median pricing of All Properties (selected property type)
VA4.jpg

The chart in this storyboard combines two thematic maps: a bar chart map and a spatial interpolation map. Like the chart in Dashboard 1, it shows data at a high level so that users can identify at a glance which areas meet their needs. However, this chart is more straightforward, as the results it shows are more clear-cut, though less detailed.


In this chart, the user will be able to compare:

  • The weather of the user’s choice in each zone/postal area.
  • Median pricing of their selected property type in each zone/postal area.
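The median pricing per zone/postal area could be computed along the following lines. This is a sketch under assumptions: transactions are represented as dictionaries, and the keys `property_type`, `planning_area` and `unit_price_psm` are hypothetical names loosely based on the dataset's column list.

```python
from collections import defaultdict
from statistics import median


def median_price_by_area(transactions, property_type):
    """Return {area: median unit price} for one property type.

    transactions is an iterable of dicts with the (hypothetical) keys
    'property_type', 'planning_area' and 'unit_price_psm'.
    """
    prices = defaultdict(list)
    for t in transactions:
        if t["property_type"] == property_type:
            prices[t["planning_area"]].append(t["unit_price_psm"])
    return {area: median(values) for area, values in prices.items()}
```

The median is less sensitive than the mean to a handful of unusually expensive transactions, which is one reason it suits a per-area price summary.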


Filters used include:

  • Sliders
  1. Year
  • Single Dropdown List
  1. Level of Detail
  2. Weather
  3. Property Type


With this visualization, users will be able to compare all the areas in Singapore and make an informed decision. It gives users a different perspective from the chart in Dashboard 1, where detailed comparison across the whole of Singapore is limited, and helps them shortlist the area they wish to zoom into.

MILESTONES

Photo 2020-03-01 16-22-33.jpg
