Difference between revisions of "1718t1is428T11"

From Visual Analytics for Business Intelligence
 
(53 intermediate revisions by 3 users not shown)
Line 1: Line 1:
 +
<div style="font-family:Garamond;font-size:42px">
 +
<center>Data Tsunami</center>
 +
</div>
 +
<!--Logo-->
 +
[[File:Data tsunami logo.png|center|180px]]
 +
<!--/Logo-->
 +
<br/>
 
<!--Header-->
 
<!--Header-->
 
{|style="background-color:#1D1D1D; color:#F5F5F5; padding: 10 0 10 0;" width="100%" cellspacing="0" cellpadding="0" valign="top" border="0"  |
 
{|style="background-color:#1D1D1D; color:#F5F5F5; padding: 10 0 10 0;" width="100%" cellspacing="0" cellpadding="0" valign="top" border="0"  |
| style="padding:0.2em; font-size:100%; background-color:#07264C; text-align:center; color:#F5F5F5" width="10%" |  
+
| style="padding:0.2em; font-size:100%; background-color:#1D1D1D; text-align:center; color:#F5F5F5" width="10%" |
[[1718t1is428T11 |<font color="#F5F5F5" size=2 face="Garamond"><b>PROPOSAL</b></font>]]
+
[[Project_Groups |<font color="#F5F5F5" size=2 face="Garamond"><b>HOME</b></font>]]
 +
 
 +
| style="background:none;" width="1%" | &nbsp;
 +
| style="padding:0.2em; font-size:100%; background-color:#07264C;  border-bottom:0px solid #3D9DD7; text-align:center; color:#F5F5F5" width="10%" |  
 +
[[1718t1is428T11|<font color="#F5F5F5" size=2 face="Garamond"><b>PROPOSAL</b></font>]]
  
 
| style="background:none;" width="1%" | &nbsp;
 
| style="background:none;" width="1%" | &nbsp;
Line 22: Line 33:
  
 
==<div style="background: #07264C; padding: 15px; line-height: 0.3em; text-indent: 15px; font-size:18px; font-family:Garamond"><font color= #FFFFFF>Introduction</font></div>==
 
==<div style="background: #07264C; padding: 15px; line-height: 0.3em; text-indent: 15px; font-size:18px; font-family:Garamond"><font color= #FFFFFF>Introduction</font></div>==
[[File:Boss bidding logo.PNG|350px|center]]
+
[[File:Sgcarmart logo.jpg|450px|center]]
 
<div style="font-family:Garamond;font-size:16px">
 
<div style="font-family:Garamond;font-size:16px">
BOSS (Bidding Online SyStem) is SMU's system for the registration of courses. BOSS is intended to empower students with the choice of selecting the courses/workshop they wish to enrol for in any one term. This choice is not necessarily an easy one. Based on available resources in the form of e-Dollars (e$) and e-Points (e-pt), students will need to make their choice by carefully considering the demand and supply of the courses/workshop as well as their academic study plan not just for the current term but also the future terms.
+
sgCarMart is one of Singapore's biggest online car resale marketplaces. It facilitates the resale of cars between buyers and sellers.
 +
 
 +
According to Forbes, demand for used cars has risen sharply, and the used car market has grown by up to 68% since 2009. This has changed car buying behavior, and marketplaces like sgCarMart are among the key platforms paving the way for the growth of the used car industry. We therefore set out to understand this market and its dynamics by crawling data from sgCarMart's website.
 +
 
 
</div>
 
</div>
  
 
==<div style="background: #07264C; padding: 15px; line-height: 0.3em; text-indent: 15px; font-size:18px; font-family:Garamond"><font color= #FFFFFF>Problem and Motivation</font></div>==
 
==<div style="background: #07264C; padding: 15px; line-height: 0.3em; text-indent: 15px; font-size:18px; font-family:Garamond"><font color= #FFFFFF>Problem and Motivation</font></div>==
 
<div style="font-family:Garamond;font-size:16px">
 
<div style="font-family:Garamond;font-size:16px">
With every bidding season, SMU students face <b>uncertainty over the amount of e-Dollars (e$) to bid for the courses that they are interested in taking the next semester</b>. This often arises from the lack of insights and visualisation tools that effectively show past trends and details of courses. Although past course bidding results are publicly available to students, it is difficult to make the right decisions without data aggregation and visualisation. Furthermore, it is <b>difficult to gauge the demand and supply of upcoming courses/workshops</b>. Hence there is a need for better decision support tools for student course bidding. We will create a visualisation application to help students and faculty understand the behaviour, interest and patterns of SMU courses.
+
New cars can be <b>too expensive</b> for price-sensitive individuals <b>to afford</b>. The used car market, however, puts the convenience of owning a car within reach, and for budget-conscious individuals buying a used car can be a great way to save money. Owners of existing cars, in turn, can benefit financially from a successful sale. Understanding the used car market is therefore useful for individuals looking to buy or sell an existing car.
 +
 
 +
In addition, with <b>changing consumer car buying behavior</b> and a <b>rising market for used cars</b>, our aim is to <b>understand</b> this <b>growing used car market</b> to enable <b>better decision making</b> for the different stakeholders involved. When consumers look at used cars, price is usually one of the most important factors influencing the buying decision. We would therefore also like to explore <b>which other variables affect the price most</b> and how they are correlated.
 +
 
 
</div>
 
</div>
  
Line 36: Line 53:
 
In this project, we aim to create a visualisation application that helps users perform the following:
 
In this project, we aim to create a visualisation application that helps users perform the following:
  
1. Visualise course information such as prices, popularity, class availability across all terms and bidding window<br/>
+
1. Visualise resale car prices against other factors such as:<br/>
2. View trend of bidding prices for interested course<br/>
+
*Type of cars
3. Uncover the demand and interest levels of each course based on the following variables
+
*Car Brands
*Semester (Term 1,2)
+
*Car age
*Class schedule
+
*Car Engine
*Professor
+
*COE Registered Date / COE Time Remaining
*Number of classes available
+
*Mileage
  
Such visualisations will help uncover bidding behaviour and interest in each course, which might prove useful for students, faculty, and course coordinators. The application will not only provide useful insights for students, it will also help faculty and course coordinators understand which courses are in high demand, allowing them to allocate resources more effectively.
+
2. Identify relationships and correlations across different factors affecting resale prices<br/>
  
 +
3. Uncover the top 10 most common brands for car resale
 +
*Difference in prices & quantity sold across different brands
 
</div>
 
</div>
  
 
==<div style="background: #07264C; padding: 15px; line-height: 0.3em; text-indent: 15px; font-size:18px; font-family:Garamond"><font color= #FFFFFF>Data</font></div>==
 
==<div style="background: #07264C; padding: 15px; line-height: 0.3em; text-indent: 15px; font-size:18px; font-family:Garamond"><font color= #FFFFFF>Data</font></div>==
 +
<div style="font-family:Garamond;font-size:16px">
 +
The data was obtained by crawling the used car listings on the sgCarMart website (http://www.sgcarmart.com/used_cars/listing.php).
 +
 +
The dataset contains 28,028 records with the following columns. <br/>
 +
 +
{| class="wikitable" style="background-color:#ffffff;" width="100%"
 +
|-
 +
! style="font-weight: bold;background: #002060;color:#fbfcfd;width: 30%;" | Attribute
 +
! style="font-weight: bold;background: #002060;color:#fbfcfd;width: 50%;" | Description
 +
|-
 +
|Brand|| brand of the car
 +
|-
 +
|Model||car model
 +
|-
 +
|Price||the price on the ad to sell the car
 +
|-
 +
|Depreciation||average depreciation value of the car per year
 +
|-
 +
|Registration Date||date of the COE when the car is registered
 +
|-
 +
|Eng||engine capacity in cc
 +
|-
 +
|Mile||distance (km) that the car has been driven
 +
|-
 +
|Type||vehicle type
 +
|-
 +
|Status||whether the car is still available for sale
 +
|-
 +
|Post date||date the advertisement was posted
 +
|-
 +
|Tags||tags associated with the advertisement
 +
|-
 +
|URL||URL of the page the data was crawled from
 +
|-
 +
|}
 +
 +
 +
</div>
  
 
==<div style="background: #07264C; padding: 15px; line-height: 0.3em; text-indent: 15px; font-size:18px; font-family:Garamond"><font color= #FFFFFF>Research Visualisation</font></div>==
 
==<div style="background: #07264C; padding: 15px; line-height: 0.3em; text-indent: 15px; font-size:18px; font-family:Garamond"><font color= #FFFFFF>Research Visualisation</font></div>==
 +
{| class="wikitable" style="background-color:#ffffff;" width="100%"
 +
|-
 +
! style="font-weight: bold;background: #002060;color:#fbfcfd;width: 50%;" | Visualizations
 +
! style="font-weight: bold;background: #002060;color:#fbfcfd;" | Explanation
 +
|-
 +
| [[Image: bubble.png |500px|center]] ||
 +
'''Bubble Chart'''
 +
*This figure allows us to visualize vehicle types and sales at the same time. This chart is very interactive as well. Readers can group/color the data points by “major brand”, “origin”,  “truck/car” and “gainers/losers”.
 +
* It provides comprehensive insights about vehicle sales with a straightforward visualization.
 +
* https://www.bloomberg.com/graphics/2015-auto-sales/
 +
 +
|-
 +
|  [[Image: scatter.png |500px|center]]  ||
 +
'''Scatter Plot'''
 +
* A scatter plot helps readers understand the relationship between two variables. This visualization shows how age and odometer reading are correlated with car price.
 +
* In our case, we would like to investigate which factors contribute to the price of the used cars.
 +
*http://csidsocialmedia.github.io/2014/05/02/Predict-second-hand-car-price-using-artificial-neural-network.html
 +
|-
 +
| [[Image: zq-line.png |500px|center]]  ||
 +
'''Line Graph'''
 +
* This line graph is useful for displaying the price changes of the used cars over time.
 +
* Although this graph is not interactive, it helps the readers to detect patterns or trends.
 +
* Line graphs are also good for comparison. In our case, we could use a line graph to compare the price changes of different car models to discover useful insights.
 +
* http://www.zerohedge.com/news/2017-05-21/perfect-storm-hits-used-car-values-foundation-auto-industry-faltering
 +
|}
  
 
==<div style="background: #07264C; padding: 15px; line-height: 0.3em; text-indent: 15px; font-size:18px; font-family:Garamond"><font color= #FFFFFF>Tools</font></div>==
 
==<div style="background: #07264C; padding: 15px; line-height: 0.3em; text-indent: 15px; font-size:18px; font-family:Garamond"><font color= #FFFFFF>Tools</font></div>==
 +
<div style="font-family:Garamond;font-size:16px">
 +
&emsp;-Excel
 +
 +
&emsp;-D3
 +
 +
&emsp;-JavaScript
 +
 +
&emsp;-GitHub
 +
 +
&emsp;-Tableau
 +
 +
</div>
  
 
==<div style="background: #07264C; padding: 15px; line-height: 0.3em; text-indent: 15px; font-size:18px; font-family:Garamond"><font color= #FFFFFF>Technical Challenges</font></div>==
 
==<div style="background: #07264C; padding: 15px; line-height: 0.3em; text-indent: 15px; font-size:18px; font-family:Garamond"><font color= #FFFFFF>Technical Challenges</font></div>==
 +
{| class="wikitable" style="background-color:#FFFFFF;" width="100%"
 +
|-
 +
! style="font-weight: bold;background: #002060;color:#fbfcfd;width: 50%;" | Technical Challenges
 +
! style="font-weight: bold;background: #002060;color:#fbfcfd;" | Action Plan
 +
|-
 +
| <center> Data Preparation </center> ||
 +
*Work on data cleaning and transforming.
 +
|-
 +
| <center> Unfamiliarity with Programming Languages like JavaScript & Libraries like D3 </center> ||
 +
* Initial hands-on experience during D3.js workshop.
 +
* Independent learning of JavaScript & D3.js.
 +
* Peer learning and sharing of skills.
 +
|-
 +
| <center> Unfamiliarity with Implementing an Interactive Visualisation App </center> ||
 +
* Self-learning through online tutorials.
 +
|}
  
 
==<div style="background: #07264C; padding: 15px; line-height: 0.3em; text-indent: 15px; font-size:18px; font-family:Garamond"><font color= #FFFFFF>Roles & Milestones </font></div>==
 
==<div style="background: #07264C; padding: 15px; line-height: 0.3em; text-indent: 15px; font-size:18px; font-family:Garamond"><font color= #FFFFFF>Roles & Milestones </font></div>==
 +
*Project Roles
 +
Dong Ruiyan: Visualisation Analyst
 +
<br>Zhang Qian: Visualisation Designer
 +
<br>Jeremy LEE Ting Kok: Project Manager
 +
*Project Timeline
 +
[[Image: Timeline.png |800px|center]]
  
 
==<div style="background: #07264C; padding: 15px; line-height: 0.3em; text-indent: 15px; font-size:18px; font-family:Garamond"><font color= #FFFFFF>References</font></div>==
 
==<div style="background: #07264C; padding: 15px; line-height: 0.3em; text-indent: 15px; font-size:18px; font-family:Garamond"><font color= #FFFFFF>References</font></div>==
 +
*Bloomberg - Scientific Proof that Americans are Completely Addicted to Trucks: https://www.bloomberg.com/graphics/2015-auto-sales/
 +
*Predict second hand car price using artificial neural network: http://csidsocialmedia.github.io/2014/05/02/Predict-second-hand-car-price-using-artificial-neural-network.html
 +
*The Perfect Storm Hits Used-Car Values: The Foundation Of The Auto Industry Is Faltering: http://www.zerohedge.com/news/2017-05-21/perfect-storm-hits-used-car-values-foundation-auto-industry-faltering
 +
*Kaggle - Used car database: https://www.kaggle.com/orgesleka/used-cars-database
 +
*Kaggle - Data Crunchers: https://www.kaggle.com/timucinanuslu/data-crunchers
 +
*D3.js: https://d3js.org/
 +
*Chart.js: http://www.chartjs.org/
  
 
<!--/Content-->
 
<!--/Content-->

Latest revision as of 20:35, 8 November 2017

Data Tsunami
Data tsunami logo.png



Introduction

Sgcarmart logo.jpg

sgCarMart is one of Singapore's biggest online car resale marketplaces. It facilitates the resale of cars between buyers and sellers.

According to Forbes, demand for used cars has risen sharply, and the used car market has grown by up to 68% since 2009. This has changed car buying behavior, and marketplaces like sgCarMart are among the key platforms paving the way for the growth of the used car industry. We therefore set out to understand this market and its dynamics by crawling data from sgCarMart's website.

Problem and Motivation

New cars can be too expensive for price-sensitive individuals to afford. The used car market, however, puts the convenience of owning a car within reach, and for budget-conscious individuals buying a used car can be a great way to save money. Owners of existing cars, in turn, can benefit financially from a successful sale. Understanding the used car market is therefore useful for individuals looking to buy or sell an existing car.

In addition, with changing consumer car buying behavior and a rising market for used cars, our aim is to understand this growing used car market to enable better decision making for the different stakeholders involved. When consumers look at used cars, price is usually one of the most important factors influencing the buying decision. We would therefore also like to explore which other variables affect the price most and how they are correlated.

Objective

In this project, we aim to create a visualisation application that helps users perform the following:

1. Visualise resale car prices against other factors such as:

  • Type of cars
  • Car Brands
  • Car age
  • Car Engine
  • COE Registered Date / COE Time Remaining
  • Mileage

2. Identify relationships and correlations across different factors affecting resale prices

3. Uncover the top 10 most common brands for car resale

  • Difference in prices & quantity sold across different brands
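Objectives 2 and 3 can be illustrated with a minimal sketch, assuming each listing has already been cleaned into a record with numeric fields. The field names `brand`, `price` and `mile` and the sample values below are hypothetical, chosen only to mirror the attributes in the Data section, and this is not the project's actual implementation.

```python
from collections import Counter, defaultdict
from math import sqrt
from statistics import mean

# Hypothetical sample listings; the values are made up for illustration.
listings = [
    {"brand": "Toyota", "price": 52000, "mile": 80000},
    {"brand": "Toyota", "price": 61000, "mile": 55000},
    {"brand": "Honda", "price": 48000, "mile": 90000},
    {"brand": "BMW", "price": 98000, "mile": 40000},
]


def top_brands(rows, n=10):
    """Return (brand, listing count, mean price) for the n most common brands."""
    counts = Counter(row["brand"] for row in rows)
    prices = defaultdict(list)
    for row in rows:
        prices[row["brand"]].append(row["price"])
    return [(brand, count, mean(prices[brand]))
            for brand, count in counts.most_common(n)]


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sqrt(sum((x - mx) ** 2 for x in xs))
    sy = sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)


brand_summary = top_brands(listings)
price_vs_mileage = pearson([r["price"] for r in listings],
                           [r["mile"] for r in listings])
```

The same `pearson` helper could be applied to any pair of numeric factors (for example price against car age) to decide which relationships are worth plotting.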

Data

The data was obtained by crawling the used car listings on the sgCarMart website (http://www.sgcarmart.com/used_cars/listing.php).

The dataset contains 28,028 records with the following columns.

Attribute           Description
Brand               brand of the car
Model               car model
Price               the price on the ad to sell the car
Depreciation        average depreciation value of the car per year
Registration Date   date of the COE when the car is registered
Eng                 engine capacity in cc
Mile                distance (km) that the car has been driven
Type                vehicle type
Status              whether the car is still available for sale
Post date           date the advertisement was posted
Tags                tags associated with the advertisement
URL                 URL of the page the data was crawled from
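Crawled fields usually arrive as display strings, so some cleaning is needed before they can be plotted. The sketch below shows one possible cleaning step. The raw formats it assumes (prices like "$45,800", mileage like "85,000 km", dates as day/month/year) are assumptions for illustration and may not match what the site actually serves, and the helper names are hypothetical.

```python
from datetime import date, datetime


def parse_number(text):
    """Extract the digits from a string such as '$45,800' or '85,000 km'.

    Returns None when the string has no digits (e.g. 'N.A.').
    Assumes whole numbers only; decimal values would need extra handling.
    """
    digits = "".join(ch for ch in text if ch.isdigit())
    return int(digits) if digits else None


def car_age_years(reg_text, today, fmt="%d/%m/%Y"):
    """Approximate car age in years from an assumed day/month/year date string."""
    registered = datetime.strptime(reg_text, fmt).date()
    return round((today - registered).days / 365.25, 1)


def clean_row(raw, today):
    """Convert one raw crawled row (a dict of strings) into typed fields."""
    return {
        "brand": raw["Brand"].strip(),
        "price": parse_number(raw["Price"]),
        "eng_cc": parse_number(raw["Eng"]),
        "mile_km": parse_number(raw["Mile"]),
        "age_years": car_age_years(raw["Registration Date"], today),
    }


# Hypothetical raw row; the values are made up for illustration.
raw_row = {
    "Brand": " Toyota ",
    "Price": "$45,800",
    "Eng": "1,598 cc",
    "Mile": "85,000 km",
    "Registration Date": "08/11/2007",
}
cleaned = clean_row(raw_row, today=date(2017, 11, 8))
```

Deriving car age at cleaning time keeps later charts simple, since age and remaining COE time can then be treated as ordinary numeric columns.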


Research Visualisation

Visualizations Explanation
Bubble.png

Bubble Chart

  • This figure allows us to visualize vehicle types and sales at the same time. This chart is very interactive as well. Readers can group/color the data points by “major brand”, “origin”, “truck/car” and “gainers/losers”.
  • It provides comprehensive insights about vehicle sales with a straightforward visualization.
  • https://www.bloomberg.com/graphics/2015-auto-sales/
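Scatter.png

Scatter Plot

  • A scatter plot helps readers understand the relationship between two variables. This visualization shows how age and odometer reading are correlated with car price.
  • In our case, we would like to investigate which factors contribute to the price of the used cars.
  • http://csidsocialmedia.github.io/2014/05/02/Predict-second-hand-car-price-using-artificial-neural-network.html
Zq-line.png

Line Graph

  • This line graph is useful for displaying the price changes of the used cars over time.
  • Although this graph is not interactive, it helps the readers to detect patterns or trends.
  • Line graphs are also good for comparison. In our case, we could use a line graph to compare the price changes of different car models to discover useful insights.
  • http://www.zerohedge.com/news/2017-05-21/perfect-storm-hits-used-car-values-foundation-auto-industry-faltering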

Tools

 -Excel

 -D3

 -JavaScript

 -GitHub

 -Tableau

Technical Challenges

Technical Challenges Action Plan
Data Preparation
  • Work on data cleaning and transforming.
Unfamiliarity with Programming Languages like JavaScript & Libraries like D3
  • Initial hands-on experience during D3.js workshop.
  • Independent learning of JavaScript & D3.js.
  • Peer learning and sharing of skills.
Unfamiliarity with Implementing an Interactive Visualisation App
  • Self-learning through online tutorials.

Roles & Milestones

  • Project Roles

Dong Ruiyan: Visualisation Analyst
Zhang Qian: Visualisation Designer
Jeremy LEE Ting Kok: Project Manager

  • Project Timeline
Timeline.png

References

  • Bloomberg - Scientific Proof that Americans are Completely Addicted to Trucks: https://www.bloomberg.com/graphics/2015-auto-sales/
  • Predict second hand car price using artificial neural network: http://csidsocialmedia.github.io/2014/05/02/Predict-second-hand-car-price-using-artificial-neural-network.html
  • The Perfect Storm Hits Used-Car Values: The Foundation Of The Auto Industry Is Faltering: http://www.zerohedge.com/news/2017-05-21/perfect-storm-hits-used-car-values-foundation-auto-industry-faltering
  • Kaggle - Used car database: https://www.kaggle.com/orgesleka/used-cars-database
  • Kaggle - Data Crunchers: https://www.kaggle.com/timucinanuslu/data-crunchers
  • D3.js: https://d3js.org/
  • Chart.js: http://www.chartjs.org/