Difference between revisions of "1718t1is428T11"

From Visual Analytics for Business Intelligence
 
(24 intermediate revisions by 3 users not shown)
Line 1: Line 1:
 +
<div style="font-family:Garamond;font-size:42px">
 +
<center>Data Tsunami</center>
 +
</div>
 +
<!--Logo-->
 +
[[File:Data tsunami logo.png|center|180px]]
 +
<!--/Logo-->
 +
<br/>
 
<!--Header-->
 
<!--Header-->
 
{|style="background-color:#1D1D1D; color:#F5F5F5; padding: 10 0 10 0;" width="100%" cellspacing="0" cellpadding="0" valign="top" border="0"  |
 
{|style="background-color:#1D1D1D; color:#F5F5F5; padding: 10 0 10 0;" width="100%" cellspacing="0" cellpadding="0" valign="top" border="0"  |
| style="padding:0.2em; font-size:100%; background-color:#07264C; text-align:center; color:#F5F5F5" width="10%" |  
+
| style="padding:0.2em; font-size:100%; background-color:#1D1D1D; text-align:center; color:#F5F5F5" width="10%" |
[[1718t1is428T11 |<font color="#F5F5F5" size=2 face="Garamond"><b>PROPOSAL</b></font>]]
+
[[Project_Groups |<font color="#F5F5F5" size=2 face="Garamond"><b>HOME</b></font>]]
 +
 
 +
| style="background:none;" width="1%" | &nbsp;
 +
| style="padding:0.2em; font-size:100%; background-color:#07264C;  border-bottom:0px solid #3D9DD7; text-align:center; color:#F5F5F5" width="10%" |  
 +
[[1718t1is428T11|<font color="#F5F5F5" size=2 face="Garamond"><b>PROPOSAL</b></font>]]
  
 
| style="background:none;" width="1%" | &nbsp;
 
| style="background:none;" width="1%" | &nbsp;
Line 22: Line 33:
  
 
==<div style="background: #07264C; padding: 15px; line-height: 0.3em; text-indent: 15px; font-size:18px; font-family:Garamond"><font color= #FFFFFF>Introduction</font></div>==
 
==<div style="background: #07264C; padding: 15px; line-height: 0.3em; text-indent: 15px; font-size:18px; font-family:Garamond"><font color= #FFFFFF>Introduction</font></div>==
[[File:Ebay logo.PNG|350px|center]]
+
[[File:Sgcarmart logo.jpg|450px|center]]
 
<div style="font-family:Garamond;font-size:16px">
 
<div style="font-family:Garamond;font-size:16px">
According to Forbes, the automobile industry has grown by a massive 68% since hitting a trough during the 2009 global financial crises according to a report published by car auction company Manheim earlier this year.
+
sgCarMart is one of Singapore's biggest online car resale marketplaces. It facilitates the resale of cars between buyers and sellers.
  
Q3 2016 closed with 9.8M vehicles sold in the used car market -an increase of 3.3% over the previous year. Also the average retail used vehicle sold for $19.232 in Q3 2016, an increase of 4.3% over last year. Changes in car buying behavior are beginning to alter the landscape of franchised used vehicles.
+
According to Forbes, demand for used cars has risen sharply, and the used car market has grown by up to 68% since 2009. This has led to major changes in car buying behavior, and marketplaces like sgCarMart are among the key platforms paving the way for the growth of the used car industry. We therefore set out to understand this market and its dynamics by crawling data from the sgCarMart website.
  
So both franchised used car firms and other giant online marketplaces like E-Bay are leveraging the growth rate of used car industy. As a result, we tried to understand this market and its dynamics with the help of ‘Used Car Database’ from E-Bay.
 
 
</div>
 
</div>
  
 
==<div style="background: #07264C; padding: 15px; line-height: 0.3em; text-indent: 15px; font-size:18px; font-family:Garamond"><font color= #FFFFFF>Problem and Motivation</font></div>==
 
==<div style="background: #07264C; padding: 15px; line-height: 0.3em; text-indent: 15px; font-size:18px; font-family:Garamond"><font color= #FFFFFF>Problem and Motivation</font></div>==
 
<div style="font-family:Garamond;font-size:16px">
 
<div style="font-family:Garamond;font-size:16px">
With the <b>change in consumer car buying behaviour</b> and a <b>rising market for used cars</b>, our aim is to understand this growing used car market. When consumers look at used cars, price is the most important factor that influences opinions, but there are also few other facts that affect their purchasing decision. So we will try to find out <b>which variables affect the price most</b> and how they are correlated.
+
Prices of new cars can be <b>too expensive</b> for price-sensitive individuals <b>to afford</b>. Through the used car market, however, they can still afford the convenience of owning a car, and for budget-conscious buyers a used car can be a great way to save money. Owners of existing cars, on the other hand, stand to benefit from a successful sale. Hence, understanding the used car market is useful for individuals looking to buy or sell an existing car.
 +
 
 +
In addition, with <b>changing consumer car buying behavior</b> and a <b>rising market for used cars</b>, our aim is to <b>understand</b> this <b>growing used car market</b> to enable <b>better decision making</b> for the different stakeholders involved. When consumers look at used cars, price is usually one of the most important factors influencing the buying decision. We would therefore also like to explore <b>which other variables affect the price most</b> and how they are correlated.
 +
 
 
</div>
 
</div>
  
Line 40: Line 53:
 
In this project, we aim to create a visualisation application that helps users do the following:
 
In this project, we aim to create a visualisation application that helps users do the following:
  
1. Visualise course information such as prices, popularity, class availability across all terms and bidding window<br/>
+
1. Visualise resale car prices against other factors such as:<br/>
2. View trend of bidding prices for interested course<br/>
+
*Type of cars
3. Uncover the demand and interest levels of each course based on the following variables
+
*Car Brands
*Semester (Term 1,2)
+
*Car age
*Class schedule
+
*Car Engine
*Professor
+
*COE Registered Date / COE Time Remaining
*Number of classes available
+
*Mileage
  
 +
2. Identify relationships and correlations across different factors affecting resale prices<br/>
 +
 +
3. Uncover the top 10 most common brands for car resale
 +
*Difference in prices & quantity sold across different brands
 
</div>
 
</div>
  
 
==<div style="background: #07264C; padding: 15px; line-height: 0.3em; text-indent: 15px; font-size:18px; font-family:Garamond"><font color= #FFFFFF>Data</font></div>==
 
==<div style="background: #07264C; padding: 15px; line-height: 0.3em; text-indent: 15px; font-size:18px; font-family:Garamond"><font color= #FFFFFF>Data</font></div>==
 
<div style="font-family:Garamond;font-size:16px">
 
<div style="font-family:Garamond;font-size:16px">
Data used is from “OASIS Boss Bidding History Results”. There are 21 excel files from 6 academic years. Every excel file contains 17 columns: term, session, bidding window, course code, description, section, vacancy, opening vacancy, before process vacancy, D.I.C.E, after process vacancy, enrolled students, max bid, median bid, min bid, instructor and school. <br/>
+
The data was crawled from the used car listings on the sgCarMart website (http://www.sgcarmart.com/used_cars/listing.php).
[[Image: boss-data.png |800px|center]]
+
 
 +
The dataset contains 28028 records with the following columns. <br/>
 +
 
 +
{| class="wikitable" style="background-color:#ffffff;" width="100%"
 +
|-
 +
! style="font-weight: bold;background: #002060;color:#fbfcfd;width: 30%;" | Attribute
 +
! style="font-weight: bold;background: #002060;color:#fbfcfd;width: 50%;" | Description
 +
|-
 +
|Brand|| brand of the car
 +
|-
 +
|Model||car model
 +
|-
 +
|Price||the price on the ad to sell the car
 +
|-
 +
|Depreciation||average depreciation value of the car per year
 +
|-
 +
|Registration Date||date of the COE when the car is registered
 +
|-
 +
|Eng||engine capacity of the car in cc
 +
|-
 +
|Mile||distance (km) that the car has been driven
 +
|-
 +
|Type||vehicle type
 +
|-
 +
|Status||if the car is still available for sale
 +
|-
 +
|Post date||date the advertisement was posted
 +
|-
 +
|Tags||tags associated with the advertisement
 +
|-
 +
|URL||the URL from which the record was crawled
 +
|-
 +
|}
 +
 
  
 
</div>
 
</div>
Line 63: Line 113:
 
! style="font-weight: bold;background: #002060;color:#fbfcfd;" | Explanation
 
! style="font-weight: bold;background: #002060;color:#fbfcfd;" | Explanation
 
|-
 
|-
| [[Image: pc.png |500px|center]] ||  
+
| [[Image: bubble.png |500px|center]] ||  
'''Parallel Coordinates'''
+
'''Bubble Chart'''
* Parallel coordinates is a great way to visualize high-dimensional data.
+
*This figure allows us to visualize vehicle types and sales at the same time. This chart is very interactive as well. Readers can group/color the data points by “major brand”, “origin”,  “truck/car” and “gainers/losers”.  
* In our case, we would like to use opening vacancy, before process vacancy, instructor, school, term, etc. as dimensions and to see how these variables affect the bid price.
+
* It provides comprehensive insights about vehicle sales with a straightforward visualization.
* https://bl.ocks.org/jasondavies/1341281
+
* https://www.bloomberg.com/graphics/2015-auto-sales/  
  
 
|-
 
|-
|  [[Image: sp.png |500px|center]]  ||  
+
|  [[Image: scatter.png |500px|center]]  ||  
 
'''Scatter Plot'''
 
'''Scatter Plot'''
* Scatter plot can help readers to see the relationship between two variables.
+
* A scatter plot helps readers understand the relationship between two variables. This visualization shows how age and odometer reading are correlated with car price.
* In our case, we would like to see which variables are highly correlated to bidding price so that we could have a better understanding of the bidding behaviors.
+
* In our case, we would like to investigate which factors contribute to the price of the used cars.
* http://bl.ocks.org/bunkat/2595950
+
*http://csidsocialmedia.github.io/2014/05/02/Predict-second-hand-car-price-using-artificial-neural-network.html
 
|-
 
|-
| [[Image: p2.jpg |500px|center]]  ||  
+
| [[Image: zq-line.png |500px|center]]  ||  
'''Stock price movement'''
+
'''Line Graph'''
* It gives an overview of how the stock price changes over the time.  
+
* This line graph is useful for displaying the price changes of the used cars over time.
* In our case, we would like to see how the bidding price change in each window in each round. It would help us understand popular trends in bidding activities.
+
* Although this graph is not interactive, it helps the readers to detect patterns or trends.  
* http://active-analytics.com/blog/plottinglivechartswithyahoofinancedataandggplot2inr/
+
* Line graphs are also good for comparison. In our case, we could use a line graph to compare the price changes of different car models and discover useful insights.
 +
* http://www.zerohedge.com/news/2017-05-21/perfect-storm-hits-used-car-values-foundation-auto-industry-faltering
 
|}
 
|}
  
Line 100: Line 151:
 
{| class="wikitable" style="background-color:#FFFFFF;" width="100%"
 
{| class="wikitable" style="background-color:#FFFFFF;" width="100%"
 
|-
 
|-
! style="font-weight: bold;background: #536a87;color:#fbfcfd;width: 50%;" | Technical Challenges
+
! style="font-weight: bold;background: #002060;color:#fbfcfd;width: 50%;" | Technical Challenges
! style="font-weight: bold;background: #536a87;color:#fbfcfd;" | Action Plan
+
! style="font-weight: bold;background: #002060;color:#fbfcfd;" | Action Plan
 
|-
 
|-
 
| <center> Data Preparation </center> ||  
 
| <center> Data Preparation </center> ||  
*Data collection: work together to export data from BOSS.
+
*Work on data cleaning and transforming.
*Data cleaning: work together to clean and analyse the data.
 
 
|-
 
|-
 
| <center> Unfamiliarity in Programming Language like Javascript & Libraries like D3 </center> ||  
 
| <center> Unfamiliarity in Programming Language like Javascript & Libraries like D3 </center> ||  
Line 125: Line 175:
  
 
==<div style="background: #07264C; padding: 15px; line-height: 0.3em; text-indent: 15px; font-size:18px; font-family:Garamond"><font color= #FFFFFF>References</font></div>==
 
==<div style="background: #07264C; padding: 15px; line-height: 0.3em; text-indent: 15px; font-size:18px; font-family:Garamond"><font color= #FFFFFF>References</font></div>==
*Boss Bidding: https://oasis.smu.edu.sg/Pages/RO/All-About-BOSS.aspx
+
*Bloomberg - Scientific Proof that Americans are Completely Addicted to Trucks: https://www.bloomberg.com/graphics/2015-auto-sales/
*Parallel Coordinates: https://bl.ocks.org/jasondavies/1341281
+
*Predict second hand car price using artificial neural network: http://csidsocialmedia.github.io/2014/05/02/Predict-second-hand-car-price-using-artificial-neural-network.html
*Simple Scatter Chart Example: http://bl.ocks.org/bunkat/2595950
+
*The Perfect Storm Hits Used-Car Values: The Foundation Of The Auto Industry Is Faltering: http://www.zerohedge.com/news/2017-05-21/perfect-storm-hits-used-car-values-foundation-auto-industry-faltering
*Stock Price: http://active-analytics.com/blog/plottinglivechartswithyahoofinancedataandggplot2inr/
+
*Kaggle - Used car database: https://www.kaggle.com/orgesleka/used-cars-database
 +
*Kaggle - Data Crunchers: https://www.kaggle.com/timucinanuslu/data-crunchers
 
*D3.js: https://d3js.org/
 
*D3.js: https://d3js.org/
 
*Chart.js: http://www.chartjs.org/
 
*Chart.js: http://www.chartjs.org/
  
 
<!--/Content-->
 
<!--/Content-->

Latest revision as of 20:35, 8 November 2017

Data Tsunami
Introduction

sgCarMart is one of Singapore's biggest online car resale marketplaces. It facilitates the resale of cars between buyers and sellers.

According to Forbes, demand for used cars has risen sharply, and the used car market has grown by up to 68% since 2009. This has led to major changes in car buying behavior, and marketplaces like sgCarMart are among the key platforms paving the way for the growth of the used car industry. We therefore set out to understand this market and its dynamics by crawling data from the sgCarMart website.

Problem and Motivation

Prices of new cars can be too expensive for price-sensitive individuals to afford. Through the used car market, however, they can still afford the convenience of owning a car, and for budget-conscious buyers a used car can be a great way to save money. Owners of existing cars, on the other hand, stand to benefit from a successful sale. Hence, understanding the used car market is useful for individuals looking to buy or sell an existing car.

In addition, with changing consumer car buying behavior and a rising market for used cars, our aim is to understand this growing used car market to enable better decision making for the different stakeholders involved. When consumers look at used cars, price is usually one of the most important factors influencing the buying decision. We would therefore also like to explore which other variables affect the price most and how they are correlated.

Objective

In this project, we aim to create a visualisation application that helps users do the following:

1. Visualise resale car prices against other factors such as:

  • Type of cars
  • Car Brands
  • Car age
  • Car Engine
  • COE Registered Date / COE Time Remaining
  • Mileage

2. Identify relationships and correlations across different factors affecting resale prices

3. Uncover the top 10 most common brands for car resale

  • Difference in prices & quantity sold across different brands
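
Objectives 2 and 3 can be sketched in a few lines of Python. The snippet below is only an illustration under stated assumptions, not the project's implementation: the `listings` rows are made-up sample data, and the field names `brand`, `price`, `mileage_km` and `age_years` are hypothetical stand-ins for the crawled columns.

```python
from collections import Counter
from math import sqrt

# Made-up sample rows for illustration only; field names are hypothetical.
listings = [
    {"brand": "Toyota", "price": 52000, "mileage_km": 80000, "age_years": 5},
    {"brand": "Toyota", "price": 38000, "mileage_km": 120000, "age_years": 7},
    {"brand": "Honda", "price": 61000, "mileage_km": 60000, "age_years": 4},
    {"brand": "BMW", "price": 98000, "mileage_km": 40000, "age_years": 3},
    {"brand": "Honda", "price": 45000, "mileage_km": 100000, "age_years": 6},
]


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / sqrt(var_x * var_y)


def top_brands(rows, n=10):
    """Return the n most frequently listed brands with their listing counts."""
    return Counter(row["brand"] for row in rows).most_common(n)


def mean_price_by_brand(rows):
    """Average asking price per brand."""
    totals = {}
    for row in rows:
        total, count = totals.get(row["brand"], (0, 0))
        totals[row["brand"]] = (total + row["price"], count + 1)
    return {brand: total / count for brand, (total, count) in totals.items()}


if __name__ == "__main__":
    prices = [row["price"] for row in listings]
    for factor in ("mileage_km", "age_years"):
        values = [row[factor] for row in listings]
        print(factor, round(pearson(prices, values), 3))
    print(top_brands(listings))
    print(mean_price_by_brand(listings))
```

A coefficient near -1 or +1 would suggest a strong linear relationship between price and the factor, while a value near 0 would suggest little linear relationship. Pearson correlation only captures linear effects, so the scatter plots remain useful for spotting other patterns.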

Data

The data was crawled from the used car listings on the sgCarMart website (http://www.sgcarmart.com/used_cars/listing.php).

The dataset contains 28028 records with the following columns.

{| class="wikitable" width="100%"
|-
! Attribute
! Description
|-
|Brand||brand of the car
|-
|Model||car model
|-
|Price||the price on the ad to sell the car
|-
|Depreciation||average depreciation value of the car per year
|-
|Registration Date||date of the COE when the car is registered
|-
|Eng||engine capacity of the car in cc
|-
|Mile||distance (km) that the car has been driven
|-
|Type||vehicle type
|-
|Status||whether the car is still available for sale
|-
|Post date||date the advertisement was posted
|-
|Tags||tags associated with the advertisement
|-
|URL||the URL from which the record was crawled
|}
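
The "Car age" and "COE Time Remaining" factors listed in the objectives are not columns in the crawled data, so they would have to be derived from the registration date. The sketch below shows one possible way to do that while cleaning a record. Several things in it are assumptions rather than facts from the dataset: the raw string formats (`"$68,800"`, `"1,598 cc"`, `"85,000 km"`, ISO `YYYY-MM-DD` dates), the output field names, and the simplification that every COE runs for exactly 10 years from the registration date with renewals ignored.

```python
from datetime import date

# Assumption: a standard 10-year COE from the registration date; renewals ignored.
COE_VALIDITY_YEARS = 10


def parse_number(text):
    """Keep only the digits, e.g. '$68,800' -> 68800. Assumes whole numbers."""
    digits = "".join(ch for ch in text if ch.isdigit())
    return int(digits) if digits else None


def add_years(d, years):
    """Shift a date by whole years, mapping 29 Feb to 28 Feb when needed."""
    try:
        return d.replace(year=d.year + years)
    except ValueError:
        return d.replace(year=d.year + years, day=28)


def clean_listing(raw, today):
    """Turn one raw crawled record (hypothetical formats) into typed fields."""
    registered = date.fromisoformat(raw["Registration Date"])
    expiry = add_years(registered, COE_VALIDITY_YEARS)
    return {
        "brand": raw["Brand"].strip(),
        "price": parse_number(raw["Price"]),
        "engine_cc": parse_number(raw["Eng"]),
        "mileage_km": parse_number(raw["Mile"]),
        "age_years": round((today - registered).days / 365.25, 1),
        "coe_days_left": max((expiry - today).days, 0),
    }


if __name__ == "__main__":
    raw = {
        "Brand": " Toyota ",
        "Price": "$68,800",
        "Registration Date": "2012-03-12",
        "Eng": "1,598 cc",
        "Mile": "85,000 km",
    }
    print(clean_listing(raw, today=date(2017, 11, 8)))
```

Passing `today` explicitly keeps the function deterministic, which makes it easier to check the derived values against a known crawl date.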


Research Visualisation

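{| class="wikitable" width="100%"
|-
! Visualizations
! Explanation
|-
| [[Image: bubble.png |500px|center]] ||
'''Bubble Chart'''
* This figure lets us visualize vehicle types and sales at the same time. The chart is also highly interactive: readers can group or color the data points by “major brand”, “origin”, “truck/car” and “gainers/losers”.
* It provides comprehensive insights about vehicle sales with a straightforward visualization.
* https://www.bloomberg.com/graphics/2015-auto-sales/
|-
| [[Image: scatter.png |500px|center]] ||
'''Scatter Plot'''
* A scatter plot helps readers understand the relationship between two variables. This visualization shows how age and odometer reading are correlated with car price.
* In our case, we would like to investigate which factors contribute to the price of used cars.
* http://csidsocialmedia.github.io/2014/05/02/Predict-second-hand-car-price-using-artificial-neural-network.html
|-
| [[Image: zq-line.png |500px|center]] ||
'''Line Graph'''
* This line graph is useful for displaying the price changes of used cars over time.
* Although this graph is not interactive, it helps readers detect patterns or trends.
* Line graphs are also good for comparison. In our case, we could use a line graph to compare the price changes of different car models and discover useful insights.
* http://www.zerohedge.com/news/2017-05-21/perfect-storm-hits-used-car-values-foundation-auto-industry-faltering
|}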

Tools

 -Excel

 -D3

 -JavaScript

 -GitHub

 -Tableau

Technical Challenges

{| class="wikitable" width="100%"
|-
! Technical Challenges
! Action Plan
|-
| Data Preparation ||
* Work on data cleaning and transformation.
|-
| Unfamiliarity with programming languages like JavaScript and libraries like D3 ||
* Initial hands-on experience during the D3.js workshop.
* Independent learning of JavaScript and D3.js.
* Peer learning and sharing of skills.
|-
| Unfamiliarity with implementing an interactive visualisation app ||
* Self-learning through online tutorials.
|}

Roles & Milestones

  • Project Roles

Dong Ruiyan: Visualisation Analyst
Zhang Qian: Visualisation Designer
Jeremy LEE Ting Kok: Project Manager

  • Project Timeline
Timeline.png

References