Difference between revisions of "Cupid Minions"

From Visual Analytics for Business Intelligence
 
(20 intermediate revisions by 2 users not shown)
Line 4: Line 4:
  
 
{|style="padding: 5px 0 0 0;" width="100%" cellspacing="0" cellpadding="0" valign="top"|
 
{|style="padding: 5px 0 0 0;" width="100%" cellspacing="0" cellpadding="0" valign="top"|
| style="background-color:#465d66; text-align:center;" width="25%" |  
+
| style="background-color:#FF5778; text-align:center;" width="25%" |  
 
[[Team Cupid Minions | <font color="#ffffff" size=2><b>PROPOSAL</b></font>]]
 
[[Team Cupid Minions | <font color="#ffffff" size=2><b>PROPOSAL</b></font>]]
 
| style="background-color:#ffb7c5; text-align:center;" width="25%" |  
 
| style="background-color:#ffb7c5; text-align:center;" width="25%" |  
Line 11: Line 11:
 
[[Team Cupid Minions Application | <font color="#ffffff" size=2><b>APPLICATION</b></font>]]
 
[[Team Cupid Minions Application | <font color="#ffffff" size=2><b>APPLICATION</b></font>]]
 
| style="background-color:#ffb7c5; text-align:center;" width="25%" |  
 
| style="background-color:#ffb7c5; text-align:center;" width="25%" |  
[[Team Cupid Minions Research Paper | <font color="#ffffff" size=2><b>RESEARCH PAPER</b></font>]]
+
[[Team Cupid Minions Report | <font color="#ffffff" size=2><b>REPORT</b></font>]]
 
|}  
 
|}  
  
 
<br>
 
<br>
 
==Problem and Motivation==
 
==Problem and Motivation==
In the United Kingdom (UK), road traffic accidents resulted in 1,732 deaths in 2015, a 2% dip compared with 2014. Despite the drop in the number of deaths, casualties across all severities remained at an alarming 186,209. With increasing demand for the use of public roads, there is a strong need to prevent road traffic accidents and make public roads as safe as possible. To prevent such accidents, it is crucial to understand the different factors that contribute to them, so that this understanding can be used to stop road traffic accidents from occurring.
+
Low marriage rates and rising divorce rates have become prominent issues in the developed world. One cause is choosing the wrong marriage partner: some couples only discover after marriage that they do not suit each other. This contributes to social issues such as an increasing divorce rate and domestic violence. Choosing a suitable marriage partner is therefore a pressing need for young adults today. To increase the rate of successful matches and help people find partners they are satisfied with, we need to understand partner-seekers’ characteristics, demographics, habits and lifestyle information. From the insights in the data, we can suggest the factors that contribute to successful and rapid matching and, as a result, help partner-seekers.
 
<br />
 
<br />
  
 
==Objective==
 
==Objective==
The objective of the project is to:
+
Our project aims to explore the attributes that may contribute to people’s decisions during speed dating. This gives us a deeper understanding of dating preferences, which bear on a wide variety of social issues. Our areas of interest range from race relations to intergenerational mobility, both of which depend largely on outcomes in the marriage market. <br>
* Understand the demographics of drivers
+
 
*# Distribution of age of drivers.
+
The detailed questions guiding the data analysis in this project are:
* Understand the demographics of casualties
+
* How do external factors affect one’s choice of partner:
*# Distribution of age of casualties.
+
**e.g. income and current career
*# Distribution of severity of casualties.
+
* How do demographics affect one’s choice of partner:
*# Distribution of type of casualties.
+
** e.g. race
* Examine the underlying factors which contributes to accidents. The following are some factors, but not limited to:
+
* Which attributes are most and least desirable when choosing a partner of the opposite sex:
*# Temporal patterns: Accident records based on time.
+
** attractiveness
*# Weather conditions: Which type of weather conditions would cause more accidents?
+
** sincerity
*# Road conditions: Which type of road conditions would cause more accidents?
+
** intelligence
*# Location: Which city has the most accidents?
+
** fun
* Develop appropriate interactive visualisation to allow discovery of insights from multiple dimensions from the dataset.
+
** ambition
 +
** shared interests
  
 
==Data==
 
==Data==
In this project, our team will focus on 2015 road safety data. The data is obtained from data.gov.uk (https://data.gov.uk/dataset/road-accidents-safety-data). It contains only personal injury accidents on public roads that are reported to the police and recorded using the UK STATS19 accident reporting form. It consists of 3 datasets that provide information about the accidents, the types of vehicles involved and the demographics of the casualties. Most of the data attributes are coded, and they will be re-coded using the lookup tables provided by data.gov.uk.
+
Data used in this project was compiled by Columbia Business School professors Ray Fisman and Sheena Iyengar for their paper [http://faculty.chicagobooth.edu/emir.kamenica/documents/genderDifferences.pdf Gender Differences in Mate Selection: Evidence From a Speed Dating Experiment.] <br />
<br /><br />
+
 
The following data attributes are used in this project:
+
This dataset was originally collected from participants in experimental speed dating events between 2002 and 2004. Each participant was given a four-minute 'speed date' with a participant of the opposite sex. After each date, participants were asked to rate their partner on six aspects: <br />
* '''Accident Dataset''' ''[Accidents_2015.csv]''
+
* Attractiveness
*#Accident_Index ''[Accident No.]''
+
* Sincerity
*#Longitude
+
* Intelligence
*#Latitude
+
* Fun
*#Day_of_Week
+
* Ambition
*#Time
+
* Shared Interests
*#Local_Authority_(District) ''[City Name]''
+
 
*#Weather_Conditions
+
In addition, questionnaire data were collected from the participants throughout the process and recorded in the dataset. Questionnaire data that we may find relevant and useful include: <br />
*#Road_Surface_Conditions
+
* Demographics
* '''Vehicles Dataset''' ''[Vehicles_2015.csv]''
+
* Dating habits
*#Accident_Index ''[Accident No.]''
+
* Self-perception across key attributes
*#Vehicle_Reference ''[Vehicle No.]''
+
* Beliefs on what others find valuable in a mate
*#Age_of_Driver
+
* Lifestyle information
* '''Casualties Dataset''' ''[Casualties_2015.csv]''
 
*#Accident_Index ''[Accident No.]''
 
*#Vehicle_Reference ''[Vehicle No.]''
 
*#Casualty_Reference ''[Casualty No.]''
 
*#Casualty_Class ''[Driver/Rider, Passenger or Pedestrian]''
 
*#Age_of_Casualty
 
*#Casualty_Severity
 
  
 
==Research Visualisation==
 
==Research Visualisation==
Line 64: Line 58:
 
! style="background: #ffb7c5; color: white; font-weight: bold;" |Comments
 
! style="background: #ffb7c5; color: white; font-weight: bold;" |Comments
 
|-
 
|-
||[[File:TeamCollision_ResearchViz_1.JPG|500px|center]]
+
||[[File:Cupid Visualization.png|500px|center]]
|<center>'''Interactive Visualisation to Track Fatal Accidents <br>(http://news.bbc.co.uk/2/hi/in_depth/uk/2009/crash/8414354.stm)'''</center>
+
|<center>'''Interactive Demographic Graph Using D3.js with Slider and Play Button'''</center>
*The pie-chart on the left allows you to select the desired category of data to display.
+
*http://www.codeproject.com/Articles/1089925/Build-a-Demographic-Data-Visualization-Tool-Based
*The radial bar chart will then allow you to look at the distinctive pattern of each age group over a time-period.
+
*Highly interactive (slider, filters)
*The radial bar chart will be beneficial for high number of bins, where we will be able to look at all the bars or columns from one view without scrolling back and forth.
+
*Suitable for data with a high number of dimensions (in our case, the attributes include gender, race, education, interests and many more)
 
|-
 
|-
 +
 
|-
 
|-
||[[File:TeamCollision_ResearchViz_2.JPG|400px|center]]
+
||[[File:Chernoff Face.png|thumbnail|500px|center]]
|<center>'''Interactive Visualization to Rush Hour Danger <br> (http://www.bbc.co.uk/news/uk-15975564)'''</center>
+
|<center>'''Chernoff Face'''</center>
*Visualization of the pattern of pedestrian casualties across the week
+
*http://mathworld.wolfram.com/ChernoffFace.html
*Map which shows the deaths involving pedestrians and buses on London’s world-famous Oxford Street between 1999 and 2010
+
*Face shape is suitable for visualizing demographics data
*Although this visualization is interesting with the data points of casualties over the years, it lacks the interactivity that our team envision our storyboard to be. We will use this visualization as a reference when we implement an interactive map in our storyboard.
+
*Intuitive for displaying and comparing multiple variables (in our case, the attributes include gender, race, education, interests and many more)
 
|-
 
|-
|}
 
  
==Visualisation Strategy==
 
{| class="wikitable" style="text-align:center; background: white; margin: 0px; width:100%;"
 
 
|-
 
|-
||[[File:Team_Collision_Visualization_Strategy.png|700px|center]]
+
||[[File:Female Perspective.png|thumbnail|500px|center]]
|We intend to use a top-down approach, where our visualisation is being segmented into 3 major portions.  
+
|<center>'''Dating Dashboard Design'''</center>
 
+
*http://www.valetmag.com/living/features/2010/the-female-perspective.php
From the top, the first visualisation (Top-Left) is a map of UK, where it allows users to focus on different cities of UK.  
+
* The Female perspective on dating Dashboard design
 
 
The second visualisation (Top-Right) serves as an intermediate navigation-step, where users can then focus on more in-depth details such as the different casualty types.
 
 
 
Lastly, at the bottom section shows the underlying factors. Through the first 2 visualisations, users will be able to study how the casualties are associated with the underlying factors.
 
 
|-
 
|-
 
|}
 
|}
  
 
==Tools==
 
==Tools==
The following tools are used in this project:  
+
The following tools are expected to be used in this project: 
*JMP Pro
+
* Tableau Desktop 10
*Tableau DeskTop 10
+
* JMP Pro
*D3.js
+
* JavaScript
*Excel
+
* D3.js
 +
* jQuery
 +
* Brackets
 +
* GitHub
 +
* Excel
  
 
==Technical Challenges==
 
==Technical Challenges==
Line 105: Line 97:
 
! style="background: #ffb7c5; color: white; font-weight: bold; width: 60%;" |Action Plan
 
! style="background: #ffb7c5; color: white; font-weight: bold; width: 60%;" |Action Plan
 
|-
 
|-
|<center>Data Preparation</center>
+
|<center>Data Preprocessing - 195 Variables & 8378 Records</center>
 
|
 
|
* Work together to ensure standardization in preparing the data
+
* Collaborative teamwork in data cleaning, selection and transformation.
 
|-
 
|-
 
|-
 
|-
|<center>No Prior Experience using D3.js<br>
+
|<center>Lack of technical background in programming languages such as JavaScript and libraries such as D3.js and jQuery.<br></center>
Implementation of Interactivity/Animation</center>
 
 
|
 
|
* Peer learning
+
* Initial hands-on experience during D3.js workshop
* Visualize graphs in JMP Pro/Tableau to understand how the graphs should be like
+
* Peer learning and sharing of skills developed during IS 480
* View online tutorials/examples at d3js.org
+
* Arranging frequent consultation with Instructor Prakash regarding technical difficulties encountered along the way
* Source for libraries that can be used
+
 
* Follow schedule accordingly to ensure ample time for deployment
+
|-
 +
|-
 +
|<center>Unfamiliar with implementing interactive visual analytics application<br></center>
 +
|
 +
* Self-learning and practicing via online tutorials
 +
* Continuous exploration of readily available alternative tools
 
|-
 
|-
 
|-
 
|-
 
|}
 
|}
  
==Milestones==
+
==Roles & Milestones==
 +
[[File:Cupid Roles.png]]
 +
[[File:Timeline.png|1000px]]
  
 +
==References==
 +
# [http://faculty.chicagobooth.edu/emir.kamenica/documents/genderDifferences.pdf Research paper: Gender Differences in Mate Selection: Evidence From a Speed Dating Experiment. <br /> (Raymond Fisman, Sheena S. Iyengar, Emir Kamenica, Itamar Simonson)]
 +
# [http://faculty.chicagobooth.edu/emir.kamenica/documents/racialpreferences.pdf Racial Preferences in Dating <br /> (Raymond Fisman, Sheena S. Iyengar, Emir Kamenica, Itamar Simonson)]
 +
# [https://www.kaggle.com/annavictoria/speed-dating-experiment/downloads/speed-dating-experiment.zip Speed Dating Experiment Data]
 +
# [http://www.codeproject.com/Articles/1089925/Build-a-Demographic-Data-Visualization-Tool-Based Build a Demographic Data Visualization Tool Based On D3.js]
 +
# [http://www.datavizcatalogue.com/index.html The Data Visualization Catalog]
 +
# [https://github.com/d3/d3/wiki/Gallery D3 Gallery]<br /> [http://www.chartjs.org/ Chart.js] <br /> [http://www.fusioncharts.com/javascript-charting-comparison/ JavaScript Chart Comparison]
  
==References==
 
  
  
 
==Comments==
 
==Comments==

Latest revision as of 15:26, 9 October 2016



Problem and Motivation

Low marriage rates and rising divorce rates have become prominent issues in the developed world. One cause is choosing the wrong marriage partner: some couples only discover after marriage that they do not suit each other. This contributes to social issues such as an increasing divorce rate and domestic violence. Choosing a suitable marriage partner is therefore a pressing need for young adults today. To increase the rate of successful matches and help people find partners they are satisfied with, we need to understand partner-seekers’ characteristics, demographics, habits and lifestyle information. From the insights in the data, we can suggest the factors that contribute to successful and rapid matching and, as a result, help partner-seekers.

Objective

Our project aims to explore the attributes that may contribute to people’s decisions during speed dating. This gives us a deeper understanding of dating preferences, which bear on a wide variety of social issues. Our areas of interest range from race relations to intergenerational mobility, both of which depend largely on outcomes in the marriage market.

The detailed questions guiding the data analysis in this project are:

  • How do external factors affect one’s choice of partner:
    • e.g. income and current career
  • How do demographics affect one’s choice of partner:
    • e.g. race
  • Which attributes are most and least desirable when choosing a partner of the opposite sex:
    • attractiveness
    • sincerity
    • intelligence
    • fun
    • ambition
    • shared interests
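
The third question can be explored with a simple ranking. The sketch below is only an illustration, not the team's method: it ranks the six attributes by the gap between the mean rating given when a participant said yes and when they said no. The column names (`attr`, `sinc`, `intel`, `fun`, `amb`, `shar`, `dec`) are assumed to follow the public release of the speed dating dataset, and the sample rows are invented for the example.

```python
from statistics import mean

# Assumed column names for the six rated attributes (not confirmed by this page).
ATTRIBUTES = ["attr", "sinc", "intel", "fun", "amb", "shar"]


def rank_attributes(rows):
    """Rank attributes by mean rating gap between 'yes' and 'no' decisions.

    `rows` is a list of dicts with a binary `dec` field (1 = yes, 0 = no)
    and one numeric rating per attribute; missing ratings are None.
    """
    gaps = {}
    for name in ATTRIBUTES:
        yes = [r[name] for r in rows if r["dec"] == 1 and r.get(name) is not None]
        no = [r[name] for r in rows if r["dec"] == 0 and r.get(name) is not None]
        if yes and no:  # skip attributes with no data on one side
            gaps[name] = mean(yes) - mean(no)
    # Largest gap first: these attributes separate yes from no most strongly.
    return sorted(gaps.items(), key=lambda kv: kv[1], reverse=True)


# Invented sample rows, for illustration only (not real experiment data).
SAMPLE = [
    {"dec": 1, "attr": 8, "sinc": 7, "intel": 7, "fun": 8, "amb": 6, "shar": 7},
    {"dec": 1, "attr": 7, "sinc": 8, "intel": 8, "fun": 7, "amb": 7, "shar": 6},
    {"dec": 0, "attr": 4, "sinc": 7, "intel": 7, "fun": 5, "amb": 6, "shar": 4},
    {"dec": 0, "attr": 5, "sinc": 8, "intel": 8, "fun": 6, "amb": 7, "shar": None},
]

ranking = rank_attributes(SAMPLE)
for name, gap in ranking:
    print(f"{name}: {gap:+.2f}")
```

A mean gap is a crude measure; it ignores differences between raters and between genders, which a fuller analysis would need to control for.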

Data

Data used in this project was compiled by Columbia Business School professors Ray Fisman and Sheena Iyengar for their paper Gender Differences in Mate Selection: Evidence From a Speed Dating Experiment.

This dataset was originally collected from participants in experimental speed dating events between 2002 and 2004. Each participant was given a four-minute 'speed date' with a participant of the opposite sex. After each date, participants were asked to rate their partner on six aspects:

  • Attractiveness
  • Sincerity
  • Intelligence
  • Fun
  • Ambition
  • Shared Interests

In addition, questionnaire data were collected from the participants throughout the process and recorded in the dataset. Questionnaire data that we may find relevant and useful include:

  • Demographics
  • Dating habits
  • Self-perception across key attributes
  • Beliefs on what others find valuable in a mate
  • Lifestyle information

Research Visualisation

Visualisations Comments
Cupid Visualization.png
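Interactive Demographic Graph Using D3.js with Slider and Play Button
  • http://www.codeproject.com/Articles/1089925/Build-a-Demographic-Data-Visualization-Tool-Based
  • Highly interactive (slider, filters)
  • Suitable for data with a high number of dimensions (in our case, the attributes include gender, race, education, interests and many more)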
Chernoff Face.png
Chernoff Face
  • http://mathworld.wolfram.com/ChernoffFace.html
  • Face shape is suitable for visualizing demographics data
  • Intuitive for displaying and comparing multiple variables (in our case, the attributes include gender, race, education, interests and many more)
Female Perspective.png
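Dating Dashboard Design
  • http://www.valetmag.com/living/features/2010/the-female-perspective.php
  • The Female perspective on dating Dashboard design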

Tools

The following tools are expected to be used in this project:

  • Tableau Desktop 10
  • JMP Pro
  • JavaScript
  • D3.js
  • jQuery
  • Brackets
  • GitHub
  • Excel

Technical Challenges

Technical Challenges Action Plan
Data Preprocessing - 195 Variables & 8378 Records
  • Collaborative teamwork in data cleaning, selection and transformation.
Lack of technical background in programming languages such as JavaScript and libraries such as D3.js and jQuery
  • Initial hands-on experience during D3.js workshop
  • Peer learning and sharing of skills developed during IS 480
  • Arranging frequent consultations with Instructor Prakash regarding technical difficulties encountered along the way
Unfamiliar with implementing interactive visual analytics applications
  • Self-learning and practicing via online tutorials
  • Continuous exploration of readily available alternative tools
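
For the data preprocessing challenge, a first cleaning step might be to measure how much of each candidate variable is missing before selecting which of the 195 variables to keep. The following is a minimal sketch under stated assumptions: the column names and the inline CSV are hypothetical stand-ins, and an empty cell is treated as a missing value.

```python
import csv
import io


def missing_share(csv_text, columns):
    """Return the fraction of rows in which each listed column is empty."""
    reader = csv.DictReader(io.StringIO(csv_text))
    total = 0
    missing = {c: 0 for c in columns}
    for row in reader:
        total += 1
        for c in columns:
            # A cell counts as missing when it is absent or blank.
            if (row.get(c) or "").strip() == "":
                missing[c] += 1
    if total == 0:
        return {c: 0.0 for c in columns}
    return {c: missing[c] / total for c in columns}


# Hypothetical four-row extract, for illustration only (not real data).
SAMPLE_CSV = """iid,gender,attr,shar
1,0,6,5
1,0,7,
2,1,,
2,1,8,6
"""

shares = missing_share(SAMPLE_CSV, ["attr", "shar"])
for column, share in shares.items():
    print(f"{column}: {share:.0%} missing")
```

Variables with a high missing share could then be dropped or imputed, depending on what the team decides during data selection.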

Roles & Milestones

Cupid Roles.png Timeline.png

References

  1. Research paper: Gender Differences in Mate Selection: Evidence From a Speed Dating Experiment.
    (Raymond Fisman, Sheena S. Iyengar, Emir Kamenica, Itamar Simonson)
  2. Racial Preferences in Dating
    (Raymond Fisman, Sheena S. Iyengar, Emir Kamenica, Itamar Simonson)
  3. Speed Dating Experiment Data
  4. Build a Demographic Data Visualization Tool Based On D3.js
  5. The Data Visualization Catalog
  6. D3 Gallery
    Chart.js
    JavaScript Chart Comparison


Comments