1718t1is428T9





Problem & Motivation

Flight delays are a very common problem for travelers. A delay can be attributed to various causes, such as aircraft issues or weather at the origin and/or destination airport. Delays undoubtedly disappoint air travelers and greatly affect their flight experience. Thus, in this project, our team aims to investigate the performance of different airlines and flight delays in detail.

In addition, the airport network is a critical and complex transportation infrastructure for a nation, and it is increasingly important for public policy considerations. Disruptions of the airport network, caused by terrorist attacks, disease transmission or other reasons, can lead to huge economic loss. Thus, studying the airport network can help us better understand the relationships between different airports, for example by identifying the most critical airports, and take proactive measures to prevent disruptions.

Objectives

In this project, we will adopt visualization techniques to:

  • Analyse airport network connectivity
  • Analyse flight delays for different airlines
  • Evaluate on-time performance for airlines and aircraft

With the visualization, airline companies will become aware of their on-time performance relative to other airlines and gain a better idea of the areas of routine operation where greater attention should be placed, such as service or aircraft maintenance. Our visualization will also provide detailed insight into the airport network, which can speed up decision making when faced with infectious diseases and terrorist attacks.

Selected Dataset

We obtained the dataset from Kaggle; it can be downloaded from https://www.kaggle.com/usdot/flight-delays/data

Dataset/Source Data Attributes Rationale Of Usage
airline.csv
  • IATA_Code, String, Airline identifier
  • Airline, String, Airline Name
This data is used to identify and provide detailed information about the different airlines.
airport.csv
  • IATA_Code, String, Location identifier
  • Airport, String, Airport Name
  • City, String, City of Airport
  • State, String, State of Airport
  • Country, String, Country of Airport
  • Latitude, Numeric, Latitude of the Airport
  • Longitude, Numeric, Longitude of the Airport
This data is used to identify and provide detailed information about the different airports. It complements the main dataset by providing detailed location information: the latitude and longitude, city, state and country of each airport.
flights.csv
  • Year, Numeric, Year of the flight
  • Month, Numeric, Month of the flight
  • Day, Numeric, Day of the flight
  • Day_of_Week, Numeric, Day of week of the flight
  • Airline, String, Airline identifier
  • Tail_Number, String, Aircraft identifier
  • Origin_Airport, String, Departing airport
  • Destination_Airport, String, Destination airport
  • Departure_Delay, Numeric, Total delay on departure; a negative value indicates the flight departed before the scheduled time
  • Arrival_Delay, Numeric, Total delay on arrival, derived from the difference between arrival_time and scheduled_arrival; a negative value indicates the flight arrived before the scheduled time
  • Diverted, Numeric (binary data), Aircraft landed at an airport other than the scheduled one
  • Cancelled, Numeric (binary data), 1 means cancelled
  • Cancellation_Reason, String, Reason for cancellation of flight: A - Airline/Carrier; B - Weather; C - National Air System; D - Security

  • Air_System_Delay, String, Delay caused by air system
  • Security_Delay, String, Delay caused by security
  • Airline_Delay, String, Delay caused by airline
  • Late_Aircraft_Delay, String, Delay caused by aircraft
  • Weather_Delay, String, Delay caused by weather
This data is the major source of information in our project. We mainly use it to analyse flight delays and the reasons for delay. In addition, the data will be used to investigate the airport network and analyse relationships within it using different centrality measures, such as betweenness centrality and degree centrality.
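To make the two centrality measures concrete, the following is a minimal sketch in plain Python, not the project's actual implementation (the team lists Gephi for network analysis). It assumes the network is treated as undirected and unweighted, with an edge between two airports whenever at least one flight links them. The sample rows and the function names `build_network`, `degree_centrality` and `betweenness_centrality` are hypothetical; only the column names `Origin_Airport` and `Destination_Airport` come from flights.csv.

```python
from collections import defaultdict, deque

# Hypothetical sample rows; only the two column names come from flights.csv.
flights = [
    {"Origin_Airport": "ATL", "Destination_Airport": "ORD"},
    {"Origin_Airport": "ATL", "Destination_Airport": "LAX"},
    {"Origin_Airport": "JFK", "Destination_Airport": "ATL"},
    {"Origin_Airport": "LAX", "Destination_Airport": "SFO"},
    {"Origin_Airport": "ORD", "Destination_Airport": "ATL"},  # repeat route, same edge
]


def build_network(rows):
    """Build an undirected airport graph (assumption: direction and flight counts are ignored)."""
    graph = defaultdict(set)
    for row in rows:
        a, b = row["Origin_Airport"], row["Destination_Airport"]
        if a != b:
            graph[a].add(b)
            graph[b].add(a)
    return dict(graph)


def degree_centrality(graph):
    """Fraction of the other airports that each airport is directly connected to."""
    n = len(graph)
    if n < 2:
        return {node: 0.0 for node in graph}
    return {node: len(neighbours) / (n - 1) for node, neighbours in graph.items()}


def betweenness_centrality(graph):
    """Unnormalised betweenness via Brandes' algorithm (unweighted, undirected)."""
    score = {v: 0.0 for v in graph}
    for source in graph:
        order = []                          # nodes in order of non-decreasing distance
        preds = {v: [] for v in graph}      # predecessors on shortest paths
        sigma = {v: 0 for v in graph}       # number of shortest paths from source
        dist = {v: -1 for v in graph}
        sigma[source], dist[source] = 1, 0
        queue = deque([source])
        while queue:                        # breadth-first search from source
            v = queue.popleft()
            order.append(v)
            for w in graph[v]:
                if dist[w] < 0:
                    dist[w] = dist[v] + 1
                    queue.append(w)
                if dist[w] == dist[v] + 1:
                    sigma[w] += sigma[v]
                    preds[w].append(v)
        delta = {v: 0.0 for v in graph}
        while order:                        # accumulate dependencies, farthest first
            w = order.pop()
            for v in preds[w]:
                delta[v] += sigma[v] / sigma[w] * (1 + delta[w])
            if w != source:
                score[w] += delta[w]
    # Each unordered airport pair was counted from both endpoints.
    return {v: c / 2 for v, c in score.items()}


network = build_network(flights)
degree = degree_centrality(network)
betweenness = betweenness_centrality(network)
for airport in sorted(network, key=betweenness.get, reverse=True):
    print(airport, round(degree[airport], 2), betweenness[airport])
```

In this toy network ATL ranks highest on both measures because every route except LAX-SFO passes through it. On the full dataset, a weighted or directed variant may be more appropriate; that choice is left open here.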

Background Survey of Related Work

Related Works What We Can Learn

Monthly Performance of Airline in Asia Pacific

1718T1G9 BackgroundSurvey1.png

Source: https://www.flightstats.com/company/monthly-performance-reports/airlines/

  • The heatmap provides a clear legend, from which viewers learn that size stands for the number of scheduled flights and color for on-time performance.
  • The colors contrast well with each other.

Trends in the Causes of Flight Delay in US

1718T1G9 BackgroundSurvey2.png

Source: https://www.rita.dot.gov/bts/sites/rita.dot.gov.bts/files/2012_04_13.pdf

  • The use of a line chart is effective in comparing the various delay causes
  • The chart title is clear enough to convey the chart's purpose

Global Digital Attack Network

1718T1G9 BackgroundSurvey3.png

Source: http://www.digitalattackmap.com/#anim=1&color=0&country=US&list=0&time=17475&view=map

  • The graph vividly displays each path with its origin and destination
  • When the mouse hovers over a path, a label shows up with its detailed information

Proposed Dashboard

Proposed Layout How Analysts Conduct Analysis

Dashboard of Flight Route and Arrival Delay By Airline

1718T1G9 ProposedDashboard1.png

The two-column dashboard will give the reader a brief layout of the flight route map for the selected city. The chart on the right-hand side will give the reader an overview of the average arrival delay of each airline departing from the selected origin.

A filter will also be provided to update the dashboard, so that readers can view and compare route maps or average arrival delays between different cities.

Dashboard of Late Aircraft Delay by Airline and Aircraft

1718T1G9 ProposedDashboard2.png

There are two bar charts in the dashboard. The bar chart at the top displays the total delay (in minutes) caused by aircraft for each airline in the US in January 2015, in descending order. The bar chart at the bottom shows the total delay (in minutes) caused by aircraft for each individual aircraft in the US in January 2015, in descending order.

When the user hovers over a bar in the top chart, the corresponding tooltip shows up and the bottom bar chart is filtered accordingly. With this, the user can see which airline has the worst on-time performance due to aircraft and which aircraft contributes most to that airline’s delay.
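The aggregations behind the two proposed dashboards can be sketched as follows. This is an illustration in plain Python under stated assumptions, not the Tableau/D3.js implementation the team plans to build. The sample rows and the helper names `average_arrival_delay` and `total_late_aircraft_delay` are hypothetical, and the sketch assumes the "delay caused by aircraft" corresponds to the `Late_Aircraft_Delay` column and that missing values (for example, cancelled flights) arrive as `None`.

```python
from collections import defaultdict

# Hypothetical sample rows using column names from flights.csv.
flights = [
    {"Airline": "AA", "Tail_Number": "N001AA", "Origin_Airport": "ATL",
     "Arrival_Delay": 10, "Late_Aircraft_Delay": 5},
    {"Airline": "AA", "Tail_Number": "N002AA", "Origin_Airport": "ATL",
     "Arrival_Delay": 30, "Late_Aircraft_Delay": 20},
    {"Airline": "DL", "Tail_Number": "N100DL", "Origin_Airport": "ATL",
     "Arrival_Delay": -4, "Late_Aircraft_Delay": 0},
    {"Airline": "DL", "Tail_Number": "N100DL", "Origin_Airport": "LAX",
     "Arrival_Delay": 12, "Late_Aircraft_Delay": 12},
    # Assumed representation of a cancelled flight: no delay values recorded.
    {"Airline": "AA", "Tail_Number": "N001AA", "Origin_Airport": "LAX",
     "Arrival_Delay": None, "Late_Aircraft_Delay": None},
]


def average_arrival_delay(rows, origin):
    """Dashboard 1: mean Arrival_Delay per airline for flights leaving `origin`."""
    totals = defaultdict(float)
    counts = defaultdict(int)
    for row in rows:
        if row["Origin_Airport"] != origin or row["Arrival_Delay"] is None:
            continue  # apply the origin filter and skip flights without an arrival delay
        totals[row["Airline"]] += row["Arrival_Delay"]
        counts[row["Airline"]] += 1
    return {airline: totals[airline] / counts[airline] for airline in totals}


def total_late_aircraft_delay(rows, group_by, airline=None):
    """Dashboard 2: sum Late_Aircraft_Delay per `group_by` value, in descending order.

    Passing `airline` mimics the hover filter applied to the bottom chart.
    """
    totals = defaultdict(float)
    for row in rows:
        if airline is not None and row["Airline"] != airline:
            continue
        totals[row[group_by]] += row["Late_Aircraft_Delay"] or 0
    return sorted(totals.items(), key=lambda item: item[1], reverse=True)


by_origin = average_arrival_delay(flights, "ATL")
by_airline = total_late_aircraft_delay(flights, "Airline")
by_aircraft = total_late_aircraft_delay(flights, "Tail_Number", airline="AA")

print(by_origin)
print(by_airline)
print(by_aircraft)
```

Changing the `origin` argument plays the role of the dashboard filter, and passing `airline="AA"` reproduces what hovering over the AA bar would do to the bottom chart.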

Technical Complexity

Below is the list of technical challenges that the team may face when developing the visualization application.

Technical Challenges How To Resolve
Unfamiliar with D3.js libraries and building D3 application
  • Attend D3.js workshop
  • Individual learning on how to build D3 application
  • Peer Learning
Lack of knowledge on how to integrate Tableau work with a D3 application
  • Research how to integrate Tableau work with a D3 application
  • Conduct early integration so that the team has enough time to tackle potential errors
Insufficient metadata for the source data
  • Research on the official website of the US Department of Transportation
  • Arrange team discussions to build a shared understanding of the source data

Tools/Technology

Below are the tools/technologies we will use when developing the visualization:

  • Excel
  • Tableau
  • D3.js
  • Gephi

Project Milestones

1718g1t9 milestones.png
