ShinyNET Data Prep Report
shinyNET: A web-based flight data visualisation toolkit using R Shiny and ggraph (Group 14)
Motivation of the Application
The airline industry in India is booming. Over the last 20 years, the industry has moved away from a monopoly and towards perfect competition. Fifteen years ago there was only one Indian airline (Air India), controlled by the government, and an ordinary person could not afford to fly. Flying has since become more affordable, and air travel is becoming a common mode of transit connecting different parts of the country. There are currently 12 domestic commercial airlines in India, some of which serve only certain geographic regions.
What can we infer from this? From an entrepreneurial perspective, starting an airline business is no longer as hard as it was 20 years ago. An entrepreneur aiming to enter the airline business might ask:
1. Which demographic region should I focus on?
2. Which airlines should I partner with to increase geographic coverage?
3. How does air traffic vary over the hours of a day?
4. How can air traffic be optimized?
To answer these questions, a complete visualization of the airline network, including all 12 airlines, is required. Currently no visualization gives a holistic view of the Indian airline network.
The visualization is also a combination of a geospatial view and a social network view. The geospatial view provides the locations of the airports, and the network view shows how the airports are connected to one another. These entrepreneurial and business factors led us to take up this project and build such a visualization.
Review and critique of past works
Our dashboard draws its inspiration from research done by the "National Center for Biological Sciences, Tata Institute of Fundamental Research, Bangalore 560065, India", which published a paper analysing the airport network of India as a complex weighted network. Although research has been done to analyze the airport network in India, visualization of the network is still lacking. Most past work has focused only on the airport network, with no focus on the airline networks. Airline networks are the most important part of an airport, and visualizing the traffic dynamics using geospatial and graph network views can provide new insights for the airline industry.
We chose R Shiny to achieve our visualization goals because of the powerful, flexible tools and packages it offers. R, with its network and geospatial packages, provides a rich environment for a data analyst to perform analysis that combines network and geospatial data.
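As a rough illustration of combining network and geospatial data in R, the sketch below joins a route list (the network) to an airport table (the locations) so that each route carries the coordinates of both endpoints. It uses base R only. The airport codes, the approximate coordinates, the routes and all column names are illustrative assumptions, not the project's actual data.

```r
# Toy airport table: code plus approximate coordinates (illustrative only).
airports <- data.frame(
  code = c("DEL", "BOM", "MAA"),
  lat  = c(28.56, 19.09, 12.99),
  lon  = c(77.10, 72.87, 80.17),
  stringsAsFactors = FALSE
)

# Toy route list: one row per connection between two airports.
routes <- data.frame(
  from = c("DEL", "DEL", "BOM"),
  to   = c("BOM", "MAA", "MAA"),
  stringsAsFactors = FALSE
)

# Look up the row of each endpoint in the airport table.
i_from <- match(routes$from, airports$code)
i_to   <- match(routes$to, airports$code)

# Each route now carries the coordinates needed to draw it on a map.
segments <- data.frame(
  routes,
  from_lat = airports$lat[i_from],
  from_lon = airports$lon[i_from],
  to_lat   = airports$lat[i_to],
  to_lon   = airports$lon[i_to]
)

# Number of routes touching each airport (a simple connectivity measure).
connections <- table(c(routes$from, routes$to))

print(segments)
print(connections)
```

A table shaped like `segments` could then be drawn over a base map, for example as line segments in ggplot2, while the same route list feeds the network analysis.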
Design Framework
Demonstration
Discussion
From the network data, we draw the following insights:
- Indigo holds the largest market share and the highest connectivity. The connectivity in the above image shows how much in demand Indigo flights are, as they ease flight transit.
- Spice Jet, Jet Airways, Go Air and Air India also have very high connectivity, next only to Indigo.
- Jet Airways does not cover much of north-east India, but it covers the major metropolitan cities.
- Tier 1 cities (Delhi, Mumbai, Kolkata, Chennai, Bangalore, Hyderabad) have the highest connectivity.
- JetLite and Vistara cover only major cities (some tier 1 and some tier 2 cities).
- Air Asia covers tier 1 and some other major cities. It does not cover central India.
- Trujet, Zoom Air, Air Carnival and Alliance Air operate in specific geographic areas.
- Zoom Air covers only 3 airports.
- Air Carnival mostly focuses on Tamil Nadu.
- Alliance Air connects tier 2 and tier 3 cities to metro cities. It has far fewer flights than Indigo, but they are spread across many geographic regions. Hence, it is very cost efficient.
- Alliance Air is the only carrier connecting to Agatti, Lakshadweep. This can give it a competitive advantage over other carriers.
- Judging from the above facet, Alliance Air, Air Carnival, Trujet and Zoom Air could partner to compete with airlines like Air Asia and JetLite.
This geospatial visualisation can provide useful information on the following aspects:
- It can show which airlines could partner to get more value at minimal cost.
- It can help in scheduling flights for efficient flight transit.
- Based on network graphs, the visualisation can provide betweenness centrality and closeness centrality. Closeness centrality measures how close an airport is to all other airports in the network. Betweenness centrality measures how often an airport lies on the shortest routes between other airports, that is, how dependent connecting traffic is on it. Hypothetically, the higher the closeness centrality, the higher the betweenness centrality should be in order to make route planning more efficient. The visualisation provides the betweenness and closeness centrality of cities at different tier levels to help optimize route planning.
- It can show the flight traffic for every hour of the day.
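The closeness and betweenness centrality measures described above can be sketched on a small made-up network. The example calls igraph directly (assumed to be available, since tidygraph depends on it). The routes are toy data and are treated as undirected, which is a simplifying assumption and not necessarily how the dashboard computes these measures.

```r
library(igraph)

# Toy route list (illustrative only): DEL acts as a hub and
# GAU is reachable only through CCU.
routes <- data.frame(
  from = c("DEL", "DEL", "DEL", "CCU"),
  to   = c("BOM", "MAA", "CCU", "GAU"),
  stringsAsFactors = FALSE
)

# Build an undirected graph with one vertex per airport code.
g <- graph_from_data_frame(routes, directed = FALSE)

# Betweenness: number of shortest paths between other airports
# that pass through each airport.
btw <- betweenness(g)

# Closeness: reciprocal of the summed shortest-path distance
# from each airport to all others.
clo <- closeness(g)

centrality <- data.frame(
  airport     = names(btw),
  betweenness = as.numeric(btw),
  closeness   = as.numeric(clo)
)

# Rank airports from most to least central by betweenness.
print(centrality[order(-centrality$betweenness), ])
```

In this toy network the hub DEL has both the highest betweenness (5) and the highest closeness (0.2), while CCU ranks second on both because all traffic to GAU depends on it.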
Thus the visualisation provides value to the 12 airline carriers. If the scope were expanded globally, it could provide value to more than 365 airline companies.
Future Work
- Adding a day filter would let analysts examine the airline network by flight frequency, for example weekday versus weekend.
- Additional network measures such as eigenvector, betweenness, closeness, hub, authority and PageRank centrality could be incorporated into the dashboard to support network analysis.
- Optimizing the code to reduce the time taken to generate results.
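The proposed day filter could look roughly like the following base R sketch, which splits a schedule into weekday and weekend flights and counts departures per hour. The schedule rows, the column names and the helper `flights_by_hour()` are hypothetical and are not taken from the project's data or code.

```r
# Toy schedule (illustrative only): departure time and day of week.
schedule <- data.frame(
  airline   = c("A", "A", "B", "B", "C"),
  departure = c("06:15", "06:45", "09:30", "18:05", "18:50"),
  day       = c("Mon", "Sat", "Mon", "Sun", "Tue"),
  stringsAsFactors = FALSE
)

# Derive the departure hour and a weekday/weekend label.
schedule$hour <- as.integer(substr(schedule$departure, 1, 2))
schedule$day_type <- ifelse(schedule$day %in% c("Sat", "Sun"),
                            "weekend", "weekday")

# Hypothetical helper: departures per hour (0 to 23) for one day type.
flights_by_hour <- function(df, day_type = c("weekday", "weekend")) {
  day_type <- match.arg(day_type)
  selected <- df[df$day_type == day_type, ]
  table(factor(selected$hour, levels = 0:23))
}

weekday_counts <- flights_by_hour(schedule, "weekday")
weekend_counts <- flights_by_hour(schedule, "weekend")

print(weekday_counts)
print(weekend_counts)
```

In a Shiny dashboard, the `day_type` argument would typically be driven by a user input control, so that the hourly traffic view updates when the analyst switches between weekday and weekend.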
Installation Guide
The packages that must be installed are loaded as follows:
library(shiny)
library(ggplot2)
library(tidyverse)
library(ggmap)
library(tidygraph)
library(ggraph)
library(plotly)
library(shinydashboard)
library(DT)
library(ggiraph)
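Since `library()` only loads packages that are already installed, one possible way to install any that are missing is sketched below. The package names come from the list above. The variable names and the `interactive()` guard, which prevents installation when the script runs non-interactively, are our own assumptions and not part of the original guide.

```r
# Packages listed in the installation guide.
required <- c("shiny", "ggplot2", "tidyverse", "ggmap", "tidygraph",
              "ggraph", "plotly", "shinydashboard", "DT", "ggiraph")

# Packages from the list that are not yet installed on this machine.
missing <- setdiff(required, rownames(installed.packages()))

# Install only what is missing, and only in an interactive session,
# so that running this as a script never triggers downloads.
if (interactive() && length(missing) > 0) {
  install.packages(missing)
}

print(missing)
```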