G7 Exploratory Analysis

From Analytics Practicum

Business Unit Exploration

DHL BUs.png

We wanted to identify the major Business Units (BUs) in our dataset. BU 4 was the top business unit by number of shipments in 2015 and 2016 and ranked second in 2017, accounting for 43.02% of all shipments in the data. BU 2 ranked second in 2015 and 2016 and first in 2017, accounting for 28.33% of all shipments in the data.
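The ranking and share figures above can be reproduced with a simple count per year and per business unit. The following is a minimal sketch under stated assumptions: the records are toy values invented for illustration (not the project's data), and the names `shipments`, `rank_by_year` and `overall_share` are hypothetical.

```python
from collections import Counter

# Hypothetical toy records: (year, business_unit). Not the project's data.
shipments = [
    (2015, "BU 4"), (2015, "BU 4"), (2015, "BU 2"),
    (2016, "BU 4"), (2016, "BU 4"), (2016, "BU 2"),
    (2017, "BU 2"), (2017, "BU 2"), (2017, "BU 4"), (2017, "BU 1"),
]


def rank_by_year(records):
    """Return {year: [(bu, count), ...]} sorted by descending shipment count."""
    per_year = {}
    for year, bu in records:
        per_year.setdefault(year, Counter())[bu] += 1
    return {year: counts.most_common() for year, counts in sorted(per_year.items())}


def overall_share(records):
    """Return {bu: percentage of all shipments across all years}."""
    totals = Counter(bu for _, bu in records)
    n = sum(totals.values())
    return {bu: 100.0 * count / n for bu, count in totals.items()}


ranks = rank_by_year(shipments)
shares = overall_share(shipments)

for year, ranking in ranks.items():
    print(year, ranking)
print({bu: round(pct, 2) for bu, pct in shares.items()})
```

In the toy data, as in the findings above, one unit leads in the first two years and another takes over in the third, while the overall share is computed over all years combined.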
Current Operational Performance

DHL Status.png

Next, we wanted to get a glimpse of current operational performance across the 3 years. The table shows that 40% of the shipments in our dataset were delayed, and that performance does not vary much from year to year.
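A per-year delay rate like the one summarised above can be sketched as follows. The records and the `delayed` flag are assumptions made for illustration; the actual dataset's column names and status definitions are not given here.

```python
from collections import defaultdict

# Hypothetical toy records: (year, delayed flag). Not the project's data.
records = [
    (2015, True), (2015, False), (2015, False), (2015, True), (2015, False),
    (2016, True), (2016, False), (2016, False), (2016, True), (2016, False),
    (2017, False), (2017, True), (2017, False), (2017, True), (2017, False),
]


def delayed_share_by_year(rows):
    """Return {year: percentage of shipments flagged as delayed}."""
    counts = defaultdict(lambda: [0, 0])  # year -> [delayed, total]
    for year, delayed in rows:
        counts[year][0] += int(delayed)
        counts[year][1] += 1
    return {year: 100.0 * d / t for year, (d, t) in sorted(counts.items())}


share = delayed_share_by_year(records)
# Spread between the best and worst year, in percentage points.
spread = max(share.values()) - min(share.values())
print(share, spread)
```

A small spread between years is what "does not vary much" means in practice; the toy data is constructed so that every year has the same rate.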
Operational Performance/BUs

DHL BUStatus.png

Next, we wanted to check whether operational performance differs across business units. The chart above does show very different performance for different business units, but we need to conduct statistical tests to confirm this hypothesis; these are covered in the Confirmatory Analysis section.
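The text does not say which statistical test is used for this comparison. One common choice for a business unit by status table is Pearson's chi-square test of independence, so the sketch below assumes that test purely for illustration. The counts are toy values and `chi_square_statistic` is a hypothetical helper.

```python
def chi_square_statistic(table):
    """Pearson chi-square statistic and degrees of freedom for an r x c table."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    stat = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Expected count if status were independent of business unit.
            expected = row_totals[i] * col_totals[j] / n
            stat += (observed - expected) ** 2 / expected
    dof = (len(table) - 1) * (len(table[0]) - 1)
    return stat, dof


# Rows: business units; columns: [on time, delayed]. Toy counts only.
table = [[70, 30], [50, 50], [60, 40]]
stat, dof = chi_square_statistic(table)
print(round(stat, 3), dof)
```

The statistic is compared with the chi-square distribution for `dof` degrees of freedom to obtain a p-value; a statistics library such as SciPy (`scipy.stats.chi2_contingency`) does this in one call.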
Operational Performance/Flight Types

DHL TypeStatus.png

The graph above shows that passenger flights perform slightly better than cargo flights, but the difference is very small, so we need to test whether it is statistically significant.
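Whether a small difference in delay rates is significant depends heavily on sample size. The sketch below uses a two-proportion z-test as one possible way to check this; it is an assumed technique, not necessarily the one used in this project, and the counts are toy values.

```python
from math import sqrt
from statistics import NormalDist


def two_proportion_z_test(delayed_a, total_a, delayed_b, total_b):
    """Two-sided z-test for the difference between two delay proportions."""
    p_a = delayed_a / total_a
    p_b = delayed_b / total_b
    # Pooled proportion under the null hypothesis of equal delay rates.
    pooled = (delayed_a + delayed_b) / (total_a + total_b)
    se = sqrt(pooled * (1 - pooled) * (1 / total_a + 1 / total_b))
    z = (p_a - p_b) / se
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))
    return z, p_value


# Toy counts: 39% vs 41% delayed, first with 1,000 shipments per group...
z_small, p_small = two_proportion_z_test(390, 1000, 410, 1000)
# ...then the same rates with 100,000 shipments per group.
z_large, p_large = two_proportion_z_test(39000, 100000, 41000, 100000)
print(round(z_small, 3), round(p_small, 3))
print(round(z_large, 3), p_large)
```

The same two-point gap is not significant at the 5% level in the smaller sample but is clearly significant in the larger one, which is why a visual comparison alone is not enough.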
Distribution of Delay Days

DHL DelayDays.png

We also wanted to see by how many days the delayed shipments were late. We found one outlier value of 364 days, which we removed. The chart above shows descriptive statistics for the distribution of delay days. Because the mean and median values were quite different, we decided to check for skewness and plotted a histogram.
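The effect of removing the outlier and the gap between mean and median can be illustrated with a few lines of code. This is a minimal sketch with invented delay values; the `cutoff` parameter is a hypothetical way of dropping the single extreme value and is not a rule stated in the analysis.

```python
from statistics import mean, median

# Toy delay values in days; 364 mimics the single outlier described above.
delay_days = [1, 1, 1, 2, 2, 3, 4, 6, 9, 15, 364]


def drop_outliers(values, cutoff):
    """Keep only values below a hypothetical cutoff (in days)."""
    return [v for v in values if v < cutoff]


cleaned = drop_outliers(delay_days, cutoff=364)

print("with outlier:   ", round(mean(delay_days), 2), median(delay_days))
print("without outlier:", mean(cleaned), median(cleaned))
```

Even after the outlier is removed, the mean stays well above the median in the toy data, which is the usual sign of a right-skewed distribution.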

DHL DelayDays2.png

The histogram shows that the delay days are heavily skewed with a long right tail (skew = +2.77), as expected. We therefore cannot rely on parametric statistical tests that assume normality for this distribution, and we will use the median instead of the mean as our point estimator.
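A skewness coefficient like the one quoted above can be computed from the second and third central moments. The sketch below uses the population moment coefficient, which is an assumption: the text does not say which skewness formula or tool produced +2.77, and other estimators apply a small-sample correction. The delay values are toy data.

```python
from statistics import fmean, median


def sample_skewness(values):
    """Moment coefficient of skewness: m3 / m2 ** 1.5 (no bias correction)."""
    n = len(values)
    mu = fmean(values)
    m2 = sum((x - mu) ** 2 for x in values) / n  # second central moment
    m3 = sum((x - mu) ** 3 for x in values) / n  # third central moment
    return m3 / m2 ** 1.5


# Toy delay values in days, not the project's data.
delay_days = [1, 1, 1, 2, 2, 3, 4, 6, 9, 15]

skew = sample_skewness(delay_days)
point_estimate = median(delay_days)
print(round(skew, 3), point_estimate)
```

A positive coefficient indicates a long right tail, and a symmetric sample gives a value of zero. The median is reported alongside it because it is far less sensitive to the tail than the mean.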