ANLY482 AY2017-18T2 Group 11 EDA


EDA ANALYSIS & INSIGHTS
Exploratory Data Analysis (EDA) – Overview

Being relatively new to the last mile delivery sector, Company ABC must maintain a high level of efficiency while keeping costs low to remain profitable. To keep costs low, Company ABC handles its 2 types of delivery service differently: Standard Service deliveries are outsourced to Contracted drivers, while Express Service deliveries fall primarily on the In-House delivery couriers. To depict these 2 driver types accurately, the later parts of the EDA section analyse them separately, on both a temporal and a spatial scale. Before that, the upcoming section provides an overview of the deliveries conducted by ABC.

EDA Overview.png

Figure 5 shows the summary statistics for both Contracted drivers and In-House drivers, giving the quantity of parcels by driver type, service type and parcel size. From the data-table, the team identified that the majority of the parcels are small in size and are served by Contracted drivers under service type “Z”.

This concentration of parcels within the “Small” category is not unexpected, as Figure 6 indicates that 85.74% of the total parcels delivered are “Small” in size. This could be the reason why Contracted drivers are primarily delivering “Small”-sized parcels.

OverviewDriverType.png

In addition, as shown in Figure 7, 99.54% of all deliveries are conducted by Contracted drivers, while only 0.46% are conducted by In-House drivers. This indicates that the services conducted by Contracted drivers are highly sought after, which is consistent with the large number of deliveries shown in Figure 4.


EDA - Spatial Distribution

Spatial Distribution

Location - Contractor Drivers

SpatialLocationByContractor.png

The deliveries conducted by the Contracted drivers were analysed on a spatial scale. The hierarchy starts from parcel distribution by Zone Code (comprising West and Central), which is then sub-divided and analysed by District Code, Postal Zone and lastly, Postal Code. Based on the tree-maps generated above, Figures 8 to 10 indicate that the highest volume for the Contracted drivers falls on the Western Region, District Code 22 and Postal Zone 64. On the other hand, Postal Zone 62 has the lowest volume within District Code 22.

As District Code 22 had the highest number of deliveries, the team expected the Postal Code with the highest number of deliveries to lie within this district. However, Figure 11 indicates that the individual Postal Code with the highest number of deliveries is Postal Code 228241, which lies within District Code 9 and is far from the district the team expected. As such, the team will analyse this Postal Code and its Postal Zone in greater depth.
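The roll-up from Postal Code to Postal Zone can be illustrated with a minimal sketch. It assumes that the Postal Zone is the first two digits of the 6-digit Postal Code, which is consistent with the examples in this section (228241 falls in Postal Zone 22) but is not stated explicitly in the source. Apart from 228241, the postal codes and all counts below are hypothetical; the sketch only shows how the busiest Postal Zone and the busiest individual Postal Code can differ.

```python
from collections import Counter


def postal_zone(postal_code: str) -> str:
    """Return the Postal Zone, assumed here to be the first two digits
    of a 6-digit postal code (an assumption, see the lead-in)."""
    return postal_code.strip().zfill(6)[:2]


# Hypothetical delivery records: one postal code per delivered parcel.
deliveries = [
    "640001", "640002", "640003", "640004",  # four different addresses in Postal Zone 64
    "228241", "228241", "228241",            # one address receiving repeated deliveries
    "620005",
]

by_zone = Counter(postal_zone(code) for code in deliveries)
by_code = Counter(deliveries)

busiest_zone, zone_count = by_zone.most_common(1)[0]
busiest_code, code_count = by_code.most_common(1)[0]

print(f"Busiest Postal Zone: {busiest_zone} ({zone_count} parcels)")
print(f"Busiest Postal Code: {busiest_code} ({code_count} parcels)")
```

In this made-up data the busiest Postal Zone spreads its parcels over many addresses, while the busiest Postal Code sits in a smaller zone, which is the kind of mismatch described above.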

ContractorFailPostalZone.png

Based on the analysis of Postal Zone 22 (Figure 12), the team identified several hotspots, that is, Postal Codes where the number of deliveries reached as high as 626 during the period of January to November 2017. Postal Code 228241 is one of these locations with higher parcel deliveries. Hence, the 2017 deliveries for this Postal Code are analysed.

ContractorFailDistribution.png

Based on Figure 13, the team observed that there are many days on which the number of records is above “1”. This means that there are multiple delivery orders to the same location on a particular day. This seemed counter-intuitive to the team, as it hints that the driver visits the same location multiple times to complete the parcel deliveries. The team investigated this and understands that it is possible to have multiple deliveries to the same building within a single trip. For example, the driver can complete 2 deliveries to 2 different levels of the same building.

Besides using quantity as the variable within the tree-map, the team also explored using “Size” as an added dimension to the tree-map and the results are as follows.

ContractorFailSizeDistribution.png

After using size as a dimension, Postal Zone 22 is still one of the more interesting zones, as it has a relatively small parcel volume yet a concentration that is comparable to other Postal Zones. As shown above, the parcels within Postal Zone 22 are concentrated in Zip Code 228241, where the majority of the parcels delivered are Medium-sized. This is unlike many other areas, where Small parcels are typically the primary parcel size that drivers deliver. As such, the team delved further into this Zip Code.

ContractorFailSizeDistributions.png

Based on Figure 15, the team observed several unique characteristics of this Zip Code. Firstly, the Small parcels show less variation than in other zones. Secondly, its Medium-sized parcels show higher variation than the other sizes. Lastly, the demand for deliveries of Medium and Large-sized parcels only began in May. The variation in Medium parcels is therefore the main reason for the fluctuations in parcels seen in Figure 13.

Location - InHouse Drivers

SpatialLocationByInhouse.png

The deliveries conducted by the In-House drivers were also analysed on a spatial scale. As shown in Figures 16-19, the hierarchy starts from parcel distribution by Zone Code (comprising West and Central). It is then sub-divided and analysed by District Code, Postal Zone and lastly, Postal Code. Figures 16 and 17 show that parcels tend to be concentrated in the Western region of Singapore and, in particular, delivered to District Code 1. When subdivided by Postal Zone, as shown in Figure 18, the majority of the parcels are delivered to Postal Zones 4, 23 and 11. Thus, one would expect a Zip Code within one of these 3 Postal Zones to have a high concentration of parcels. However, just as in the case of the Contracted drivers, the parcels are most concentrated at a Zip Code that does not fall within these 3 Postal Zones. As shown in Figure 19, when subdivided into individual Zip Codes, parcels are concentrated in 079912, and this Zip Code will be examined further in the later parts of this EDA.

InhouseFailPostalZone.png
InhouseFailDistribution.png

Based on Figure 20, the team observed that most of the parcels within Postal Zone 7 are concentrated at one spot, which happens to be Zip Code 079912. The number of deliveries for this zone, as shown in Figure 21, fluctuates constantly and typically hovers between “1” and “4”.

InhouseSizeFailDistribution.png

As displayed in Figures 22 and 23, after adding the “Size” dimension to the EDA, the team observed that the parcel volume of Zip Code 079912 is driven mainly by Medium and Large parcels.

InhouseSizeFailDistributions.png

Figure 24 shows an in-depth analysis of Zip Code 079912. Based on the figure, the team observed that the number of Small parcels has remained relatively stable, while Medium and Large parcels have seen slightly more fluctuation. However, these fluctuations are still less dramatic than those experienced by Contracted drivers, as mentioned earlier.

Failed Delivery - Contractor Drivers

After analysing the distribution of parcels based on location, the following sections elaborate on the failed deliveries by Contracted drivers. A failed delivery occurs when a delivery trip is unsuccessful in delivering the parcel to the downstream customer and a redelivery trip has to be made. Failed deliveries are something any courier service provider wishes to minimise, as they result in higher operational costs.

TWOFigure25.png
TWOFigure26.png

Figure 25 depicts the number of failed deliveries by Postal Zone. The team noticed that a total of 11 Postal Zones have a number of failed parcels exceeding the average. Of these 11, Postal Zones 60 and 64 are the two zones with the highest absolute number of failed deliveries.

However, the absolute number of failed deliveries is highly dependent on the total number of parcels delivered. As such, the absolute number would be an inaccurate measure of each zone’s level of failed deliveries. The team therefore decided to explore this problem further by using the percentage of failed deliveries as the measure instead.

Based on Figure 26, the team identified Zone 63 as the Zone with the highest percentage of failed deliveries. This is in stark contrast with the findings obtained from Figure 25, and as such, the team will endeavour to find out more about the reasons behind this high failure rate.
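The difference between the two measures can be shown with a minimal sketch. The totals and failure counts below are hypothetical placeholders, not the figures behind Figures 25 and 26, and the helper name `failure_rates` is introduced only for illustration.

```python
def failure_rates(totals: dict, failed: dict) -> dict:
    """Percentage of failed deliveries per Postal Zone.

    Zones with no deliveries are skipped to avoid dividing by zero.
    """
    rates = {}
    for zone, total in totals.items():
        if total == 0:
            continue
        rates[zone] = 100.0 * failed.get(zone, 0) / total
    return rates


# Hypothetical counts per Postal Zone (not taken from the dataset).
totals = {"60": 5000, "64": 6000, "63": 400}
failed = {"60": 250, "64": 240, "63": 60}

rates = failure_rates(totals, failed)

worst_by_count = max(failed, key=failed.get)  # highest absolute number of failures
worst_by_rate = max(rates, key=rates.get)     # highest percentage of failures

for zone in sorted(rates):
    print(f"Zone {zone}: {failed[zone]} failed of {totals[zone]} ({rates[zone]:.1f}%)")
print(f"Highest absolute failures: Zone {worst_by_count}")
print(f"Highest failure rate: Zone {worst_by_rate}")
```

With these made-up numbers, a high-volume zone leads on absolute failures while a low-volume zone leads on failure rate, which is why the two rankings can disagree.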

TwoFigure27.png

As shown in Figures 27 and 28, the team attempted to identify the Zip Code that caused this high percentage of failed deliveries in Postal Zone 63. These charts indicate that Zip Code 639798 is the primary source of these failed deliveries. With the help of Google Maps, the team identified this area to be within the Nanyang Technological University campus. This is a campus area where the majority of the customers are students, who are often hard to locate.

Failed Delivery - InHouse Drivers

TwoFigure29.png

As seen from Figures 29 to 31, the team attempted to identify the trend in failed deliveries by In-House drivers. However, the 3 charts offered little room for analysis due to the small number of parcel deliveries. As such, the team could not derive any conclusion about failed deliveries for In-House drivers.


EDA - Temporal Distribution

Temporal Distribution

Overview

TwoFigure32.png
TwoFigure34.png

As shown in Figures 32 to 34, the team looked at the delivery records across the months, weeks and days from Jan 2017 till Nov 2017. All 3 graphs indicate a significant drop in the Number of Records, from 1012 on 24th Jan 2017 to 34 on 31st Jan 2017. Based on the team’s research, this sharp fall coincided with Chinese New Year in 2017, which could be the reason for the decline.

In addition, Figures 32 to 34 suggest a general increase in the quantity of parcels being delivered by Company ABC. From Figure 34, the team observed an increase of approximately 300 deliveries in the first half of 2017 alone. This trend seems to be continuing, as shown in Figure 34, where the quantity in November started to reach approximately 1000 deliveries per day.
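The size of this drop can be worked out directly from the two values quoted above. The short sketch below uses only those two figures; the helper name `percent_change` is introduced here for illustration.

```python
from datetime import date


def percent_change(before: float, after: float) -> float:
    """Percentage change from `before` to `after` (negative means a drop)."""
    if before == 0:
        raise ValueError("percentage change is undefined when the starting value is 0")
    return 100.0 * (after - before) / before


# Number of Records quoted in the text for the two dates.
before_day, before_count = date(2017, 1, 24), 1012
after_day, after_count = date(2017, 1, 31), 34

days_apart = (after_day - before_day).days
change = percent_change(before_count, after_count)

print(f"{days_apart} days apart, change of {change:.1f}%")
# prints: 7 days apart, change of -96.6%
```

Because the two dates are exactly one week apart, the comparison is between the same day of the week, so the drop is not explained by an ordinary weekday effect.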

TwoFigure36.png

Figure 36 shows the summary table for all services conducted from Jan 2017 till Nov 2017. In total, there are 6 types of Standard Services and 4 types of Express Services. Based on the table, the team observed that the service known as “Standard 1-3 days” increased from 3,124 to 13,245. In addition, the team recognised that the number of parcels delivered in Nov 2017 by the “Standard 1-3 days” service is quite similar to the number delivered by “Standard Next Day”. Lastly, the team observed that “Standard - Schedule (0900-1200)” is increasing at a slower rate than “Standard 1-3 days”. Given the dynamics and trends observed, it will be interesting to investigate these details on a temporal scale, which the team explores in the later sections.

Service Type - Contractor Drivers

TwoFigure37.png

From Figure 37, the Standard Next Day Service (grey line) is on a much higher scale than the Standard Scheduled Services. To get a clear picture of the delivery distribution for the various Standard Scheduled Services, the team excluded the Standard Next Day Service and plotted the line graph again. The newly plotted Figure 38 illustrates that Standard Scheduled (0900-1200 hours) is the scheduled service most sought after by customers.

TwoFigure39.png
TwoFigure41.png

In addition, the team analysed the variability of the individual services provided by Company ABC. From Figure 39, the team observed that the “Standard Next Day” and “Standard 1-3 Days” services have higher levels of variability than the “Standard Scheduled” and “Express” services.

Furthermore, Figure 40 indicates that the variability of the “Standard Next Day” service and the “Standard 1-3 Days” service is highly similar. On the other hand, Figure 41 illustrates that the “Standard Scheduled Service” time slots (0900-1200 hours) and (1800-2200 hours) experienced much higher variability than the time slots (1200-1500 hours) and (1500-1800 hours). Given this high variability, the team hypothesises that the “Standard Scheduled Service” during (0900-1200 hours) and (1800-2200 hours) can experience sudden surges in demand which, if not managed properly, could cause bottlenecks.
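One way to put the variability comparison on a common footing is sketched below. The source does not state which measure of variability was used in Figures 39 to 41, so the sample standard deviation and the coefficient of variation (standard deviation divided by the mean) are assumptions made here, and the daily counts are hypothetical.

```python
from statistics import mean, stdev


def variability(counts: list) -> dict:
    """Summarise daily delivery counts for one service.

    The coefficient of variation (cv) scales the spread by the mean, so
    services with very different volumes can be compared.
    """
    mu = mean(counts)
    sd = stdev(counts)  # sample standard deviation, needs at least 2 values
    return {"mean": mu, "stdev": sd, "cv": sd / mu if mu else float("nan")}


# Hypothetical daily counts per time slot (not taken from the dataset).
daily_counts = {
    "Standard Scheduled (0900-1200)": [40, 75, 22, 90, 35, 60],
    "Standard Scheduled (1200-1500)": [20, 22, 19, 21, 20, 23],
}

summary = {name: variability(counts) for name, counts in daily_counts.items()}
most_variable = max(summary, key=lambda name: summary[name]["cv"])

for name, stats in summary.items():
    print(f"{name}: mean={stats['mean']:.1f}, stdev={stats['stdev']:.1f}, cv={stats['cv']:.2f}")
print(f"Most variable: {most_variable}")
```

A time slot with a high coefficient of variation is one where daily demand swings widely relative to its usual level, which is the situation the bottleneck hypothesis is concerned with.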

Service Type - InHouse Drivers

TwoFigure42.png

From Figure 42, the team observed that the bulk of the express service consists of Express Service – 6hrs, followed by Express Service – 2hrs. When the team narrowed the data further to specific dates (as shown in Figure 43), it observed that Express Service – 6hrs has higher variability, while the rest of the services provided by In-House drivers have low variability. The team thus concludes that the main service demanded of In-House drivers is Express Service – 6hrs.

Size - Contractor Drivers

In addition to understanding the various service codes, it is important to analyse the size of the parcels assigned to the 2 driver types. As such, the following exploration has been conducted.

TwoFigure44.png

Figure 44 shows that the number of Extra Small parcels fell over the months from Jan 2017 to Apr 2017 before eventually coming to a complete stop in May 2017. In addition, the team observed an increasing trend for all other sizes in 2017, with the highest increase in Small parcels. As a result, Company ABC saw its parcel volume in Nov 2017 triple compared to Feb 2017. For better visualisation, a graphical representation of the data-table, titled Figure 45, is shown below.

TwoFigure45.png

During the EDA, the team excluded data points for the Extra Small size, since its value decreased to 0 by May 2017 and including it in the analysis would not be meaningful. Also, as the previous EDA (shown in Figure 45) was conducted on a monthly basis, the team then broke the data down further to observe whether there are any trends on a smaller time scale.

TwoFigure46.png

From Figure 46, the team can see that the scale for Small parcels is different from the rest, and hence the team split the different sizes into individual line graphs. This allows the team to analyse them separately and to observe any trends.

TwoFigure47.png
TwoFigure48.png

Figure 48, which is a magnified version of the graph above it, shows that the line graphs follow a recurring pattern. Upon closer investigation, the deliveries have the following pattern:

  1. A peak point in the number of deliveries
  2. A sharp decrease in number of deliveries the next day
  3. A recovery in number of deliveries
  4. A sharp decrease in number of deliveries again
  5. Further slight decrease in number of deliveries

Examples of these 5 points are indicated in Figure 48.
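A simple way to check for a repeating weekly shape like the 5 points above is to average the daily counts by day of the week. The sketch below is only an illustration: the counts are hypothetical, and the source does not say which weekday the peak falls on, so the placement of the peak on Monday is an assumption.

```python
from collections import defaultdict
from datetime import date, timedelta
from statistics import mean

WEEKDAYS = ["Monday", "Tuesday", "Wednesday", "Thursday", "Friday", "Saturday", "Sunday"]


def weekday_profile(daily: dict) -> dict:
    """Average number of deliveries for each day of the week."""
    buckets = defaultdict(list)
    for day, count in daily.items():
        buckets[WEEKDAYS[day.weekday()]].append(count)
    return {name: mean(values) for name, values in buckets.items()}


# Hypothetical weekly shape: peak, sharp drop, recovery, sharp drop, slight decline.
week_shape = [900, 500, 750, 400, 350, 300, 280]
start = date(2017, 3, 6)  # a Monday

daily = {}
for week in range(3):
    for offset, base in enumerate(week_shape):
        # Add a small week-on-week increase to mimic growing volume.
        daily[start + timedelta(days=7 * week + offset)] = base + 10 * week

profile = weekday_profile(daily)
peak_day = max(profile, key=profile.get)

for name in WEEKDAYS:
    print(f"{name}: {profile[name]}")
print(f"Peak day: {peak_day}")
```

If the averaged profile reproduces the same rise and fall as the individual weeks, that supports reading the pattern as a weekly cycle rather than a coincidence in a few weeks.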

TwoFigure49.png

Although the scales of the line graphs for the various sizes are different, this pattern can be observed in all of them in general. The team included a second variable in the line graphs, namely the district zones, to determine whether the pattern is consistent across all district zones or whether there is any spatial bias over time. The line graphs with the second variable included are shown in Figure 49, where the pattern can once again be observed.

In order to have a deeper understanding of the aforementioned pattern, the team plotted an area graph.

TwoFigure50.png

From Figure 50, the team observed that the majority of the weeks follow the 5-step pattern mentioned earlier. This indicates that there could be seasonality factors involved, and this pattern could be observed again in the later sections of the EDA.

Size - InHouse Drivers

TwoFigure51.png

While analysing the data for In-House drivers (Figure 51), the team found that “Small” parcels tend to show the highest variability. This variability seems to mirror the variability of Express Service – 6hrs observed in Figure 43. Given this similarity, the team hypothesises that the majority of the parcels handled by In-House drivers are Express Service – 6hrs parcels (Figure 43) that are small in size (Figure 51).