ANLY482 AY2017-18T2 Group 11 EDA
Overview
Exploratory Data Analysis (EDA) – Overview
As a relatively new entrant to the last-mile delivery sector, Company ABC must maintain a high level of efficiency while keeping costs low in order to remain profitable. To keep costs low, Company ABC handles its 2 types of delivery service differently: Standard Service deliveries are outsourced to Contracted drivers, while Express Service deliveries fall primarily on the In-House delivery couriers. To depict these 2 driver types accurately, the later parts of the EDA section analyse them separately, on both a temporal and a spatial scale. Before that, the upcoming section provides an overview of the deliveries conducted by ABC.
Figure 5 shows summary statistics for Contractor and In-House drivers: the quantity of parcels by driver type, service type and parcel size. From the data table, the team identified that the majority of the parcels are small in size and are delivered by Contractor drivers under service type “Z”.
However, this concentration in the “Small” category is not unexpected, as Figure 6 indicates that 85.74% of all parcels delivered are “Small” in size. This could be the reason why Contracted drivers primarily deliver “Small”-sized parcels.
In addition, as shown in Figure 7, 99.54% of all deliveries are conducted by Contracted drivers, compared with only 0.46% by In-House drivers. This indicates that the services handled by Contracted drivers are highly sought after, which accounts for the large number of deliveries shown in Figure 4.
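Percentage shares such as those quoted for parcel size and driver type can be derived from a per-parcel list of category labels. The following is a minimal sketch only: the function name `share_by_category` and the sample list are illustrative assumptions, not the team's actual workflow or Company ABC's data.

```python
from collections import Counter


def share_by_category(values):
    """Return each category's share of the total, as a percentage."""
    counts = Counter(values)
    total = sum(counts.values())
    return {category: 100.0 * n / total for category, n in counts.items()}


# Made-up sample (one label per delivered parcel), NOT Company ABC data.
sizes = ["Small", "Small", "Small", "Medium", "Small", "Large", "Small", "Small"]

shares = share_by_category(sizes)
for category, pct in sorted(shares.items(), key=lambda item: -item[1]):
    print(f"{category}: {pct:.2f}%")
```

The same helper could be applied to a list of driver-type labels to obtain a Contracted versus In-House split.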
EDA - Spatial Distribution
Spatial Distribution
Location - Contractor Drivers
The deliveries conducted by the Contracted drivers were analysed on a spatial scale. The hierarchy starts with parcel distribution by Zone Code (comprising West and Central), which is then sub-divided and analysed by District Code, Postal Zone and, lastly, Postal Code. Based on the tree-maps generated, Figures 8 to 10 indicate that the highest volume for the Contracted drivers falls in the Western Region, District Code 22 and Postal Zone 64. On the other hand, Postal Zone 62 has the lowest volume within District Code 22.
In addition, as District Code 22 had the highest number of deliveries, the team expected one of the Postal Codes within this district to have the highest number of deliveries. However, Figure 11 indicates that the individual Postal Code with the highest number of deliveries is Postal Code 228241, which lies within District Code 9, a far cry from the team’s expectation. As such, the team will analyse this Postal Code and its Postal Zone in greater depth.
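The observation that the busiest Postal Code need not sit inside the busiest higher-level area can be reproduced by counting the same records at two levels of the hierarchy. The sketch below assumes that a Postal Zone is the first two digits of the six-digit Postal Code, which is consistent with the codes quoted in this section but is not stated explicitly in the source. The delivery list is hypothetical.

```python
from collections import Counter


def postal_zone(postal_code):
    """Assumed mapping: Postal Zone = first two digits of the 6-digit code."""
    return int(str(postal_code).zfill(6)[:2])


# Hypothetical records (one Postal Code per parcel), NOT Company ABC data.
deliveries = ["640001", "640002", "640003", "640004",
              "228241", "228241", "228241"]

by_code = Counter(deliveries)                               # leaf level
by_zone = Counter(postal_zone(code) for code in deliveries)  # parent level

top_code, top_code_count = by_code.most_common(1)[0]
top_zone, top_zone_count = by_zone.most_common(1)[0]

print("Busiest Postal Zone:", top_zone, "with", top_zone_count, "parcels")
print("Busiest Postal Code:", top_code, "with", top_code_count, "parcels")
print("Busiest code lies in busiest zone:", postal_zone(top_code) == top_zone)
```

In this made-up sample, the busiest zone spreads its parcels over many addresses while a single address dominates a smaller zone, which is the kind of situation a tree-map at the zone level alone would hide.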
Based on the analysis of Postal Zone 22 (Figure 12), the team identified several hotspots (Postal Codes) in which deliveries were as high as 626 during the period of January to November 2017. Postal Code 228241 is one of these locations with higher parcel deliveries. Hence, the deliveries in 2017 for this Postal Code are analysed.
Based on Figure 13, the team observed that there are many days on which the number of records is above “1”. This means that there are multiple delivery orders to the same location on a particular day. This initially seemed counter-intuitive to the team, as it hints that the driver visits the same location multiple times to complete the parcel deliveries. The team investigated this phenomenon and learnt that it is possible to have multiple deliveries to the same building within a single trip. For example, the driver can complete 2 deliveries in the same building on 2 different levels.
Besides using quantity as the variable within the tree-map, the team also explored using “Size” as an added dimension to the tree-map and the results are as follows.
After using size as a dimension, Postal Zone 22 is still one of the more interesting zones, as it has a relatively small parcel volume yet a concentration that is comparable to other Postal Zones. As shown above, the parcels within Postal Zone 22 are concentrated in Zip Code 228241, where the majority of the parcels delivered are Medium-sized. This is unlike many other areas, where Small parcels are typically the primary parcel size that drivers deliver. As such, the team delved further into this Zip Code.
Based on Figure 15, the team observed several unique characteristics of this Zip Code. Firstly, its Small parcels show less variation than in the other zones. Secondly, its Medium-sized parcels show higher variation than the other sizes. Lastly, demand for deliveries of Medium and Large-sized parcels only began in May. The variation in Medium parcels is therefore the main reason for the fluctuations seen in Figure 13.
Location - InHouse Drivers
The deliveries conducted by the In-House drivers were also analysed on a spatial scale. As shown in Figures 16 to 19, the hierarchy starts with parcel distribution by Zone Code (comprising West and Central), which is then sub-divided and analysed by District Code, Postal Zone and, lastly, Postal Code. Figures 16 and 17 show that parcels tend to be concentrated in the Western region of Singapore and, in particular, in District Code 1. When subdivided by Postal Zone, as shown in Figure 18, the majority of the parcels are delivered to Postal Zones 4, 23 and 11. Thus, one would expect a Zip Code within one of these 3 Postal Zones to have a high concentration of parcels. However, just as in the case of the Contracted drivers, the majority of the parcels are concentrated at a Zip Code that does not fall within these 3 Postal Zones. As shown in Figure 19, when subdivided into individual Zip Codes, parcels are concentrated in 079912; this Zip Code is therefore examined further in the later parts of this EDA.
Based on Figure 20, the team observed that most of the parcels within Postal Zone 7 are concentrated at one spot, which happens to be Zip Code 079912. The number of deliveries for this zone, as shown in Figure 21, is in constant flux and typically hovers between “1” and “4”.
As displayed in Figures 22 and 23, after adding the “Size” dimension to the EDA, the team observed that Zip Code 079912’s parcel volume is driven by a high volume of Medium and Large parcels.
Figure 24 shows an in-depth analysis of Zip Code 079912. Based on the figure, the team observed that the volume of Small parcels has remained relatively stable, while Medium and Large parcels have seen slightly more fluctuation. However, these fluctuations are still less dramatic than those experienced by Contractor drivers, as mentioned earlier.
Failed Delivery - Contractor Drivers
After analysing the distribution of parcels based on location, the following sections elaborate on the deliveries that Contractor drivers failed to complete. A failed delivery occurs when a delivery trip does not succeed in delivering the parcel to the downstream customer and a redelivery trip has to be made. Failed deliveries are something any courier service wishes to minimise, as they result in higher operational costs.
Figure 25 depicts the number of failed deliveries in each Postal Zone. The team noticed that a total of 11 Postal Zones have more failed parcels than the average. Of these 11, Postal Zones 60 and 64 are the two zones with the highest absolute number of failed deliveries.
However, the absolute number of failed deliveries is highly dependent on the total number of parcels delivered. As such, the absolute number would be an inaccurate measure of each zone’s level of failed deliveries. Thus, the team decided to explore this problem further by using the percentage of failed deliveries as the measure instead.
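The difference between ranking zones by absolute failures and ranking them by failure percentage can be illustrated as follows. This is a minimal sketch under stated assumptions: the function name `failure_rates` is hypothetical, and the totals and failure counts are invented for illustration rather than taken from Figures 25 and 26.

```python
def failure_rates(totals, failures):
    """Percentage of failed deliveries per zone.

    Zones with no deliveries are skipped to avoid dividing by zero.
    """
    return {
        zone: 100.0 * failures.get(zone, 0) / total
        for zone, total in totals.items()
        if total > 0
    }


# Invented figures for illustration only, NOT Company ABC data.
totals = {60: 5000, 63: 400, 64: 6000}    # parcels delivered per Postal Zone
failures = {60: 250, 63: 60, 64: 240}     # failed deliveries per Postal Zone

rates = failure_rates(totals, failures)
worst_by_count = max(failures, key=failures.get)
worst_by_rate = max(rates, key=rates.get)

print("Highest absolute failures: zone", worst_by_count)
print("Highest failure percentage: zone", worst_by_rate)
```

With these invented numbers, a high-volume zone tops the absolute ranking while a low-volume zone tops the percentage ranking, which is why the normalised measure can lead to a different conclusion.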
Based on Figure 26, the team identified Zone 63 as the zone with the highest percentage of failed deliveries. This is in stark contrast with the findings from Figure 25, and as such, the team will endeavour to find out more about the reasons behind this high failure rate.
As shown in Figures 27 and 28, the team attempted to identify the Zip Code that caused this high percentage of failed deliveries in Postal Zone 63. These charts indicate that Zip Code 639798 is the primary source of these failed deliveries. Using Google Maps, the team identified this area as lying within the Nanyang Technological University campus. In campus areas such as this, the majority of the target customers are students, who are often hard to locate.
Failed Delivery - InHouse Drivers
As seen in Figures 29 to 31, the team attempted to identify the trend in failed deliveries by In-House drivers. However, the 3 charts did not support any meaningful analysis because of the small number of parcel deliveries. As such, the team could not draw any conclusion about failed deliveries for In-House drivers.
EDA - Temporal Distribution
Temporal Distribution
Overview
As shown in Figures 32 to 34, the team looked at the delivery records by month, week and day from Jan 2017 to Nov 2017. All 3 graphs indicate a significant drop in the Number of Records, from 1012 on 24 Jan 2017 to 34 on 31 Jan 2017. Based on the team’s research, this sharp fall coincided with Chinese New Year in 2017, which could possibly be the reason for the decline.
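To put the size of this drop in perspective, the relative change between the two quoted figures can be worked out directly. The helper name `percent_change` is illustrative only; the two input values are the ones quoted in the text.

```python
def percent_change(before, after):
    """Relative change from `before` to `after`, in percent."""
    if before == 0:
        raise ValueError("percent change is undefined when the starting value is 0")
    return 100.0 * (after - before) / before


# Figures quoted in the text: 1012 records on 24 Jan 2017, 34 on 31 Jan 2017.
drop = percent_change(1012, 34)
print(f"{drop:.1f}%")  # prints -96.6%
```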
In addition, Figures 32 to 34 suggest a general increase in the quantity of parcels delivered by Company ABC. From Figure 34, the team observed an increase of approximately 300 deliveries in the first half of 2017 alone. This trend seems to be continuing, as shown in Figure 34, where the quantity in November started to reach approximately 1000 deliveries per day.
Figure 36 shows the summary table for all services conducted from Jan 2017 to Nov 2017. In total, there are 6 types of Standard Services and 4 types of Express Services. Based on the table, the team observed that the service known as “Standard 1-3 days” saw an increase from 3,124 to 13,245. In addition, the team recognised that the number of parcels delivered in Nov 2017 by the “Standard 1-3 days” service is quite similar to the number delivered by “Standard Next Day”. Lastly, the team observed that “Standard – Schedule (0900-1200)” is increasing at a slower rate than “Standard 1-3 days”. Given the dynamics and trends observed earlier, it will be interesting to investigate these details on a temporal scale, which the team will explore in the later sections.
Service Type - Contractor Drivers
In Figure 37, the Standard Next Day Service (grey line) is on a much larger scale than the Standard Scheduled Services. To obtain a clear picture of the delivery distribution for the various Standard Scheduled Services, the team excluded the Standard Next Day Service and plotted the line graph again. The newly plotted Figure 38 illustrates that Standard Scheduled (0900-1200 hours) is the scheduled service most sought after by customers.
In addition, the team analysed the variability of the individual services provided by Company ABC. From Figure 39, the team observed that the “Standard Next Day” and “Standard 1-3 Days” services have higher levels of variability than the “Standard Scheduled” and “Express” services.
Furthermore, Figure 40 indicates that the variability of the “Standard Next Day” and “Standard 1-3 Days” services is highly similar. On the other hand, Figure 41 illustrates that the “Standard Scheduled Service” for time slots (0900-1200 hours) and (1800-2200 hours) experienced much higher variability than the time slots (1200-1500 hours) and (1500-1800 hours). Given this high variability, the team hypothesises that the “Standard Scheduled Service” during (0900-1200 hours) and (1800-2200 hours) can experience a sudden surge in demand which, if not managed properly, could cause bottlenecks.
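The source does not state which measure of variability was read off the figures. One common scale-free choice, swapped in here purely as an assumption, is the coefficient of variation (standard deviation divided by mean) of the daily delivery counts per service. The service labels follow the text, but the daily counts are invented.

```python
from statistics import mean, pstdev


def coefficient_of_variation(daily_counts):
    """Population standard deviation divided by the mean (scale-free spread)."""
    m = mean(daily_counts)
    if m == 0:
        return float("nan")  # undefined when there are no deliveries at all
    return pstdev(daily_counts) / m


# Invented daily counts for illustration only, NOT Company ABC data.
daily = {
    "Standard Scheduled (0900-1200)": [10, 30, 5, 40, 15],
    "Standard Scheduled (1200-1500)": [18, 20, 22, 19, 21],
}

cv = {service: coefficient_of_variation(counts) for service, counts in daily.items()}
for service, value in sorted(cv.items(), key=lambda item: -item[1]):
    print(f"{service}: CV = {value:.2f}")
```

Dividing by the mean allows services of very different volumes, such as Standard Next Day and the scheduled slots, to be compared on the same footing.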
Service Type - InHouse Drivers
From Figure 42, the team observed that the bulk of express service consists of Express Service – 6hrs, followed by Express Service – 2hrs. When the team narrowed the data down to specific dates (as shown in Figure 43), it observed that Express Service – 6hrs has higher variability, while the rest of the services provided by In-House drivers have low variability. The team thus concludes that the main service typically demanded of In-House drivers is Express Service – 6hrs.
Size - Contractor Drivers
In addition to understanding the various service codes, it is imperative to analyse the sizes of the parcels assigned to the 2 driver types. As such, the following exploration was conducted.
Figure 44 shows that the number of Extra Small parcels fell from Jan 2017 to Apr 2017 before coming to a complete stop in May 2017. In addition, the team observed an increasing trend for all other sizes in 2017, with the largest increase in Small parcels. As a result, Company ABC saw its parcel volume in Nov 2017 triple compared with Feb 2017. For better visualisation, a graphical representation of the data table, Figure 45, is shown below.
During the EDA, the team excluded the data points for the Extra Small size, since its volume decreased to 0 by May 2017 and including it would not be meaningful. Also, as the previous EDA (shown in Figure 45) was conducted on a monthly basis, the team expanded the data to observe whether there are any trends on a smaller time scale.
From Figure 46, it can be seen that the scale for Small parcels is different from the rest; hence the team split the different sizes into individual line graphs. This allows the team to analyse them separately and to observe any trends.
Figure 48, a magnified version of the graph above it, shows that the line graph follows a recurring pattern. Upon closer investigation, the deliveries have the following pattern:
- A peak point in the number of deliveries
- A sharp decrease in number of deliveries the next day
- A recovery in number of deliveries
- A sharp decrease in number of deliveries again
- Further slight decrease in number of deliveries
Examples of these 5 points are indicated in Figure 48.
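A repeating within-week shape like the one listed above can be checked by averaging the daily delivery counts by day of the week. The sketch below assumes that the pattern is aligned to weekdays, which the source does not state. The function name `weekday_profile` is hypothetical and the two-week series is synthetic.

```python
from collections import defaultdict
from datetime import date, timedelta
from statistics import mean


def weekday_profile(daily_counts):
    """Average deliveries per weekday (0 = Monday ... 6 = Sunday)."""
    buckets = defaultdict(list)
    for day, count in daily_counts.items():
        buckets[day.weekday()].append(count)
    return {weekday: mean(counts) for weekday, counts in sorted(buckets.items())}


# Synthetic two-week series for illustration only, NOT Company ABC data.
start = date(2017, 6, 5)                   # a Monday
pattern = [100, 60, 80, 50, 45, 40, 20]    # peak, drop, recovery, drop, slight declines
daily_counts = {
    start + timedelta(days=i): pattern[i % 7] + (i // 7) * 10
    for i in range(14)
}

profile = weekday_profile(daily_counts)
names = ["Mon", "Tue", "Wed", "Thu", "Fri", "Sat", "Sun"]
for weekday, avg in profile.items():
    print(names[weekday], avg)
```

If the averaged profile shows the same peaks and drops as the individual weeks, that supports treating the pattern as a weekly seasonality rather than a coincidence of a few weeks.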
Although the scales of the line graphs for the various sizes are different, this pattern can generally be observed in all of them. The team added a second variable to the line graphs, namely the district zone, to determine whether the pattern is consistent across all district zones or whether there is any spatial bias over time. The line graphs with the second variable are shown in Figure 49, where the pattern can once again be observed.
In order to have a deeper understanding of the aforementioned pattern, the team plotted an area graph.
From Figure 50, the team observed that the majority of the weeks follow the 5-step pattern mentioned earlier. This indicates that there could be seasonality factors involved, and this pattern could be observed again in the later sections of the EDA.
Size - InHouse Drivers
While analysing the data for In-House drivers (Figure 51), the team found that “Small” parcels tend to show the highest variability. This variability seems to mirror the variability of the In-House drivers’ deliveries observed in Figure 43. Given this similarity, the team hypothesises that the majority of the parcels handled by In-House drivers are Express Service – 6hrs (Figure 43) and small in size (Figure 51).