ANLY482 AY2017-18T2 Group10 Analysis & Findings

From Analytics Practicum
Jump to navigation Jump to search

Tennet logo.png


HOME

ABOUT US

PROJECT OVERVIEW

ANALYSIS & FINDINGS

PROJECT MANAGEMENT

BACK TO MAIN ANLY482

EDA

Recommendations

Model

Due to confidentiality, we will not be able to upload any charts onto the wiki. The fully disclosed analysis report is available in our Interim Report submission.

EDA For Inventory Data

80/20 Analysis

Given that the sponsor has a few hundred items in their inventory listing, we would like to identify which items make up a larger percentage of the total usage cost. We could then choose to focus on items which make up the bulk of the inventory costs, as the impact of any inventory forecasting model would be greater for those items.

From our analysis, we have found out that the top 70 items make up 60% of the total usage cost for the past two years, while the top 140 items make up almost 80% of the total usage cost for the same duration. The detailed From this we can tell that a substantial proportion of items make up 80% of the total usage value and thus not quite ‘80-20’.

The main cost drivers are largely the meat items, such as different cuts of lamb and beef, as opposed to the vegetables. Seafood items such as crayfish and prawns also take up a high percentage of total usage cost. This is not surprising since consumers want to maximize their “return” i.e. getting the most value for what they pay for. This leads to higher meat consumption since meat is often seen as “higher value” due to its higher cost. Models to be developed will take this into account and cater more accuracy towards predicting these items.


Total Inventory Usage Across Months Breakdown by Outlet

Next, we want to observe the general trend of inventory usage cost over the months to see if there is any seasonal trend in usage cost. Given that there are only two years worth of data, any trends observed might have other confounding factors and may not be completely attributed to the time of year.

Next, we can observe that the December months incurred the larger amount of inventory costs. On the other hand, the month of March for both years incurred one of the lowest inventory costs. Generally, inventory costs are increasing from the month of April up till December. Next, we also want to investigate the proportion of each outlet’s contribution to the total inventory cost by looking at the 100% stacked chart. We can see that the percentage of each outlet’s contribution to the total inventory cost does not vary substantially even though the actual usage values do fluctuate across the months. This shows that each outlet appears to be affected by the seasonality factors by roughly the same amount.

Finally, we also want to check if the inventory usage for the different outlets are statistically different. To do this we generated the quantiles, connecting letters report and the ordered differences report in JMP Pro 13.

From the connecting letters report we can see that there are two pairs of outlets which are not significantly different from each other - Outlet A - Outlet B and Outlet C- Outlet D. This shows that the outlets located at major heartland areas (A & B) - are not significantly different. While C and D are also both similar and are both located in town. Apart from these two pairs, all other pairs of outlets are significantly different from each other.


Inventory Usage Across Months Breakdown By Product

Subsequently, we wish to check if there are any trends present in the inventory usage for individual products across the months. In general, over the entire year, we observe that there is little trend for all the ingredients on the whole and that some ingredients have a more noticeable trend pattern than others. There are, however, some slight spikes in usage for the months of December and August for a majority of the items.

The high usage items (Usage Value > $10,000) appear to have a more noticeable trend than the others. These items include the “XXX” , “YYY” and the “ZZZ” (Yellow). However, we also observe that these items have been discontinued with effect from May 2017. The low usage items (Usage Value < $10,000) on the other hand, show minimal changes across the months and are fairly constant throughout the year. Another observation is on the introduction of new items such as the “YYY- XXX” which was added to the menu only in April 2017 but has seen steady growth.

EDA For Sales Data

Analysis of Revenue per Outlet per Day

Given that management would like to know if there are discrepancies in the ordering across the various outlets, a revenue analysis was performed to determine each outlet's respective contribution to the total revenue. Interestingly, we note that the top revenue generating outlet overall, was not always the top performing outlet in each individual month. This means that month-on-month performance does tend to fluctuate across outlets.

Next, to develop a more appropriate inventory model, distribution of sales was analysed on a per day basis. From the charts, it can be seen that sales tends to peak during the Friday - Sunday period. Such an observation is not surprising given that this period is considered to be the “weekend” period where customers have more time to spend dining. On closer observation, the distribution of sales across each outlet on a daily basis is relatively consistent, indicating that consumption pattern may not be random. Furthermore, the proportion of customers contributed by each outlet to the total is also fairly consistent throughout the week.

Applying the Tukey-Kramer’s test to generate the Connecting Letters Report and Ordered Differences Report for revenue against day of the week, we note that differences in revenue between days of the week are largely statistically significant (p-value < 0.0001). Thus, it may be said that the day of the week would play a part in predicting inventory usage, since it reflects the consumer demand.

Analysis of No. of Pax per Outlet per Day

From the quantiles output, it can be seen that average number of customers per day does vary quite greatly from outlet to outlet. This is further supported by the Connecting Letters Report and Ordered Differences Report generated in JMP Pro 13. From the Connecting Letters Report, it can be seen that three of the outlets are fairly similar, while the remaining three outlets are dissimilar from the first three outlets as well as each of the other outlets. This is further illustrated by the Ordered Differences Report indicating a p-value of < 0.0001. This means that if outlet name is used as a variable for modelling, these differences must be taken into consideration to ensure a more appropriate model.

EDA For PLU Data

---