ANLY482 AY2017-18T2 Group10 Project Overview: Data
Data Overview
The sponsor provided the data as Microsoft Excel files, one for each outlet by month. So far, they have provided a total of 24 months of data, from Dec 2015 to Dec 2017. The data given to us comprises Inventory Data, Monthly PLU (Programmable Logic Unit) Data and Sales Data. One limitation is that the company recently changed the format of the inventory data, so we will be working with 2 different formats of inventory data. Below is a short description of each dataset:
Dataset Name | Dataset Description |
---|---|
Inventory | Describes the inventory orders for each outlet; the data is updated daily. |
Sales | Describes the daily sales for each outlet, by month. |
MonthPLU | Describes the daily number of patrons for each outlet, by type of meal. |
PLU Data Cleaning Process
Inventory Data Cleaning Process
The format of the Monthly Inventory Data changed from October 2017 onwards, so there are two formats, with different column names and a different number of columns. To perform our analysis and EDA, we therefore had to process the two formats separately. Using Python scripts, we extracted the necessary columns from each file, standardised the column names and compiled them into a single CSV file, ‘Inventory_Processed_2016-2017.csv’.
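The extract, standardise and compile steps can be sketched roughly as below. This is a minimal illustration, not the project's actual script. The column names in `OLD_FORMAT_COLUMNS` and `NEW_FORMAT_COLUMNS`, the standard names, the outlet label and the sample rows are all hypothetical, since the real headers of the two formats are not listed here. It also assumes each monthly Excel sheet has already been exported to CSV text, so that only the Python standard library is needed.

```python
import csv
import io

# Hypothetical mappings from each format's column names to standard names.
# The real headers of the two inventory formats are not given in the text.
OLD_FORMAT_COLUMNS = {"Item Name": "item", "Qty": "quantity", "Order Date": "date"}
NEW_FORMAT_COLUMNS = {"Product": "item", "Quantity Ordered": "quantity", "Date": "date"}
STANDARD_COLUMNS = ["outlet", "date", "item", "quantity"]


def detect_mapping(header):
    """Return the mapping whose source columns all appear in the header."""
    for mapping in (OLD_FORMAT_COLUMNS, NEW_FORMAT_COLUMNS):
        if set(mapping) <= set(header):
            return mapping
    raise ValueError(f"Unrecognised inventory format: {header}")


def standardise(reader, outlet):
    """Yield rows with standard column names, keeping only the needed columns."""
    mapping = detect_mapping(reader.fieldnames)
    for row in reader:
        record = {new: row[old] for old, new in mapping.items()}
        record["outlet"] = outlet
        yield record


def compile_inventory(sources, out_file):
    """Write every (outlet, file) source into one CSV; return the row count."""
    writer = csv.DictWriter(out_file, fieldnames=STANDARD_COLUMNS)
    writer.writeheader()
    count = 0
    for outlet, handle in sources:
        for record in standardise(csv.DictReader(handle), outlet):
            writer.writerow(record)
            count += 1
    return count


# Made-up sample data: one file in each format for a hypothetical outlet.
old_file = io.StringIO("Item Name,Qty,Order Date,Remarks\nRice,10,2017-09-01,ok\n")
new_file = io.StringIO("Date,Product,Quantity Ordered\n2017-10-01,Rice,12\n")
output = io.StringIO()
rows_written = compile_inventory(
    [("Outlet A", old_file), ("Outlet A", new_file)], output
)
print(output.getvalue())
```

Detecting the format from the header, rather than from the file date, means a file is handled correctly even if it is misnamed, and an unexpected third format fails loudly instead of being silently merged.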