Difference between revisions of "ANLY482 AY2017-18 Group9: Project Overview/ Methodology"

Revision as of 16:45, 25 February 2018

DATA COLLECTION / PREPARATION

After understanding the problems faced by KOI and proposing potential solutions, we requested the datasets required to perform our analysis. In particular, we will target sales and wastage data to optimize inventory reordering.

To facilitate our analysis, KOI has kindly provided our team with data for 47 outlets covering 1-2 years (January 2016 to December 2017). The client wishes to focus on the latest business fiscal year, so we will be provided with the latest data obtained. The types of data obtained are summarized in the table below.

Data Summary

Delivery Data
A row in this table represents a specific delivery ordered by a branch. The main columns in this table are described as follows:

Additional columns have also been created to aid our analysis, as follows:

Promotion Data
A row in this table represents a specific promotional campaign held in a given period. The main columns in this table are described as follows:

Outlet Data
A row in this table represents the information for one KOI branch outlet. The main columns in this table are described as follows:

EXPLORATORY DATA ANALYSIS

Our main project aim is to optimize the inventory reorder point with a 15-20% safety surplus. Using EDA, we will identify differences between orders from 2016 and 2017, taking into account factors such as the launch of new outlets. Additionally, we will analyze the reordering frequency of each outlet to identify the outlets with the most reorders in terms of quantity and number of orders. We will then perform a cluster analysis on products with similar trends and provide business recommendations for outlets with similar reordering frequency in similar regions.
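As a rough illustration of the 15-20% safety surplus, the sketch below pads a forecast demand figure by a surplus percentage. The function name `reorder_quantity`, the rounding-up rule and the sample figures are our own assumptions for illustration and are not taken from KOI's data.

```python
import math

def reorder_quantity(forecast_units, surplus_pct=15):
    """Pad a forecast (in whole units) by a safety surplus and round up.

    surplus_pct is a whole-number percentage; the 15-20 range comes from
    the project aim, while rounding up is an assumption of this sketch.
    """
    if not 15 <= surplus_pct <= 20:
        raise ValueError("surplus_pct is expected to lie between 15 and 20")
    # Integer arithmetic before the division avoids floating-point drift.
    return math.ceil(forecast_units * (100 + surplus_pct) / 100)

# Hypothetical example: a forecast of 100 units with a 15% surplus.
print(reorder_quantity(100, 15))
```

Keeping the surplus as a parameter makes it easy to compare the two ends of the 15-20% range for the same forecast.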

DATA CLEANING

Our group identified a number of redundant columns in the Delivery Data that were deemed unimportant to our analysis and subsequently removed them.
These columns are Time (the time at which the order was placed by the staff) and Type (the type of order placed).

Next, the delivery data also contained several overseas and closed branches. The overseas branches include KOI TE (Thailand), Cambodia Karanak KOI Café, Jakarta Koi, KOI Myanmar and Vietnam; the closed outlets include Esplanade, which closed in May 2017. These records were deemed noise and removed. We also removed the 2015 and 2018 data, as our analysis covers two years: 2016 and 2017.
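The cleaning steps described above could be sketched roughly as follows. The column names (`Branch`, `Date`, `Time`, `Type`, `Quantity`), the function name `clean_deliveries` and the sample rows are hypothetical; the actual Delivery Data headers and branch spellings may differ.

```python
from datetime import date

# Hypothetical column names; the real Delivery Data headers may differ.
REDUNDANT_COLUMNS = {"Time", "Type"}
# A few of the branches named above; extend with the remaining
# overseas and closed outlets as they appear in the data.
EXCLUDED_BRANCHES = {"KOI TE (Thailand)", "Jakarta Koi", "KOI Myanmar"}
ANALYSIS_YEARS = {2016, 2017}

def clean_deliveries(rows):
    """Drop excluded branches, out-of-range years and redundant columns."""
    cleaned = []
    for row in rows:
        if row["Branch"] in EXCLUDED_BRANCHES:
            continue
        if row["Date"].year not in ANALYSIS_YEARS:
            continue
        cleaned.append(
            {key: value for key, value in row.items()
             if key not in REDUNDANT_COLUMNS}
        )
    return cleaned

# Made-up rows purely for illustration.
sample_rows = [
    {"Branch": "Outlet A", "Date": date(2016, 3, 1), "Time": "09:00",
     "Type": "Normal", "Quantity": 12},
    {"Branch": "Jakarta Koi", "Date": date(2017, 5, 2), "Time": "10:30",
     "Type": "Normal", "Quantity": 8},
    {"Branch": "Outlet B", "Date": date(2018, 1, 5), "Time": "11:15",
     "Type": "Urgent", "Quantity": 5},
]
cleaned_rows = clean_deliveries(sample_rows)
```

Only the first sample row survives: the second belongs to an excluded branch and the third falls outside 2016-2017.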

MODEL SELECTION

Next, seasonality analysis will be carried out to determine the optimal restock amount for each day using past data. We will use Quantitative Forecasting Analysis to help identify the optimal restock amount. This forecasting approach uses historical data to predict future demand for goods. The more data is available, the more accurate the picture of historical demand will be. Furthermore, although this model provides a basis for forecasting, demand could be affected by seasonality. Hence, our team is considering two time series models: the Naive Approach and the Seasonal Naive Approach. In our opinion, the Seasonal Naive Approach is the more appropriate model for our analysis. However, due to the limited data provided, our team is also considering the more generic Naive Approach to ensure that we take the full picture into account.
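To make the two candidate models concrete, here is a minimal sketch of both forecasts in plain Python. The function names, the weekly season length of 7 days and the sample quantities are assumptions for illustration; the season length actually suited to KOI's data would need to be confirmed during EDA.

```python
def naive_forecast(history, horizon):
    """Naive Approach: repeat the last observed value for every future period."""
    if not history:
        raise ValueError("history must not be empty")
    return [history[-1]] * horizon

def seasonal_naive_forecast(history, horizon, season_length=7):
    """Seasonal Naive Approach: reuse the value from the same position
    in the most recent full season (e.g. the same weekday last week)."""
    if len(history) < season_length:
        raise ValueError("history must cover at least one full season")
    last_season = history[-season_length:]
    return [last_season[step % season_length] for step in range(horizon)]

# Made-up daily order quantities covering two weeks.
daily_orders = [10, 12, 14, 20, 22, 30, 28,
                11, 13, 15, 21, 23, 31, 29]

print(naive_forecast(daily_orders, 3))
print(seasonal_naive_forecast(daily_orders, 9))
```

The naive forecast is flat, while the seasonal naive forecast reproduces last week's weekday pattern, which is why it tends to suit demand with a weekly cycle.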

MODEL VALIDATION

To validate our model, we will separate our data into two sets: 70% of the data will be used as the training set and the remaining 30% as the test set. After validating our model, we will modify it and validate it again. This process will repeat until we are satisfied with the model's performance.
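A minimal sketch of the 70/30 split is shown below. Two things here are our assumptions rather than decisions stated above: the split is chronological (earlier observations for training, later ones for testing), which is the usual choice for time series, and mean absolute error is used as the example performance measure. The function names are hypothetical.

```python
def chronological_split(series, train_pct=70):
    """Split a time-ordered series into training and test portions."""
    # Integer arithmetic keeps the cut point exact.
    cut = len(series) * train_pct // 100
    return series[:cut], series[cut:]

def mean_absolute_error(actual, predicted):
    """Average absolute difference between actual and predicted values."""
    if len(actual) != len(predicted) or not actual:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)

# Made-up series of ten daily quantities.
series = [10, 12, 14, 20, 22, 30, 28, 11, 13, 15]
train, test = chronological_split(series)

# Score a naive forecast (last training value repeated) on the test set.
naive_predictions = [train[-1]] * len(test)
error = mean_absolute_error(test, naive_predictions)
```

Each round of model modification can then be compared on the same test portion using the same error measure.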

@@ Line 79: / Line 79: @@
 ==<div style="background: #40403E; line-height: 0.3em; font-family:helvetica;  border-left: #FFA500 solid 15px;"><div style="border-left: #FFFFFF solid 5px; padding:15px;font-size:15px;"><font color= "#F2F1EF"><strong>DATA CLEANING</strong></font></div></div>==
-Missing values and outliers observed during our exploratory data analysis may introduce unnecessary inaccuracy and skew our analysis. To handle missing values, we will look at the missing values / blanks identified and determine whether each value should be replaced with our estimate or simply removed. For outliers, our team will attempt to analyze the underlying reason behind each occurrence and decide whether it is important and significant enough to be included in our analysis.
+Our group identified a number of redundant columns in the Delivery Data that were deemed unimportant to our analysis and subsequently removed them. <br>
+These columns are Time (the time at which the order was placed by the staff) and Type (the type of order placed).<br><br>
+Next, the delivery data also contained several overseas and closed branches. The overseas branches include KOI TE (Thailand), Cambodia Karanak KOI Café, Jakarta Koi, KOI Myanmar and Vietnam; the closed outlets include Esplanade, which closed in May 2017. These records were deemed noise and removed. We also removed the 2015 and 2018 data, as our analysis covers two years: 2016 and 2017.
 <br>
 <!--
@@ Line 87: / Line 89: @@
 <br>
 -->
 ==<div style="background: #40403E; line-height: 0.3em; font-family:helvetica;  border-left: #FFA500 solid 15px;"><div style="border-left: #FFFFFF solid 5px; padding:15px;font-size:15px;"><font color= "#F2F1EF"><strong>MODEL SELECTION</strong></font></div></div>==
 Next, seasonality analysis will be carried out to determine the optimal restock amount for each day using past data. We will use Quantitative Forecasting Analysis to help identify the optimal restock amount. This forecasting approach uses historical data to predict future demand for goods. The more data is available, the more accurate the picture of historical demand will be. Furthermore, although this model provides a basis for forecasting, demand could be affected by seasonality. Hence, our team is considering two time series models: the Naive Approach and the Seasonal Naive Approach. In our opinion, the Seasonal Naive Approach is the more appropriate model for our analysis. However, due to the limited data provided, our team is also considering the more generic Naive Approach to ensure that we take the full picture into account.
