AY1516 T2 Team Skulptors - Methodology & Technology

From Analytics Practicum


Methodology


The following methodology will be used to analyze the past year's data provided by our sponsor for the two companies. Before the analysis begins, the data must be thoroughly examined and prepared so that flawed input does not lead to flawed results (garbage in, garbage out).

Data Preparation

Before any elementary exploration of the data, we will clean it to remove records that should not be included in our analysis. Reasons for exclusion include:

  • Repetition – Double scanning happens frequently in the company’s warehouse. As a result, the past year's data may contain repeated rows that would distort our findings.
  • Missing data – Any row with incomplete data will be excluded.
  • Over-spilling – As our project focuses on the inflow and outflow rates of SKU movement, we will exclude data that does not cover an SKU's movement from start to end.
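
The three exclusion rules above can be sketched in Java, the language we plan to use for the application. This is only a minimal sketch under stated assumptions: the class names MovementRow and DataPreparation, the four fields, and the idea of checking over-spilling against the start and end dates of the data period are all hypothetical, since the sponsor's actual column layout is not described here.

```java
import java.time.LocalDate;
import java.util.ArrayList;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Objects;
import java.util.Set;

/** Hypothetical row layout; the real columns in the sponsor's data may differ. */
class MovementRow {
    final String sku;
    final LocalDate inbound;   // date the SKU entered the warehouse
    final LocalDate outbound;  // date the SKU left the warehouse
    final Integer quantity;

    MovementRow(String sku, LocalDate inbound, LocalDate outbound, Integer quantity) {
        this.sku = sku;
        this.inbound = inbound;
        this.outbound = outbound;
        this.quantity = quantity;
    }

    /** Missing data: any empty field makes the row incomplete. */
    boolean hasMissingField() {
        return sku == null || inbound == null || outbound == null || quantity == null;
    }

    // Two rows are repetitions of each other when every field matches.
    @Override
    public boolean equals(Object other) {
        if (this == other) return true;
        if (!(other instanceof MovementRow)) return false;
        MovementRow r = (MovementRow) other;
        return Objects.equals(sku, r.sku)
                && Objects.equals(inbound, r.inbound)
                && Objects.equals(outbound, r.outbound)
                && Objects.equals(quantity, r.quantity);
    }

    @Override
    public int hashCode() {
        return Objects.hash(sku, inbound, outbound, quantity);
    }
}

class DataPreparation {
    /** Returns the rows that survive all three exclusion rules, in their original order. */
    static List<MovementRow> clean(List<MovementRow> rows, LocalDate periodStart, LocalDate periodEnd) {
        // Repetition: a LinkedHashSet drops exact duplicates and keeps first-seen order.
        Set<MovementRow> unique = new LinkedHashSet<>(rows);
        List<MovementRow> kept = new ArrayList<>();
        for (MovementRow row : unique) {
            if (row.hasMissingField()) {
                continue; // Missing data
            }
            // Over-spilling (assumed rule): the stay must lie fully inside the data period.
            if (row.inbound.isBefore(periodStart) || row.outbound.isAfter(periodEnd)) {
                continue;
            }
            kept.add(row);
        }
        return kept;
    }
}
```

Calling DataPreparation.clean(rows, periodStart, periodEnd) with the first and last dates of the past year's data would return only the rows that are unique, complete and fully contained in that period.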

After the initial preparation, we will further organize and sort the data by looking for potential groupings that make our visualizations and analysis easier to perform. This will also support the development of our regression model, which determines the classification (A, B or C) to assign to a specific SKU.

Control Chart
Skulptors-ControlSample.jpg
The control chart is a graph used to study how a process changes over time. Data are plotted in time order. A control chart always has a central line for the average, an upper line for the upper control limit and a lower line for the lower control limit. These lines are determined from historical data. By comparing current data to these lines, we can draw conclusions about whether the process variation is consistent (in control) or is unpredictable (out of control, affected by special causes of variation).
In our project, control charts will show whether the movement of certain SKUs is out of control, e.g. too little movement, which wastes space in the warehouse. They also present performance in a form that is easy to read at a glance. Our sponsor emphasizes the sustainability of our solution, which we will address by including these charts in our dashboard.
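
As a rough illustration of how the three lines could be derived from historical data, the Java sketch below computes a central line and control limits and flags current points that fall outside them. It rests on assumptions that are not part of our plan as written: the limits are placed three sample standard deviations from the mean (a common convention, not something our sponsor specified), and the class name ControlChart and its methods are hypothetical. In the project itself this analysis is intended to be done in JMP.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical helper: control limits from historical data, assuming mean +/- 3 sigma. */
class ControlChart {
    final double centerLine;
    final double upperLimit;
    final double lowerLimit;

    ControlChart(double[] historical) {
        if (historical.length < 2) {
            throw new IllegalArgumentException("need at least two historical points");
        }
        double sum = 0;
        for (double value : historical) {
            sum += value;
        }
        double mean = sum / historical.length;

        double squaredDeviations = 0;
        for (double value : historical) {
            squaredDeviations += (value - mean) * (value - mean);
        }
        // Sample standard deviation (n - 1 in the denominator).
        double sigma = Math.sqrt(squaredDeviations / (historical.length - 1));

        this.centerLine = mean;
        this.upperLimit = mean + 3 * sigma; // assumed 3-sigma convention
        this.lowerLimit = mean - 3 * sigma;
    }

    /** Returns the indices of current observations that lie outside the control limits. */
    List<Integer> outOfControl(double[] current) {
        List<Integer> indices = new ArrayList<>();
        for (int i = 0; i < current.length; i++) {
            if (current[i] > upperLimit || current[i] < lowerLimit) {
                indices.add(i);
            }
        }
        return indices;
    }
}
```

A point below the lower limit would correspond to the "too little movement" case described above, while a point above the upper limit would indicate unusually high movement.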

Time series line chart
Skulptors-TimeseriesSample.jpg
A time series is a sequence of data points, typically consisting of successive measurements made over a time interval. We will use a time series line chart to visualize the inbound and outbound rates of different SKUs.
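
Before a line chart can be drawn, the individual movements have to be turned into one value per time step. The Java sketch below shows one possible way to do this by summing quantities per day. The class DailyFlow, the nested Event type and the choice of a daily interval are hypothetical; the actual granularity would depend on the sponsor's data.

```java
import java.time.LocalDate;
import java.util.List;
import java.util.SortedMap;
import java.util.TreeMap;

/** Hypothetical helper that turns movement events into a daily series for one SKU. */
class DailyFlow {
    /** One inbound or outbound movement; the field names are assumptions. */
    static class Event {
        final LocalDate date;
        final int quantity;

        Event(LocalDate date, int quantity) {
            this.date = date;
            this.quantity = quantity;
        }
    }

    /** Sums quantities per day; the TreeMap keeps the days in chronological order. */
    static SortedMap<LocalDate, Integer> dailyTotals(List<Event> events) {
        SortedMap<LocalDate, Integer> totals = new TreeMap<>();
        for (Event event : events) {
            totals.merge(event.date, event.quantity, Integer::sum);
        }
        return totals;
    }
}
```

Calling dailyTotals once with the inbound events and once with the outbound events of an SKU would give the two series to plot against time.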

Treemap
Skulptors-TreemapSample.jpg
A treemap can convey our hierarchical data together with two additional attributes, shown through color and size, allowing us to examine the relationship between the two. The size of each block represents the percentage of the warehouse utilized, and we can let the user choose what the colors indicate, such as the type of SKU.
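
To make the size encoding concrete, the Java sketch below splits a fixed width into side-by-side blocks whose widths are proportional to the space each group occupies. It is a deliberately simplified, single-level strip layout rather than the tiling D3.js would produce, and the names TreemapStrip and Block are hypothetical. It also assumes that percentages are taken relative to the total space in use; color is left out.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical single-level layout: one strip of blocks sized by space used. */
class TreemapStrip {
    static class Block {
        final String label;
        final double x;            // left edge of the block
        final double width;        // proportional to the space used
        final double sharePercent; // share of the total space in use

        Block(String label, double x, double width, double sharePercent) {
            this.label = label;
            this.x = x;
            this.width = width;
            this.sharePercent = sharePercent;
        }
    }

    static List<Block> layout(String[] labels, double[] spaceUsed, double totalWidth) {
        if (labels.length != spaceUsed.length) {
            throw new IllegalArgumentException("labels and spaceUsed must have the same length");
        }
        double total = 0;
        for (double space : spaceUsed) {
            total += space;
        }
        if (total <= 0) {
            throw new IllegalArgumentException("total space used must be positive");
        }
        List<Block> blocks = new ArrayList<>();
        double x = 0;
        for (int i = 0; i < labels.length; i++) {
            double share = spaceUsed[i] / total;
            double width = share * totalWidth;
            blocks.add(new Block(labels[i], x, width, share * 100));
            x += width; // the next block starts where this one ends
        }
        return blocks;
    }
}
```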


Technology


D3.js

D3.js is a JavaScript library. As our client requested the lowest possible cost, D3.js is a good option because it runs in a web browser, so our client will be able to view the data visualizations without paying for or installing any software. Another benefit of D3.js is its flexibility: it allows fine control over the final result. D3.js will be used for the visualizations mentioned above.

JMP

JMP is statistical analysis software developed by the JMP business unit of SAS Institute and is designed for interactive data exploration. We will use JMP to perform control chart analysis on the inbound and outbound rates so that we can classify the SKUs into classes A, B and C.

SAS Enterprise Miner

SAS Enterprise Miner is a data mining software product developed by SAS Institute. We may use it to perform any further analysis that is needed.

Java & Bootstrap

Java will be used to develop the skeleton of the application, while Bootstrap will be used to style its user interface. As our client wants a sustainable analysis, we will build a simple application to which they can upload their data to be processed for analysis.

OpenShift

OpenShift is an open-source platform as a service (PaaS) from Red Hat. We will use it to host our application for live deployment.