ANLY482 AY1516 G1 Team Skulptors - Data Analysis

From Analytics Practicum
Revision as of 21:33, 27 February 2016 by Siying.tan.2012



Before starting the Exploratory Data Analysis (EDA) stage, the team took time to understand the data and to remove irrelevant records that should not be included in the analysis. The data cleaning change log is shown below:

Data Cleaning Log
| S/N | Date | Data Cleaning Steps | Rationale & Justification |
|-----|------|---------------------|---------------------------|
| 01 | 06 Jan | Removed duplicate rows. | Duplicate rows are commonly caused by accidental double scanning in the warehouse. |
| 02 | 06 Jan | Removed rows with missing data. | Rows with incomplete data were excluded as they have no value for the analysis. |
| 03 | 06 Jan | Removed rows that fell outside the time frame. | As our project focuses on the inflow and outflow rates of SKU movement over one year (Jan 16 – Dec 16), rows outside this time frame were removed. |
| 04 | 23 Jan | Removed rows with transcode “IRT” from the inbound dataset. | Sponsors felt that rows with transcode “IRT” were not valuable to them as they represented failed inbound transactions. |
| 05 | 23 Jan | Removed rows with transcodes “OCP, ORP, ORV, OSA, OSD” from the outbound dataset. | Sponsors felt that rows with transcodes “OCP, ORP, ORV, OSA, OSD” were not valuable to them as they represented failed outbound transactions. |
| 06 | 23 Jan | Amended the outbound dataset. | The team found discrepancies in the outbound data. There were two outbound Excel files, (Jan - July) and (Aug - Dec), but the (Jan - July) file contained some transaction records from Aug, Sep and Oct. These records did not appear in the (Aug - Dec) file. After clarifying with the sponsors, the team moved the Aug, Sep and Oct transactions from the (Jan - July) file to the (Aug - Dec) file. |
| 07 | 05 Feb | Re-added rows with transcode “IRT” to the inbound dataset. | On advice from our professor, so that sponsors have the option to analyze failed inbound transactions in the future if they wish. |
| 08 | 05 Feb | Re-added rows with transcodes “OCP, ORP, ORV, OSA, OSD” to the outbound dataset. | On advice from our professor, so that sponsors have the option to analyze failed outbound transactions in the future if they wish. |
| 09 | 05 Feb | Merged the two separate Excel files into a single CSV file. | On advice from our professor, to remove the need to work on two separate datasets. Analysis is more efficient with a single CSV file. |
| 10 | 18 Feb | Added a column to the existing dataset for each SKU's “A, B, C” classification. | On advice from our professor. This allows the team to analyze how the “A, B, C” classification of each SKU correlates with its outbound rate and put-away location. |
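
The steps in the log above can be sketched in a few lines of Python. This is only an illustrative sketch and not the team's actual workflow. The column names (`sku`, `transcode`, `date`, `qty`), the function names, the `drop_failed` switch, the sample rows and the "OUT" transcode are all assumptions made for the example. Only the failed-transaction transcodes are taken from the log.

```python
from datetime import date

# Transcodes the sponsors described as failed transactions (from the log).
FAILED_INBOUND = {"IRT"}
FAILED_OUTBOUND = {"OCP", "ORP", "ORV", "OSA", "OSD"}


def clean(rows, start, end, failed_codes, drop_failed=False):
    """Apply steps 01-03 and, optionally, steps 04/05 to a list of row dicts."""
    seen = set()
    cleaned = []
    for row in rows:
        # Step 02: skip rows with any missing or empty field.
        if any(value is None or value == "" for value in row.values()):
            continue
        # Step 03: skip rows outside the analysis time frame.
        if not (start <= row["date"] <= end):
            continue
        # Steps 04/05 (later reverted by 07/08): optionally skip failed transactions.
        if drop_failed and row["transcode"] in failed_codes:
            continue
        # Step 01: skip exact duplicates, e.g. accidental double scans.
        key = tuple(sorted(row.items()))
        if key in seen:
            continue
        seen.add(key)
        cleaned.append(row)
    return cleaned


def add_abc_class(rows, abc_by_sku):
    """Step 10: attach each SKU's A/B/C class from a lookup table (assumed given)."""
    return [{**row, "abc_class": abc_by_sku.get(row["sku"])} for row in rows]


# Made-up sample rows standing in for the two outbound files.
jan_jul = [
    {"sku": "S1", "transcode": "OUT", "date": date(2016, 3, 1), "qty": 5},
    {"sku": "S1", "transcode": "OUT", "date": date(2016, 3, 1), "qty": 5},  # double scan
    {"sku": "S2", "transcode": "OCP", "date": date(2016, 8, 9), "qty": 2},  # failed, Aug record
    {"sku": "S3", "transcode": "OUT", "date": date(2016, 5, 2), "qty": None},  # missing qty
]
aug_dec = [
    {"sku": "S1", "transcode": "OUT", "date": date(2016, 9, 4), "qty": 1},
    {"sku": "S2", "transcode": "OUT", "date": date(2017, 1, 2), "qty": 7},  # outside time frame
]

# Steps 06/09: treat both files as one dataset, then clean it once.
outbound = clean(jan_jul + aug_dec, date(2016, 1, 1), date(2016, 12, 31), FAILED_OUTBOUND)
classified = add_abc_class(outbound, {"S1": "A", "S2": "C"})
print(len(outbound))  # 3 rows survive; the failed "OCP" row is kept
```

Leaving `drop_failed` off by default mirrors the final state of the log, where the failed transactions were re-added so that sponsors can still analyze them later. Passing `drop_failed=True` reproduces the earlier state after steps 04 and 05.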