Difference between revisions of "ANLY482 AY2017-18 T1 Group2 Project EZLin Project Data"


Revision as of 21:37, 27 August 2017




The dataset is provided by XXX’s supply chain and data team in Excel format. The data covers the first few months of 2017.

Data Preparation

Mapping of Data

The initial 14 Excel sheets have different formats and contain various kinds of information, such as the exchange rates between different countries and the bill of materials (BOM) of the plants and markets in various APAC countries. In order to identify:
1) the flow of manufacturing and distribution,
2) the locations of the manufacturing plants and markets, and
3) the cost of raw materials and overhead,
we need to map all these sheets together by identifying the primary and secondary keys in each sheet and then linking the sheets on those keys.
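
The linking step described above can be sketched as a join on a shared key. This is only a minimal illustration using the Python standard library: the sheet contents, the column names (`plant_id`, `material`, `cost`, `country`) and the helper `link_sheets` are all hypothetical, since the actual sheets are not reproduced on this page.

```python
from collections import defaultdict


def link_sheets(left_rows, right_rows, key):
    """Left-join two sheets (lists of dicts) on a shared key column."""
    # Index the right-hand sheet by key so each lookup is a dict access.
    index = defaultdict(list)
    for row in right_rows:
        index[row[key]].append(row)

    linked = []
    for row in left_rows:
        matches = index.get(row[key])
        if not matches:
            # Keep unmatched rows so that missing links stay visible.
            linked.append(dict(row))
            continue
        for match in matches:
            merged = dict(row)
            merged.update(match)
            linked.append(merged)
    return linked


# Hypothetical sample rows (assumed for illustration, not real project data).
bom = [
    {"plant_id": "P1", "material": "resin", "cost": 120.0},
    {"plant_id": "P2", "material": "steel", "cost": 300.0},
]
plants = [
    {"plant_id": "P1", "country": "SG"},
]

linked = link_sheets(bom, plants, "plant_id")
```

Rows without a match are kept rather than dropped, so a gap in the key mapping shows up as a missing column instead of silently disappearing.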

Variable Transformation

The Excel sheets lack a standard format and are generated by different people, so the same variable can have different names in different sheets. As such, we will need to standardise the format of all the sheets, including the naming of the variables, before performing any analysis.
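
One way to standardise variable names is to normalise each column header and then map known variants to a single canonical name. The sketch below assumes an alias table that we would build by inspecting the sheets; the entries in `ALIASES` and the sample headers are hypothetical.

```python
import re

# Hypothetical alias table: maps normalised variants to one canonical name.
ALIASES = {
    "prod_id": "product_id",
    "product_code": "product_id",
    "qty": "quantity",
}


def standardise_name(name):
    """Lower-case a header, replace runs of non-alphanumerics with '_', then apply aliases."""
    cleaned = re.sub(r"[^0-9a-z]+", "_", name.strip().lower()).strip("_")
    return ALIASES.get(cleaned, cleaned)


def standardise_row(row):
    """Return a copy of a row (dict) with standardised column names."""
    return {standardise_name(key): value for key, value in row.items()}


# Two headers that differ only in formatting end up with the same name.
print(standardise_name(" Prod ID "))
print(standardise_name("Product-Code"))
```

Headers that are not in the alias table are still normalised, so `Unit Cost` and `unit_cost` would be treated as the same variable.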

Multi Entry Data

We identified that some Excel sheets contain multiple entries for the same product, with different variable values.
This occurs for various reasons, such as human error or reports being generated in a new data format using the company's old systems. After talking to the sponsors, it is believed that these multiple entries need to be combined before we start working on any analysis.
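
How the entries should be combined has not been decided on this page, so the sketch below leaves that rule as a parameter. It groups rows by a product key, keeps values that agree, and passes conflicting values to a `resolve` function. The column names and the `keep_last` rule are assumptions for illustration only; the actual rule would need to be agreed with the sponsors.

```python
def combine_entries(rows, key, resolve):
    """Merge rows that share the same key into a single row per key."""
    # Group rows by key, preserving the order in which keys first appear.
    grouped = {}
    for row in rows:
        grouped.setdefault(row[key], []).append(row)

    combined = []
    for product, entries in grouped.items():
        merged = {key: product}
        fields = [f for entry in entries for f in entry if f != key]
        for field in dict.fromkeys(fields):  # de-duplicate, keep order
            values = [e[field] for e in entries if field in e]
            if len(set(values)) == 1:
                # All entries agree, so no resolution is needed.
                merged[field] = values[0]
            else:
                merged[field] = resolve(field, values)
        combined.append(merged)
    return combined


def keep_last(field, values):
    """Hypothetical rule: trust the most recent entry."""
    return values[-1]


# Hypothetical sample rows (assumed for illustration, not real project data).
rows = [
    {"product_id": "A", "cost": 10, "plant": "P1"},
    {"product_id": "A", "cost": 12, "plant": "P1"},
    {"product_id": "B", "cost": 5, "plant": "P2"},
]

result = combine_entries(rows, "product_id", keep_last)
```

Passing the rule in as a function makes it easy to swap `keep_last` for, say, a rule that flags conflicts for manual review.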


Data Visualization


EZLin Comingsoon.png