Difference between revisions of "ANLY482 AY2017-18T2 Group27 : Project Overview / Methodology"

From Analytics Practicum
(Created page with "__NOTOC__ {|style="padding: 5px 0 0 0;" width="100%" cellspacing="0" cellpadding="0" valign="top"| | style="background-color:#ffcd00; text-align:center;" width="14%" | ANLY...")
 
 
(4 intermediate revisions by 3 users not shown)
Line 25: Line 25:
 
|}
 
|}
  
==<div style="background: #c9c9c9; padding: 15px; font-weight: bold; line-height: 0.3em; text-indent: 0px; font-size: 16px"><font color=#292929 >7.0 Methodology</font></div>==
+
==<div style="background: #c9c9c9; padding: 15px; font-weight: bold; line-height: 0.3em; text-indent: 0px; font-size: 14px"><font color=#292929 >7.0 Methodology</font></div>==
 +
<div style="padding-left:0px; padding-right:0px; text-align: justify; font-size:13px">
 
==== 7.1 Tools Used ====
 
==== 7.1 Tools Used ====
In this project, 2 main tools will be used - Power BI and Python.  
+
In this project, 3 main tools will be used - Power BI, Excel and JMP.  
  
Power BI is the choice of tool by DHL and data visualisation will be done on this medium. Also, with the large dataset, Python will be used to build regression model.
+
Power BI is the choice of tool by Company X and data visualisation will be done on this medium.
  
==== 7.2 Project Methodology ====
+
Excel is used by Company X to store their data, taken from their system. It is also used for part of the data cleaning process, namely: categorizing density of each shipment into their respective freight density ratios and appending new data sets given to us.
  
Since we have not obtained the data yet, we were unable to come up with a comprehensive project methodology.  
+
JMP is used for part of the data cleaning process too, namely: removing rows with bad data and duplicates, and recoding of data fields.
  
Our brief plan of action includes doing descriptive analytics and predictive analytics. Descriptive analytics will be done via doing data visualisation. We will be combining various descriptive analysis of operational factors into a dashboard which will allow DHL to monitor, track and diagnose. Additionally, a regression model to allow DHL to better forecast shipment behaviour. We will also be doing secondary research from various journal articles to potentially supplement our project.
+
==== 7.2 Data Cleaning and Preparation ====
  
As of now, we will be focusing on understanding and cleaning the data.
+
Since data cleaning was not the focus of Company X, we did basic cleaning. This includes:
 +
 
 +
* Removing Outliers
 +
* Removed Duplicates
 +
* Standardising Format of Data
 +
*Transforming Relevant Variables
 +
 
 +
</div>

Latest revision as of 16:15, 16 April 2018

7.0 Methodology

7.1 Tools Used

This project uses three main tools: Power BI, Excel and JMP.

Power BI is Company X's tool of choice, so all data visualisation is done in it.

Company X uses Excel to store the data taken from its system. We also use Excel for part of the data cleaning process, namely categorising the density of each shipment into its respective freight density ratio and appending new data sets given to us.
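
The density categorisation is done in Excel, but the same logic can be sketched in a few lines of Python for illustration. This is a minimal sketch only: the band edges, band labels, units and function names below are hypothetical, since the actual freight density ratios used by Company X are not stated here.

```python
from bisect import bisect_right

# Hypothetical band edges in kg per cubic metre. The real freight density
# ratios used by Company X are not given in this page.
BAND_EDGES = [100.0, 200.0, 300.0]
BAND_LABELS = ["<100", "100-200", "200-300", ">=300"]


def density(weight_kg: float, volume_m3: float) -> float:
    """Return shipment density in kg per cubic metre."""
    if volume_m3 <= 0:
        raise ValueError("volume must be positive")
    return weight_kg / volume_m3


def density_band(weight_kg: float, volume_m3: float) -> str:
    """Map a shipment to the label of the density band it falls into."""
    d = density(weight_kg, volume_m3)
    # bisect_right counts the edges that are <= d, which is the band index.
    return BAND_LABELS[bisect_right(BAND_EDGES, d)]
```

A lookup against sorted band edges behaves much like an approximate-match lookup table in a spreadsheet: changing the bands only means editing the two lists.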

JMP is also used for part of the data cleaning process, namely removing rows with bad data, removing duplicates and recoding data fields.

7.2 Data Cleaning and Preparation

Since data cleaning was not Company X's focus, we performed only basic cleaning, which included:

  • Removing Outliers
  • Removing Duplicates
  • Standardising Format of Data
  • Transforming Relevant Variables
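
The four steps listed above were carried out in Excel and JMP. As an illustration only, they could be sketched in plain Python as follows. Everything specific here is an assumption: the field names (origin, weight_kg), the 1.5 x IQR outlier rule, the log transform and the order of the steps are hypothetical stand-ins, not the rules actually applied to Company X's data.

```python
import math
from statistics import quantiles


def standardise(row):
    """Standardise formats: trim and upper-case text, parse numbers."""
    return {
        "origin": row["origin"].strip().upper(),
        "weight_kg": float(row["weight_kg"]),
    }


def remove_duplicates(rows):
    """Keep the first occurrence of each identical row."""
    seen = set()
    unique = []
    for row in rows:
        key = tuple(sorted(row.items()))
        if key not in seen:
            seen.add(key)
            unique.append(row)
    return unique


def remove_outliers(rows, field, k=1.5):
    """Drop rows whose value lies outside [Q1 - k*IQR, Q3 + k*IQR]."""
    values = [row[field] for row in rows]
    q1, _, q3 = quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    low, high = q1 - k * iqr, q3 + k * iqr
    return [row for row in rows if low <= row[field] <= high]


def transform(row):
    """Add a log-transformed weight (an assumed example transformation)."""
    return {**row, "log_weight": math.log(row["weight_kg"])}


def clean(rows):
    """Run the steps; formats are standardised first so that rows which
    differ only in formatting are recognised as duplicates."""
    rows = [standardise(row) for row in rows]
    rows = remove_duplicates(rows)
    rows = remove_outliers(rows, "weight_kg")
    return [transform(row) for row in rows]
```

Standardising before de-duplicating is a design choice in this sketch: a row such as " sg " / "10" would otherwise not match "SG" / "10.0".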