Difference between revisions of "Hiryuu Findings"

From Analytics Practicum
Jump to navigation Jump to search
Line 93: Line 93:
 
** With the exception for App 2's data  
 
** With the exception for App 2's data  
 
* Sender and Receiver Information should be included
 
* Sender and Receiver Information should be included
 +
 +
==<div style="background: #A4A4A4; line-height: 0.3em; font-family:Roboto;  border-left: #848484 solid 15px;"><div style="border-left: #FFFFFF solid 5px; padding:15px;font-size:15px;"><font color= "#ffffff"><strong>Detailed Data Preparation Steps </strong></font></div></div>==
 +
<h3> Exploratory Analysis </h3>
 +
# <h4> 1. Identifying inbound(IB) / outbound(OB) shipments </h4>
 +
Different methods are used for different data sets according to the specifications of our sponsor.
 +
<br>Methods used include:
 +
* Checking the receiver and sender name
 +
* Identifying specific customer account no for IB and OB shipments
 +
* Identifying IB and OB shipments use starting and ending stage codes
 +
 +
#<h4> 2. Calculating turnaround time (TAT) </h4>
 +
TAT time is used to calculate the total time taken from the starting point to the ending point of the shipment. It is used to flag out and ensure that shipments stay within the stipulated time.
 +
<br>Steps conducted:
 +
# Identify the starting and ending dates using the starting and ending stage codes and their relevant stage updated dates.
 +
# Calculate the TAT time by the difference in the number of days between the ending and starting date.
 +
For shipments that are missing either a starting or ending point, they are flagged out and labelled as "Incomplete" shipments.
 +
<br> Shipments that are complete, they will either by labelled as "Pass" or "Fail" depending on the calculated TAT time. If the TAT time <= N, it will be labelled as "Pass" else "Fail".
 +
 +
<h3> Geospatial Analysis </h3>

Revision as of 15:58, 21 February 2017

Current Project

Logo Hiryuu.png


Home

About Us

Project Overview

Findings

Project Management

Documentations

Data Preparation Analysis

Data Tools Used

Microsoft Excel

Used a VBA macro to merge all individual data files into a compilation Macro’s function was to open all csv files in a folder and add them into a master sheet by the rows

Pros:

  • Excel was the tool of choice by the sponsors, and it preserves the character encoding of non-western characters
  • Easy to use

Cons:

  • Limited in the number of rows it could handle

JMP Pro

Pros:

  • Useful in joining data tables with large number of data
  • Useful in tabulating data of different variables into a data table
  • Easy creation of visuals to look into the data

Cons:

  • Faced issues with non-western characters inconsistently that made it difficult to determine the cause
  • Difficulty in trying to use advanced functions that required scripting ability
  • No formula to find date difference for working days only

Objectives

  • Identifying the correct set of data
  • Identifying complete shipments
  • Identifying shipments as inbound or outbound
  • Ensuring no missing data
  • Identifying the relevant geospatial data for analysis
  • Clarify unclear data/abnormalities with Sponsors
    • E.g. Quantity > 1

General Steps Taken

  1. 1. Joining data tables for shipment details and status update

  • Used left outer join with statuses as the main table so as to add on the corresponding shipment details to each status update
  • Used Match Flag to check if there is missing data from either source
  1. 2. Joined the data table with the status codes list

  • Status data only contained the stage code without explanation
  • Needed to join with another data table with the stage codes and relevant descriptions
  1. 3. Removing Duplicate Rows

  • Duplicate rows with identical data were present in the data given to us
  • Needed to delete duplicates to ensure greater data accuracy for our analysis
  • Duplicated data rows serves no practical purpose hence ought to be removed
  • Used a JMP jsl script posted on the JMP community website to delete duplicate rows whilst preserving the first unique occurrence of it
  1. 4. Checking the data for any abnormalities based on standard parameters given by the sponsor

  • Basic guidelines put forth by the Sponsor includes:
  • Quantity = 1
  • All should be domestic shipments
    • With the exception for App 2's data
  • Sender and Receiver Information should be included

Detailed Data Preparation Steps

Exploratory Analysis

  1. 1. Identifying inbound(IB) / outbound(OB) shipments

Different methods are used for different data sets according to the specifications of our sponsor.
Methods used include:

  • Checking the receiver and sender name
  • Identifying specific customer account no for IB and OB shipments
  • Identifying IB and OB shipments use starting and ending stage codes
  1. 2. Calculating turnaround time (TAT)

TAT time is used to calculate the total time taken from the starting point to the ending point of the shipment. It is used to flag out and ensure that shipments stay within the stipulated time.
Steps conducted:

  1. Identify the starting and ending dates using the starting and ending stage codes and their relevant stage updated dates.
  2. Calculate the TAT time by the difference in the number of days between the ending and starting date.

For shipments that are missing either a starting or ending point, they are flagged out and labelled as "Incomplete" shipments.
Shipments that are complete, they will either by labelled as "Pass" or "Fail" depending on the calculated TAT time. If the TAT time <= N, it will be labelled as "Pass" else "Fail".

Geospatial Analysis