ANLY482 AY2017-18 T1 Group2 Project EZLin Project Overview

From Analytics Practicum

Introduction


The supply chain network forms the backbone of the movement of goods and services for any organisation with a tangible product. The ability to accurately identify the cost incurred at each stage of the network could create substantial benefits in cost savings and efficiency for the company. Because most organisations regard the supply chain network as a cost centre, little has been done to fully leverage analytics to derive insights that can contribute to the company's profitability. In addition, the raw data stored in the internal databases of most companies often requires substantial cleaning and transformation before any insight can be derived from it.

For JnJ, the raw data, which includes information such as raw material cost and overhead cost, is extracted from their internal SAP system. Because this extraction is done on an ad hoc basis, much time and effort is required to clean the data before the required reports can be generated. Furthermore, users in different geographical regions understand the data differently, which results in quality issues and inconsistencies. This tedious and error-prone way of extracting information could compromise the accuracy of the reports generated. Therefore, the data needs to be substantially cleaned and understood before any insights can be derived from it.

For this practicum, Python was the primary language used to process and transform the original raw dataset to meet the requirements of the commerce side. We chose Python for its ease of use and its extensive library packages for data transformation (e.g. Pandas). Other languages were also considered, such as R, which is well regarded for its built-in statistical and analytical tools. However, the final decision to use Python was also due to the team's familiarity with the language.
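The kind of cleaning described above can be sketched with Pandas as follows. This is a minimal illustration only, not the project's actual pipeline: the column names (`material`, `region`, `cost`), the sample rows, and the function name `clean_cost_extract` are all hypothetical, and a real SAP extract would likely need further rules.

```python
import pandas as pd


def clean_cost_extract(raw: pd.DataFrame) -> pd.DataFrame:
    """Tidy a raw cost extract (all column names here are hypothetical)."""
    df = raw.copy()
    # Normalise headers: trim, lower-case, replace spaces with underscores.
    df.columns = [str(c).strip().lower().replace(" ", "_") for c in df.columns]
    # Trim whitespace and unify case in text keys so regional variants match.
    for col in ("material", "region"):
        df[col] = df[col].astype(str).str.strip().str.upper()
    # Costs may arrive as text with thousands separators; coerce to numbers.
    df["cost"] = pd.to_numeric(
        df["cost"].astype(str).str.replace(",", "", regex=False),
        errors="coerce",
    )
    # Drop rows whose cost could not be parsed, then remove exact duplicates.
    return df.dropna(subset=["cost"]).drop_duplicates().reset_index(drop=True)


# Made-up sample rows showing inconsistent entry across regions.
raw = pd.DataFrame({
    " Material ": ["abc-1 ", "ABC-1", "xyz-9", "xyz-9"],
    "Region": ["apac", "APAC ", "emea", "emea"],
    "Cost": ["1,200.50", "1,200.50", "n/a", "87"],
})
cleaned = clean_cost_extract(raw)
print(cleaned)
```

Wrapping the steps in one function means the same rules can be re-run on every new extract, which is what makes the process repeatable rather than ad hoc.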



Motivation


Cost optimisation of the supply chain is a challenge for many companies because the supply chain is a complex network. Between each pair of points on this network, there are different cost components for different transactions. Currently, XXX’s analysis focuses only on separate parts of the supply chain network and looks at them in silos, which prevents the team from seeing the whole picture. Our motivation is therefore to use different analysis methods and visualisation tools to map the end-to-end business process. Given the complex structure of the company’s supply chain network, we are keen to explore ways to map the flow from the plant to the final distribution centre. Through this, we hope to discover the cost implications for different parts of the supply chain and how the whole network could be improved.


Objectives


The main aim of our project is to help XXX’s supply chain team explore trends and patterns in their current supply chain spending on the production of their adult wash product. Through the trends and patterns identified, we hope to help them improve their end-to-end understanding of the supply chain and automate their data extraction for future supply chain analysis. Based on the patterns identified, a dashboard reporting system will also be developed in this project to provide a visual interface for future use. The objectives of this project are:
1. To summarise material price information based on different criteria and condition types
2. To clearly map the supply chain process from the internal manufacturer to the final distribution centre, with the transaction cost incurred at each point
3. To build a data automation process that cleans and transforms the raw data required for visualisation
4. To identify the clustering factors and determine their effects on each other
5. To establish a dashboard reporting system that supports data exploration and visualisation of the supply chain flow in the future


Literature Research


To fully understand the importance and use of analytics tools in driving a supply chain network, we carried out a preliminary review of data-driven supply chain networks.

The use of data to understand supply chain flow is not new in the industry, and much has been done to leverage Big Data Analytics (BDA) for this purpose. According to Arunachalam, Kumar and Kawalek (2017), the entire process of transforming raw data into useful insights was, until recently, known as Business Intelligence, a term used interchangeably with others such as Big Data Analytics or Business Analytics. However, it is the use of data to understand the supply chain network and derive insights that makes the data truly useful. To truly leverage the power of data, companies have to drive a data-centric culture into the business decision-making process (Arunachalam et al., 2017).

In our literature review, we found that the use of Python to automate data cleaning and drive supply chain efficiencies has received less attention than other topics. Nevertheless, the use of visualisation to derive insights and support analysis has been a major focus. For instance, Sankey diagrams built with Python have proven useful for visualising multidimensional data.

Therefore, given the complexity of the dataset, we aim to design the entire process through substantial secondary research on forums and by liaising with the Information Technology team from JnJ.
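As a rough illustration of how transaction data could be prepared for the Sankey diagram mentioned above, the sketch below aggregates cost flows into node labels and indexed links. The stage names, cost figures, and the function name `build_sankey_links` are hypothetical and not taken from the project data. The output follows the label, source, target, and value layout that Sankey plotting tools such as Plotly's `go.Sankey` generally expect, but no plotting call is made here.

```python
from collections import defaultdict


def build_sankey_links(records):
    """Aggregate (source, target, cost) rows into Sankey nodes and links."""
    # Sum the cost of all transactions that share the same source and target.
    totals = defaultdict(float)
    for source, target, cost in records:
        totals[(source, target)] += cost

    labels, index = [], {}
    sources, targets, values = [], [], []
    for (source, target), total in totals.items():
        # Give each stage a stable integer id in order of first appearance.
        for name in (source, target):
            if name not in index:
                index[name] = len(labels)
                labels.append(name)
        sources.append(index[source])
        targets.append(index[target])
        values.append(total)
    return labels, sources, targets, values


# Made-up flows from plant to distribution centre.
flows = [
    ("Plant", "Regional warehouse", 120.0),
    ("Plant", "Regional warehouse", 30.0),
    ("Regional warehouse", "Distribution centre", 140.0),
]
labels, sources, targets, values = build_sankey_links(flows)
print(labels, sources, targets, values)
```

Each link's width in the resulting diagram would then be proportional to the total cost moving between two stages, which is how the end-to-end flow becomes visible at a glance.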