ANLY482 AY2016-17 T2 Group12 : Project Overview / Methodology

Data

The dataset provided by KST Bikers comes from its feedback system and consists of feedback lodged by:

  • SMS
  • Email
  • Feedback Form

TSK Transporters have also searched for additional data sources on public holidays, as TSK Transporters may be analysing how public holidays correlate with feedback volume. Listed below are three data sources corresponding to public holidays:

Tools Used

  • Microsoft Excel 2016
  • JMP Pro 13
  • D3.js

Methodology

Discovery
Our team will first learn what KST Bikers does through its website, annual reports and social media platforms, and by asking our sponsor. Secondly, we will identify potential additional data sources that will help with our analysis. Lastly, we will research techniques and ideas for analysing feedback data. The following is some of the research we have done, with the key findings of each article:

1. Top tips on how to analyse feedback

Summary of key findings: Understanding how to use present-state and future-state process mapping, the advantages of data boxes, and a visual workflow diagram is essential in most cases and adds value to the data analysis. Together they give a clear visual aid for seeing where the bottlenecks in processing are and which areas need improvement.

Other methods include cause-and-effect diagrams, such as the fishbone technique combined with the 5 whys, which help identify root causes and point the way towards resolving the key critical areas.

Data analysis in the form of a chart will bring up important areas for discussion, review and future strategy.

2. What is EDA?

Summary of key findings: Exploratory data analysis (EDA) is not just a collection of techniques. It is a philosophy of how we break down a data set: what to look for, how to look, and how to interpret what we see. Most EDA techniques are graphical, with only a few quantitative techniques. The heavy reliance on graphics reflects the main role of EDA, which is to explore open-mindedly.

3. Why You’re Not Getting Value from Your Data Science

Summary of key findings: Business users keep coming up with problems, and data analysts cannot keep up because they spend much of their time building sophisticated data models. The most common problem is that data scientists often do not build their work around the final objective, which is to derive business value. The article lists the following best practices:
  • Stick with simple models
  • Explore more business problems: Instead of exploring one business problem with a sophisticated model, build a simple model for each problem and assess the value proposition
  • Learn from a sample of data, not all the data
  • Focus on automation: Use algorithms and develop software systems to automate data processing techniques

Data Collection
The dataset is from KST Bikers' internal database, which collects feedback from a variety of sources such as email, SMS, mobile application, online feedback and call centre. Our team will also be using additional datasets such as weather and public holiday data. Having such data allows us to examine external factors that could affect the volume of feedback.
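One way to examine an external factor such as public holidays is to compare average daily feedback volume on holidays against other days. The following is a minimal sketch only, not the team's actual procedure: the function names and the sample dates are made up for illustration, and it assumes the feedback records have already been reduced to one date per record.

```python
from collections import Counter
from datetime import date
from statistics import mean


def daily_volume(feedback_dates):
    """Count feedback records per calendar day."""
    return Counter(feedback_dates)


def mean_volume_by_holiday(feedback_dates, holidays):
    """Return (mean daily volume on holidays, mean daily volume on other days).

    Only days with at least one feedback record are counted; including
    zero-feedback days would require a full calendar of dates.
    """
    counts = daily_volume(feedback_dates)
    on_holiday = [n for d, n in counts.items() if d in holidays]
    off_holiday = [n for d, n in counts.items() if d not in holidays]
    return (
        mean(on_holiday) if on_holiday else None,
        mean(off_holiday) if off_holiday else None,
    )


# Hypothetical sample data: one date per feedback record.
holidays = {date(2016, 8, 9)}
records = [
    date(2016, 8, 8),
    date(2016, 8, 9), date(2016, 8, 9), date(2016, 8, 9),
    date(2016, 8, 10), date(2016, 8, 10),
]

holiday_mean, other_mean = mean_volume_by_holiday(records, holidays)
print(holiday_mean, other_mean)
```

A difference in the two means is only a starting point; with few holidays in the data, it should be read alongside the daily charts rather than as evidence of a relationship on its own.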

Data Cleaning
The dataset is from KST Bikers' internal EFMS database, which collects feedback from a variety of sources such as email, SMS, mobile application, online feedback and call centre. Our team has also included an additional dataset on public holidays to aid our analysis.

Data Cleaning and Transformation
For this project, our team is conducting descriptive analysis, so there is no need to remove missing values or outliers, or to normalize the data. However, a missing data pattern analysis will be done to find out whether any missing values could be filled in to aid our analysis. In addition, we need to ensure that the data for each variable is consistent and in a readable format.
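A missing data pattern analysis can be sketched as two counts: how many values are missing per variable, and which combinations of variables tend to be missing together. The example below is an illustrative sketch under assumptions: the column names and sample rows are hypothetical, and a value is treated as missing when it is `None` or a blank string.

```python
from collections import Counter


def is_missing(value):
    """Treat None and blank strings as missing."""
    return value is None or (isinstance(value, str) and value.strip() == "")


def missing_counts(rows, columns):
    """Number of missing values in each column."""
    return {c: sum(is_missing(r.get(c)) for r in rows) for c in columns}


def missing_patterns(rows, columns):
    """Count how often each combination of missing columns occurs.

    Each key is a tuple of the column names that are missing in a row;
    the empty tuple stands for fully complete rows.
    """
    return Counter(
        tuple(c for c in columns if is_missing(r.get(c))) for r in rows
    )


# Hypothetical feedback rows; the real column names are not given in the source.
columns = ["channel", "category", "sub_category"]
rows = [
    {"channel": "SMS", "category": "Service", "sub_category": ""},
    {"channel": "Email", "category": "", "sub_category": ""},
    {"channel": "Email", "category": "Facilities", "sub_category": "Lighting"},
]

counts = missing_counts(rows, columns)
patterns = missing_patterns(rows, columns)
print(counts)
print(patterns.most_common())
```

Patterns where a detailed field is missing but its parent field is present (for example, a category without a sub-category) are the ones most likely to be fillable from other information in the record.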

Data Exploration
Our team first looked into the summary statistics of each variable to get an overview of the dataset. From there, we spotted missing values and selected key variables for analysis. We will then identify trends based on the top 10 categories of feedback, which will allow us to focus on the most important issues that Singaporeans face. Furthermore, the team ran a control chart analysis to check whether any unusual patterns occur in the data on a daily basis.
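The two exploration steps described above, ranking the top categories and checking daily volume against control limits, can be sketched as follows. The source does not say which type of control chart was used; this sketch assumes a c-chart for daily counts, with limits at the mean plus or minus three times the square root of the mean. The function names and sample values are illustrative only.

```python
from collections import Counter
from math import sqrt


def top_categories(categories, n=10):
    """Return the n most frequent categories with their counts."""
    return Counter(categories).most_common(n)


def c_chart_limits(daily_counts):
    """Return (lower limit, centre line, upper limit) for a c-chart.

    Assumption: daily counts are roughly Poisson, so the standard
    deviation is estimated as the square root of the mean.
    """
    centre = sum(daily_counts) / len(daily_counts)
    spread = 3 * sqrt(centre)
    return max(0.0, centre - spread), centre, centre + spread


def out_of_control(daily_counts):
    """Indices of days whose count falls outside the control limits."""
    lower, _, upper = c_chart_limits(daily_counts)
    return [i for i, n in enumerate(daily_counts) if n < lower or n > upper]


# Hypothetical data: one category label per feedback record, and
# feedback counts for ten consecutive days.
labels = ["A", "B", "A", "C", "A", "B"]
daily_counts = [4, 5, 3, 6, 4, 5, 4, 20, 5, 4]

print(top_categories(labels, n=2))
print(c_chart_limits(daily_counts))
print(out_of_control(daily_counts))
```

Days flagged this way are candidates for a closer look, for example against the public holiday data, rather than confirmed anomalies.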

Dashboards
Initially, our team proposed two dashboards built in Tableau: one to summarize the trends and the other to show the external factors that drive feedback. With the change in objectives, a single dashboard will instead be built with D3.js for KST Bikers to visualize the analysis. It will help KST Bikers do some data cleaning when they upload the data, and it will provide an overview of the trends in the feedback data. KST Bikers will then be able to view the breakdown of feedback volume by group, category and sub-category, and by time period such as year, quarter or month. Our team will be using D3.js, an open-source library, because it can be used to build an interactive dashboard and requires no software installation.
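The breakdown the dashboard is meant to show comes down to counting records by a chosen level (group, category or sub-category) and a chosen time period. The sketch below illustrates that aggregation logic only; it is written in Python for brevity and is not the D3.js dashboard code. The field names, period labels and sample records are assumptions made for the example.

```python
from collections import Counter
from datetime import date


def period_key(d, granularity):
    """Label a date by year, quarter or month."""
    if granularity == "year":
        return str(d.year)
    if granularity == "quarter":
        return f"{d.year}-Q{(d.month - 1) // 3 + 1}"
    if granularity == "month":
        return f"{d.year}-{d.month:02d}"
    raise ValueError(f"unknown granularity: {granularity}")


def volume_breakdown(records, level, granularity):
    """Count feedback records per (level value, time period) pair."""
    return Counter(
        (r[level], period_key(r["date"], granularity)) for r in records
    )


# Hypothetical feedback records; the real field names are not given in the source.
records = [
    {"date": date(2016, 1, 15), "group": "Operations", "category": "Delay"},
    {"date": date(2016, 2, 3), "group": "Operations", "category": "Delay"},
    {"date": date(2016, 5, 20), "group": "Operations", "category": "Cleanliness"},
    {"date": date(2016, 5, 21), "group": "Service", "category": "Staff"},
]

by_quarter = volume_breakdown(records, "group", "quarter")
by_month = volume_breakdown(records, "category", "month")
print(by_quarter)
print(by_month)
```

Keeping the level and the granularity as parameters mirrors the interactive choice a dashboard user would make when switching between views.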