ANLY482 AY2017-18T2 Group06 Project Overview

From Analytics Practicum

 


 

 

MOTIVATION

 

OBJECTIVES

 

METHODOLOGY

Exploratory Segment

1. Data Collection
In the initial phase of data collection, we must ensure that the dataset contains all the fields needed for modelling later on.
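As an illustration of this check, the minimal sketch below reports which required fields are absent or empty in a set of collected records. The function name `missing_fields` and all field names are hypothetical and are not taken from the project dataset.

```python
def missing_fields(records, required_fields):
    """Return the required fields that are absent or empty in at least one record."""
    missing = set()
    for record in records:
        for field in required_fields:
            # Treat a missing key, None, or an empty string as "not collected".
            if record.get(field) in (None, ""):
                missing.add(field)
    return sorted(missing)


# Hypothetical field names, used only for illustration.
sample = [
    {"customer_id": 1, "visit_date": "2018-01-05", "amount": 42.0},
    {"customer_id": 2, "visit_date": "", "amount": 17.5},
]
print(missing_fields(sample, ["customer_id", "visit_date", "amount", "region"]))
# prints ['region', 'visit_date']
```

Running a check like this before cleaning makes gaps in the collected data visible early, while it may still be possible to request the missing fields.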

2. Data Cleaning + Transformation
In the data cleaning and transformation phase, the data will be cleaned and transformed into the statistical and analytical variables needed for prediction later.

3. Initial Data Exploration
In this phase, we will explore the data and determine the modelling approach to take based on the nature of the dataset. Necessary preparations, such as checking the variables for multicollinearity, will be carried out before any modelling is done. Due to the nature of our dataset, careful data exploration is required.
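One simple way to screen for multicollinearity is to compute pairwise Pearson correlations between predictors and flag highly correlated pairs. The sketch below assumes numeric predictors held as plain lists; the 0.8 threshold, the function names and the sample columns are assumptions for illustration only. Other diagnostics, such as variance inflation factors, may suit the project better.

```python
import math
from itertools import combinations


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length numeric lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0  # correlation is undefined for a constant column; report 0 here
    return cov / math.sqrt(var_x * var_y)


def correlated_pairs(columns, threshold=0.8):
    """Return (name_a, name_b, r) for predictor pairs with |r| >= threshold."""
    flagged = []
    for (name_a, a), (name_b, b) in combinations(columns.items(), 2):
        r = pearson(a, b)
        if abs(r) >= threshold:
            flagged.append((name_a, name_b, round(r, 3)))
    return flagged


# Hypothetical predictors: "spend" is almost a multiple of "visits".
predictors = {
    "visits": [1, 2, 3, 4, 5],
    "spend": [10, 21, 29, 41, 50],
    "age": [40, 22, 35, 28, 31],
}
print(correlated_pairs(predictors))
```

With this sample data only the `visits` and `spend` pair is flagged, which suggests keeping just one of the two as a predictor.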

Iterative Segment

4. Model Building
In this phase, we will create the model and determine the predictor and target variables. We will experiment with multiple approaches based on our initial understanding of the dataset from the exploration phase. These could range from visualizations to machine learning algorithms, depending on what best achieves our client's objective.

5. Model Validation
We propose a multi-variate methodology of sampling data in order to validate our model. Specifically, we will use the three-way approach to model validation known as “train, test and validate”. Because we want to avoid overfitting and bias in our models, we will aim for a more rigorous testing process with a larger amount of sample data.
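The three-way split can be sketched as follows. This is a minimal illustration assuming a simple random split; the 60/20/20 proportions, the fixed seed and the function name `three_way_split` are assumptions, not values decided by the project.

```python
import random


def three_way_split(rows, train_frac=0.6, validate_frac=0.2, seed=42):
    """Shuffle rows and split them into train, validation and test subsets."""
    if not 0 < train_frac + validate_frac < 1:
        raise ValueError("train_frac + validate_frac must leave room for a test set")
    shuffled = list(rows)
    random.Random(seed).shuffle(shuffled)  # fixed seed keeps the split reproducible
    n_train = round(len(shuffled) * train_frac)
    n_validate = round(len(shuffled) * validate_frac)
    train = shuffled[:n_train]
    validate = shuffled[n_train:n_train + n_validate]
    test = shuffled[n_train + n_validate:]  # whatever remains is the test set
    return train, validate, test


train, validate, test = three_way_split(range(100))
print(len(train), len(validate), len(test))  # prints 60 20 20
```

The model is fitted on the training subset, tuned against the validation subset, and scored once on the test subset, so the final score comes from data the model has not seen.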

We will also use benchmark metrics to test our predictive model and ensure that its performance is satisfactory. Should it not be satisfactory, we will go back to phase 4 (model building) or phase 2 (data cleaning and transformation) and rebuild the model until the results are satisfactory.
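The benchmark metrics are not fixed at this stage. As one possible example, the sketch below assumes a classification target and compares model accuracy against a majority-class baseline. The metric, the `min_gain` threshold, the function names and the sample labels are all assumptions for illustration.

```python
from collections import Counter


def accuracy(actual, predicted):
    """Fraction of predictions that match the actual labels."""
    return sum(a == p for a, p in zip(actual, predicted)) / len(actual)


def majority_baseline_accuracy(actual):
    """Accuracy obtained by always predicting the most common label."""
    return Counter(actual).most_common(1)[0][1] / len(actual)


def is_satisfactory(actual, predicted, min_gain=0.05):
    """True when the model beats the majority-class baseline by at least min_gain."""
    gain = accuracy(actual, predicted) - majority_baseline_accuracy(actual)
    return gain >= min_gain


# Hypothetical labels and predictions on a held-out set.
actual = [1, 0, 1, 1, 0, 1, 0, 1]
predicted = [1, 0, 1, 0, 0, 1, 0, 1]
print(is_satisfactory(actual, predicted))  # prints True
```

A `False` result from a check like this would be the signal to return to phase 4 or phase 2.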

REFERENCES