Difference between revisions of "ANLY482 AY2017-18T2 Group06 Project Overview"


Revision as of 17:04, 6 January 2018


MOTIVATION

 

OBJECTIVES

 

METHODOLOGY

Exploratory Segment

1. Data Collection
In the initial phase of data collection, we must ensure that the dataset contains all the fields needed for modelling later on.
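
As a rough illustration of the field check described above, the sketch below reports which required fields are absent from a batch of collected records. The field names and the `missing_fields` helper are hypothetical, since the actual fields depend on the client's dataset.

```python
# Hypothetical field names; the real list depends on the client's dataset.
REQUIRED_FIELDS = {"customer_id", "transaction_date", "amount"}


def missing_fields(records, required=REQUIRED_FIELDS):
    """Return, sorted, the required fields absent from at least one record."""
    missing = set()
    for record in records:
        # A record is a dict; compare its keys against the required set.
        missing |= set(required) - set(record)
    return sorted(missing)


sample = [
    {"customer_id": 1, "transaction_date": "2018-01-06", "amount": 12.5},
    {"customer_id": 2, "amount": 7.0},  # transaction_date was not collected
]
print(missing_fields(sample))
```

Running such a check as soon as data arrives makes gaps visible while there is still time to request the missing fields.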

2. Data Cleaning + Transformation
In the data cleaning and transformation phase, the data will be cleaned and transformed into the statistical and analytical variables needed for prediction later.
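
One possible shape for this step, shown only as a sketch: drop missing entries from a numeric column and rescale what remains to zero mean and unit variance. The choice of z-score standardization and the `clean_and_standardize` helper are assumptions for illustration. The transformations the project actually needs depend on the dataset.

```python
from statistics import mean, pstdev


def clean_and_standardize(values):
    """Drop missing entries (None), then rescale to zero mean and unit variance."""
    kept = [v for v in values if v is not None]
    if not kept:
        return []
    mu, sigma = mean(kept), pstdev(kept)
    if sigma == 0:
        # A constant column carries no spread; map every value to 0.
        return [0.0 for _ in kept]
    return [(v - mu) / sigma for v in kept]


raw = [10.0, None, 12.0, 14.0]
z = clean_and_standardize(raw)
print(z)
```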

3. Initial Data Exploration
In this phase, we will explore the data and determine the modelling approach based on the nature of the dataset. Necessary preparations, such as checking the variables for multicollinearity, will be carried out before any modelling is done. Due to the nature of our dataset, careful data exploration is essential.
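
A simple first pass at the multicollinearity check mentioned above is to flag pairs of variables whose Pearson correlation is very high. The sketch below assumes a threshold of 0.9 and made-up column names. Pairwise correlation is only a proxy, and a fuller diagnostic such as the variance inflation factor may be preferable once a modelling approach is chosen.

```python
from itertools import combinations
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length, non-constant sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


def collinear_pairs(columns, threshold=0.9):
    """Return (name_a, name_b, r) for every pair with |r| >= threshold."""
    flagged = []
    for (name_a, a), (name_b, b) in combinations(columns.items(), 2):
        r = pearson(a, b)
        if abs(r) >= threshold:
            flagged.append((name_a, name_b, round(r, 3)))
    return flagged


# Illustrative data: x2 is roughly twice x1, x3 is unrelated.
data = {
    "x1": [1.0, 2.0, 3.0, 4.0, 5.0],
    "x2": [2.1, 3.9, 6.2, 8.0, 9.9],
    "x3": [5.0, 1.0, 4.0, 2.0, 3.0],
}
print(collinear_pairs(data))
```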

Iterative Segment

4. Model Building
We will create the model and determine the predictor and target variables. In this phase, we will experiment with multiple approaches based on our initial understanding of the dataset from the exploration. These could range from visualizations to machine learning algorithms, in order to achieve our client's objective.

5. Model Validation
We propose a multi-variate methodology of sampling data to validate our model. Specifically, we will use the three-way approach to model validation known as “train, test and validate”. Because we want to avoid overfitting and bias in our models, we will aim for a more rigorous testing process with a larger amount of sample data.
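
The three-way split could be sketched as follows. The 60/20/20 proportions, the fixed seed, and the `three_way_split` helper are assumptions for illustration, not figures from the project plan.

```python
import random


def three_way_split(rows, train=0.6, validate=0.2, seed=42):
    """Shuffle rows and split them into train, validate, and test partitions.

    Whatever remains after the train and validate shares goes to test.
    """
    if train <= 0 or validate <= 0 or train + validate >= 1:
        raise ValueError("train and validate must be positive and sum to < 1")
    shuffled = list(rows)
    random.Random(seed).shuffle(shuffled)  # seeded for reproducibility
    n_train = int(len(shuffled) * train)
    n_val = int(len(shuffled) * validate)
    return (
        shuffled[:n_train],
        shuffled[n_train:n_train + n_val],
        shuffled[n_train + n_val:],
    )


train_set, validate_set, test_set = three_way_split(range(100))
print(len(train_set), len(validate_set), len(test_set))
```

The model is fitted on the training partition, tuned against the validation partition, and scored once on the held-out test partition, so the final estimate is not inflated by tuning.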

We will also use benchmark metrics to test our predictive model and ensure that its performance is satisfactory. Should it not be, we will return to phase 4 (Model Building) or phase 2 (Data Cleaning + Transformation) and rebuild the model until the results are satisfactory.
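
As an example of a benchmark check, the sketch below compares a model's root mean squared error (RMSE) against a naive baseline that always predicts the mean. Both the choice of RMSE and the mean baseline are assumptions. The actual metrics will be agreed on once the modelling approach is fixed.

```python
from math import sqrt


def rmse(actual, predicted):
    """Root mean squared error between two equal-length sequences."""
    return sqrt(sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual))


def beats_baseline(actual, predicted):
    """True if the model's RMSE is lower than always predicting the mean."""
    baseline = [sum(actual) / len(actual)] * len(actual)
    return rmse(actual, predicted) < rmse(actual, baseline)


# Illustrative values only.
actual = [3.0, 5.0, 7.0, 9.0]
good = [2.8, 5.1, 7.2, 8.7]
poor = [9.0, 3.0, 9.0, 3.0]
print(beats_baseline(actual, good), beats_baseline(actual, poor))
```

A model that fails this kind of check would send the team back to phase 4 or phase 2 for another iteration.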

REFERENCES