Difference between revisions of "ANLY482 AY2017-18T2 Group06 Project Overview"
Revision as of 17:14, 6 January 2018
Proprietary trading has long relied on computers to help automate and execute trades. Data scientists, more commonly known on Wall Street as quants, have developed large statistical models for this automation. These models, though complex, are largely static, and as the market changes, which financial markets commonly do, they no longer work as well as they did in the past.
As technology has advanced, we have entered an era of Artificial Intelligence and Machine Learning. Systems can now analyse large amounts of data at speed and improve themselves in the process. Evolutionary computation and deep learning are seen as able to automatically recognise changes in the market and adapt in ways that the earlier statistical models cannot.
SPONSOR BACKGROUND
MOTIVATION
OBJECTIVES
METHODOLOGY
Exploratory Segment
1. Data Collection
In the initial phase of data collection, we must ensure that we have sufficient fields for the modelling later on.
2. Data Cleaning + Transformation
In the data cleaning and transformation phase, the data will be transformed into the statistical and analytical parameters needed for prediction later.
3. Initial Data Exploration
In this phase, we will explore the data and determine the modelling approach based on the nature of the dataset. Necessary preparations, such as checking the variables for multicollinearity, will be carried out before any modelling is done. Due to the nature of our dataset, careful data exploration is required.
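As an illustration of the multicollinearity check mentioned in phase 3, the following is a minimal sketch that flags pairs of variables with a high pairwise Pearson correlation. The column names, the sample values and the 0.9 threshold are hypothetical and are not taken from the project dataset. Pairwise correlation is only a first screen; it will not detect a variable that is a combination of several others.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length numeric lists."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


def correlated_pairs(columns, threshold=0.9):
    """Return (name_a, name_b, r) for every pair with |r| >= threshold."""
    names = sorted(columns)
    flagged = []
    for i, first in enumerate(names):
        for second in names[i + 1:]:
            r = pearson(columns[first], columns[second])
            if abs(r) >= threshold:
                flagged.append((first, second, r))
    return flagged


# Hypothetical sample data: "close_scaled" is exactly twice "close",
# so the two columns carry the same information.
features = {
    "close": [10.0, 10.5, 10.2, 10.8, 11.0],
    "close_scaled": [20.0, 21.0, 20.4, 21.6, 22.0],
    "volume": [300.0, 280.0, 310.0, 250.0, 305.0],
}

flagged = correlated_pairs(features)
for first, second, r in flagged:
    print(f"{first} and {second} are highly correlated (r = {r:.3f})")
```

A flagged pair would typically lead to dropping or combining one of the two variables before model building.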
Iterative Segment
4. Model Building
This phase involves creating the model and determining the predictor and target variables. We will experiment with multiple approaches based on our initial understanding of the dataset after the exploration. These could range from visualizations to machine learning algorithms, chosen to achieve our client's objective.
5. Model Validation
We will propose a multi-variate methodology of sampling data in order to validate our model. We will use the three-way approach to model validation known as “train, test and validate”. Because of the nature of the project, we want to avoid overfitting and bias in our models, so we will aim for a more rigorous testing process with a larger amount of sample data.
We will also use benchmark metrics to test our predictive model and ensure that it is satisfactory. Should it not be, we will go back to phase 4 (model building) or phase 2 (data cleaning and transformation) and rebuild the model until the results are satisfactory.
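The “train, test and validate” approach in phase 5 could be sketched as below. This is only an illustration under stated assumptions: the 60/20/20 proportions, the function name chronological_split and the choice to keep rows in time order (so that later market data is never used to fit a model evaluated on earlier data) are ours for the example and are not specified in the project plan.

```python
def chronological_split(rows, train_frac=0.6, validate_frac=0.2):
    """Split time-ordered rows into train, validate and test partitions.

    The rows are not shuffled: the earliest rows go to training, the next
    block to validation, and the most recent rows to the final test set.
    """
    if not 0 < train_frac < 1 or not 0 < validate_frac < 1:
        raise ValueError("fractions must be between 0 and 1")
    if train_frac + validate_frac >= 1:
        raise ValueError("train and validate fractions must leave room for a test set")

    n = len(rows)
    train_end = int(n * train_frac)
    validate_end = train_end + int(n * validate_frac)
    return rows[:train_end], rows[train_end:validate_end], rows[validate_end:]


# Hypothetical stand-in for ten time-ordered observations.
observations = list(range(10))
train, validate, test = chronological_split(observations)

print("train:", train)
print("validate:", validate)
print("test:", test)
```

The validation partition would be used to compare candidate models from phase 4, while the test partition is held back until the end so that the benchmark metrics are measured on data the model has not seen.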
Actionable Segment
6. Prediction/ Prescription
After the modelling is completed, we intend to integrate the model with our client’s existing system and brokerage system as a form of forward testing. The predictor will also produce its predictions in real time.