Difference between revisions of "ANLY482 AY2017-18T2 Group06 Project Overview"

From Analytics Practicum
 
(6 intermediate revisions by the same user not shown)
Line 41: Line 41:
 
 
 
 
  
==<div style="background: #708090; line-height: 0.5em; font-family:'Century Gothic';  border-left: #2E5593 solid 15px;"><div style="border-left: #FFFFFF solid 5px; padding:15px;font-size:15px;"><font color= "#F2F1EF">OBJECTIVES</font></div></div>==
+
==<div style="background: #708090; line-height: 0.5em; font-family:'Century Gothic';  border-left: #2E5593 solid 15px;"><div style="border-left: #FFFFFF solid 5px; padding:15px;font-size:15px;"><font color= "#F2F1EF">MOTIVATION &OBJECTIVES</font></div></div>==
  
 +
The price movements of foreign exchange rate currency pairs have always been an instrument of focus by financial institutions and investors.
  
Utilising the minute tick data from our sponsor, we would like to discover useful and practical insights which will allow traders to make more informed decisions in their trading. We would be coming up with a predictive modelling for currency pair.  
+
Currently, pH7 views technical analysis models through their brokerage provided dashboards which do not deliver any combined analysis across more than one technical analysis model or provide any form of suggested trading action they should take. They expressed an interest in using Bollinger Bands together with Relative Strength Index (RSI) to better understand the price movement patterns.
  
The team and our sponsor pH7 Global have identified 2 areas of focus for this project:
+
Therefore, we intend to use technical analysis-Bollinger Bands, RSI and Time Series Forecasting- ARIMA method to analyze price movements and provide a form of trading action which they could adopt. Our objective is to develop a simple and yet useful R-Markdown file that our sponsor would be able to edit and deploy to generate insights for his future trade executions.
  
1. Preliminary Data Analysis and Information Research
+
With our methodologies used to deduce these insights, this would allow them to forecast future trends and behaviors in the financial markets.
<br>
 
2. Predictive Algorithm Modeling and Strategy Testing
 
  
At the end of the project, the teams aims to design a unique predictive model from the data insights discovered during the analysis.
 
 
&nbsp;
 
&nbsp;
  
 
==<div style="background: #708090; line-height: 0.5em; font-family:'Century Gothic';  border-left: #2E5593 solid 15px;"><div style="border-left: #FFFFFF solid 5px; padding:15px;font-size:15px;"><font color= "#F2F1EF">METHODOLOGY</font></div></div>==
 
Our methodology will be a 5 step approach to data prediction, explanation modelling for USD/JPY 1 minute chart.
+
Our methodology will be a 5-step approach for the analysis on the time series data for foreign exchange currency pairs.
 
<br>
 
<br>
 
===<div style="font-family:'Century Gothic';">Exploratory Segment</div>===
 
Line 63: Line 61:
 
<br>
 
<br>
 
<b>2. Data Cleaning + Transformation</b> <br>
 
In the data cleaning and transformation phase, the data would be tweaked into necessary statistical and analytics parameters necessary for prediction later. <br>
+
In the data cleaning and transformation phase, the data would be tweaked into necessary statistical and analytics parameters necessary for running analysis models later. <br>
 
<br>
 
<br>
 
<b>3. Initial Data Exploration</b> <br>
 
In this area, the data would be initially explored and we would determine the approach of modelling based on the nature of the dataset. Necessary preparations such as checking for multicollinearity of the variables would be taken into consideration before modelling of the variables would be done. Due to the nature of our dataset, careful data exploration must be done.
+
In this area, the data would be initially explored, and we would determine the approach of analysis model based on the nature of the dataset. The nature of our dataset focuses on time series and price related movements, careful data exploration must be done to understand the best tools to use.
  
 
===<div style="font-family:'Century Gothic';">Iterative Segment</div>===
 
<b>4. Model Building</b> <br>
+
<b>4. Selecting and Deploying the Analysis Model</b> <br>
Creating model, determining predictor and target variables. In this area, we would be experimenting with multiple different approaches based on our initial understanding of the dataset after the exploration. It could range from visualizations to machine learning algorithms to achieve the objectives of our client. <br>
+
In this area, we would be experimenting with multiple different analysis approaches based on our initial understanding of the dataset after the exploration. It could range from forecasting to technical analysis, discovering seasonal trends and visualizations to uncover time series patterns to achieve the objectives of our client. <br>
 
<br>
 
<br>
 
<b>5. Model Validation</b> <br>
 
We would be proposing a multi-variate methodology of sampling data in order to validate our model. In this aspect, we would be using the 3 way of approach of model validation called “train, test and validate”. Due to the nature of the project, we would like to avoid overfitting and bias in our models. Hence, we will be aiming for a more rigorous testing process with a larger sample data size to avoid such issues. <br>
+
We would be proposing a multi-variate methodology of sampling data to validate our analysis model. In this aspect, we would be using the 2-way of approach of model validation called “train and test”. <br>
 
<br>
 
<br>
We would also be using benchmark metrics to test our predictive modelling to ensure that it is satisfactory. Should it not be satisfactory, we would go back to phase 4 of model building or phase 2 to rebuild the model till the results is satisfactory. <br>
+
We would also be using benchmark metrics to test our analysis models to ensure that it is satisfactory. Should it not be satisfactory, we would go back to phase 4 of model building or phase 2 to rebuild the model till the results is satisfactory. <br>
  
==<div style="background: #708090; line-height: 0.5em; font-family:'Century Gothic';  border-left: #2E5593 solid 15px;"><div style="border-left: #FFFFFF solid 5px; padding:15px;font-size:15px;"><font color= "#F2F1EF">DATA</font></div></div>==
+
==<div style="background: #708090; line-height: 0.5em; font-family:'Century Gothic';  border-left: #2E5593 solid 15px;"><div style="border-left: #FFFFFF solid 5px; padding:15px;font-size:15px;"><font color= "#F2F1EF">REFERENCES</font></div></div>==
 
+
AS, B., & SK, R. (2015). Exchange Rate Forecasting using ARIMA, Neural Network and Fuzzy Neuron. Retrieved from https://pdfs.semanticscholar.org/c229/b2436364db18b9fb51cd2974b1b4d6766f02.pdf.<BR>
===<div style="font-family:'Century Gothic';">Data Source</div>===
 
<p style="padding-left: 1cm;">
 
 
 
The dataset given to us includes multiple timeframes of the same period of time series data for a 2 years’ time period; 1st July 2015 to 30th June 2017.<BR>
 
<BR>
 
The data fields include:<BR>
 
- Timestamp (timestamp of the data)
 
<BR>
 
- High (High point of the currency pair for the minute)
 
<BR>
 
- Low (Low point of the currency pair for the minute)
 
<BR>
 
- Open (open price of the currency pair for the minute)
 
<BR>
 
- Close (closing price of the currency pair for the minute)
 
 
<BR>
 
<BR>
[[Image:Data1.PNG|centre|600px|]]
+
B. (2017). Monetary Policy. Retrieved from https://www.boj.or.jp/en/mopo/mpmdeci/mpr_2017/index.htm/<BR>
 
 
To access our client’s database, we used Rstudio codes to directly access the AWS servers and retrieve the data as needed for our analysis. This gave us the flexibility of choosing time periods we want to work with for our analysis.
 
 
 
The resulting data retrieval for 2 years worth of minute tick data for 1 currency pairs comes close to 750,000 rows.
 
 
 
[[Image:Data2.PNG|centre|600px|]]
 
 
 
==<div style="background: #708090; line-height: 0.5em; font-family:'Century Gothic';  border-left: #2E5593 solid 15px;"><div style="border-left: #FFFFFF solid 5px; padding:15px;font-size:15px;"><font color= "#F2F1EF">SCOPE OF WORK</font></div></div>==
 
We intend to adopt the following steps in our analysis:<BR>
 
 
<BR>
 
<BR>
• Discover insights within the provided data
+
BAASHER, A. A., & FAKHR, M. W. (n.d.). FOREX Trend Classification using Machine Learning Techniques. Retrieved from https://pdfs.semanticscholar.org/3c2f/cbcb9bdc0205e924c0f2518d01864da8979a.pdf <BR>
 
<BR>
 
<BR>
• To collect and ensure the data of currency pair is relevant in modelling
+
Balsara, N. J., Chen, G., & Zheng, L. (2007). The Chinese stock market: An examination of the random walk model and technical trading rules. Quarterly Journal of Business & Economics, 46(2), 43–63. <BR>
 
<BR>
 
<BR>
• Ensure accuracy of data by checking for multicollinearity during data exploration stage 
+
Brewer, M. J., Butler, A., & Cooksley, S. L. (n.d.). The Relative Performance of AIC, AICC and BIC in the Presence of Unobserved Heterogeneity. Retrieved from https://besjournals.onlinelibrary.wiley.com/doi/abs/10.1111/2041-210X.12541.<BR>
 
<BR>
 
<BR>
• Identification of approaches that range from visualization to machine learning algorithms to determine predictor and target variables
+
Butler, M., & Kazakov, D. (2010). Particle swarm optimization of Bollinger Bands. In Swarm Intelligence (pp. 504–511), Springer, Berlin.<BR>
 
<BR>
 
<BR>
• Validate model through “train, test and validate”
+
Jebb, A. T., Tay, L., Wang, W., & Huang, Q. (2015). Time series analysis for psychological research: Examining and forecasting change. <BR>
 
<BR>
 
<BR>
• Use a large sample data to prevent overfitting and bias in our model
+
J Hyndman, R. (n.d.). ARIMA modelling in R. Retrieved from https://www.otexts.org/fpp/8/7 <BR>
 
<BR>
 
<BR>
• Design a unique predictive model
+
Kamruzzamana, J. and Sarkerb, R. A. (2003). Comparing ANN Based Models with ARIMA for Prediction of Forex Rates . Retrieved from https://pdfs.semanticscholar.org/959e/dc19a0dfdc94464ac7d6d1f0e2927000d565.pdf <BR>
 
<BR>
 
<BR>
• Utilisation of benchmark metrics to test the success rate of the predictive model
+
Kiiski, J. (2009). PERFORMANCE OF RSI INVESTMENT STRATEGY ON FOREIGN EXCHANGE MARKETS. Retrieved from https://besjournals.onlinelibrary.wiley.com/doi/abs/10.1111/2041-210X.12541. <BR>
 +
<BR>
 +
Kuepper, J. (n.d.). Technical Analysis: Indicators And Oscillators. Retrieved from https://www.investopedia.com/university/technical/techanalysis10.asp#ixzz5B2AU2GDa <BR>
 
<BR>
 
<BR>
• It is important to note that the scope of the project is versatile and can be furthered to address additional questions pH7 might have on the dataset
+
Nau, R. (2017, December 14). Identifying the numbers of AR or MA terms in an ARIMA model. Retrieved from https://people.duke.edu/~rnau/411home.htm <BR>
 
<BR>
 
<BR>
 
+
Petrusheva, N., & Jordanoski, I. (2016). COMPARATIVE ANALYSIS BETWEEN THE FUNDAMENTAL AND TECHNICAL ANALYSIS OF STOCKS. Retrieved from http://scindeks-clanci.ceon.rs/data/pdf/2334-735X/2016/2334-735X1602026P.pdf <BR>
==<div style="background: #708090; line-height: 0.5em; font-family:'Century Gothic';  border-left: #2E5593 solid 15px;"><div style="border-left: #FFFFFF solid 5px; padding:15px;font-size:15px;"><font color= "#F2F1EF">REFERENCES</font></div></div>==
 
B. (n.d.). Dynamic Bayesian Networks. Retrieved November 3, 2010. Retrieved from https://bi.snu.ac.kr/Courses/g-ai10f/Ch9_DBN.pdf. <BR>
 
<BR>
 
Gray, A. (2017, March 9). The world’s 10 biggest economies in 2017. Retrieved from https://www.weforum.org/agenda/2017/03/worlds-biggest-economies-in-2017/<BR>
 
<BR>
 
PETRICĂ, A., STANCU, S., & TINDECHE, A. (2016). Limitation of ARIMA models in financial and monetary economics. Retrieved from http://store.ectap.ro/articole/1222.pdf<BR>
 
 
<BR>
 
<BR>
Rise of the billionaire robots: how algorithms have redefined hedge funds. (2016, May 15). Retrieved from https://www.theguardian.com/business/us-money-blog/2016/may/15/hedge-fund-managers-algorithms-robots-investment-tips<BR>
+
S. (2017). April 2017 Current Events: U.S. News. Retrieved from https://www.infoplease.com/world/2017-current-events/april-2017-current-events-us-news <BR>
 
<BR>
 
<BR>
Satariano, A., & Kumar, N. (2017, September 27). The Massive Hedge Fund Betting on AI. Retrieved from https://www.bloomberg.com/news/features/2017-09-27/the-massive-hedge-fund-betting-on-ai <BR>
+
Williams, O. D. (2006). Empirical Optimization of BBsFor Profitability. 1-72. Retrieved from file:///C:/Users/User/Downloads/etd2519 (1).pdf. <BR>
 
<BR>
 
<BR>
US, I. F. (2011). U.S. Dollar Index. Retrieved from https://www.theice.com/publicdocs/ICE_USDX_Brochure.pdf.
 
 
 
<!--Body End-->
 

Latest revision as of 02:28, 14 April 2018



 
The ability to understand and visualize price movements in foreign exchange rates plays an important role in discovering insights for trading companies. Market fundamentals alone are not sufficient to explain irrational market behavior over short time frames, such as movements within seconds or minutes. To address this problem, better models are required to reveal the underlying price patterns. Such models allow us to better understand the movement of the US Dollar to Japanese Yen (USD/JPY) currency pair and to discover actionable insights using the two techniques covered in our paper: Technical Analysis and Time Series Forecasting.

Using the existing market data that pH7 has collected, this research study documents our process for understanding currency price movements. It starts with an overview of the business and research motivations for understanding trends in USD/JPY across different time frames and time periods. Through our consolidated findings and an ARIMA model that forecasts USD/JPY price movements, we hope to find actionable insights with which to advise our client on a better approach to trading this currency pair.




 

MOTIVATION & OBJECTIVES

The price movements of foreign exchange currency pairs have always been a focus of financial institutions and investors.

Currently, pH7 views technical analysis models through dashboards provided by its brokerage, which neither combine analysis across more than one technical analysis model nor suggest any trading action to take. pH7 expressed an interest in using Bollinger Bands together with the Relative Strength Index (RSI) to better understand price movement patterns.

Therefore, we intend to use technical analysis (Bollinger Bands and RSI) and time series forecasting (the ARIMA method) to analyze price movements and suggest a trading action that pH7 could adopt. Our objective is to develop a simple yet useful R Markdown file that our sponsor can edit and deploy to generate insights for future trade executions.

The methodologies used to derive these insights would also allow pH7 to forecast future trends and behaviors in the financial markets.
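
As a rough illustration of how Bollinger Bands and RSI might be combined into a single suggested trading action, the sketch below flags a "buy" when the latest close is below the lower band while RSI is oversold, and a "sell" in the mirrored case. Python is used purely for illustration (the deliverable itself is an R Markdown file). The function names, the 20-bar window, the 2 standard deviations, the 14-bar RSI period, the 30/70 thresholds and the simple-average form of RSI are all assumptions, not values taken from the project.

```python
from statistics import mean, pstdev


def bollinger_bands(closes, window=20, num_std=2.0):
    """Return (lower, middle, upper) bands from the last `window` closes."""
    recent = closes[-window:]
    middle = mean(recent)
    spread = num_std * pstdev(recent)
    return middle - spread, middle, middle + spread


def rsi(closes, period=14):
    """Simple-average RSI over the last `period` price changes (assumed variant)."""
    changes = [b - a for a, b in zip(closes[-period - 1:-1], closes[-period:])]
    avg_gain = sum(max(c, 0.0) for c in changes) / period
    avg_loss = sum(max(-c, 0.0) for c in changes) / period
    if avg_gain == 0 and avg_loss == 0:
        return 50.0  # flat prices: treat as neutral
    if avg_loss == 0:
        return 100.0  # only gains in the window
    rs = avg_gain / avg_loss
    return 100.0 - 100.0 / (1.0 + rs)


def suggest_action(closes, window=20, period=14, oversold=30.0, overbought=70.0):
    """Combine both indicators into a hypothetical buy/sell/hold suggestion."""
    lower, _, upper = bollinger_bands(closes, window)
    strength = rsi(closes, period)
    last = closes[-1]
    if last < lower and strength < oversold:
        return "buy"
    if last > upper and strength > overbought:
        return "sell"
    return "hold"
```

For example, `suggest_action([110.0] * 19 + [108.0])` sees a sharp drop after a flat stretch, so both indicators agree and the suggestion is "buy". Requiring agreement between the two indicators is one possible design choice; it produces fewer signals than either indicator alone.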

 

METHODOLOGY

Our methodology is a 5-step approach to analyzing time series data for foreign exchange currency pairs.

Exploratory Segment

1. Data Collection
In the initial data collection phase, we must ensure that we have the fields needed for modelling in the later stages.

2. Data Cleaning + Transformation
In the data cleaning and transformation phase, the data would be transformed into the statistical and analytical parameters needed to run the analysis models later.
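
One transformation this phase might include is aggregating 1-minute bars into a coarser time frame. The sketch below assumes each bar is a dictionary with `timestamp`, `open`, `high`, `low` and `close` keys (hypothetical names modelled on the dataset's fields) and that the target bar length divides 60 minutes evenly. It is an illustration in Python, not the project's R code.

```python
from datetime import datetime
from itertools import groupby


def resample_ohlc(bars, minutes=5):
    """Aggregate 1-minute OHLC bars into `minutes`-minute bars.

    Assumes `minutes` divides 60 and each bar is a dict with the keys
    timestamp (datetime), open, high, low and close.
    """
    def bucket(bar):
        # Round the timestamp down to the start of its bucket.
        ts = bar["timestamp"]
        return ts.replace(minute=ts.minute - ts.minute % minutes,
                          second=0, microsecond=0)

    ordered = sorted(bars, key=lambda b: b["timestamp"])
    result = []
    for start, group in groupby(ordered, key=bucket):
        group = list(group)
        result.append({
            "timestamp": start,
            "open": group[0]["open"],                # first open in the bucket
            "high": max(b["high"] for b in group),   # highest high
            "low": min(b["low"] for b in group),     # lowest low
            "close": group[-1]["close"],             # last close in the bucket
        })
    return result
```

Minutes with no bar are simply absent from the output, so gaps in the source data stay visible rather than being filled in silently.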

3. Initial Data Exploration
In this phase, we would explore the data and determine the analysis approach based on the nature of the dataset. Because our dataset consists of time series of price movements, careful data exploration must be done to identify the best tools to use.

Iterative Segment

4. Selecting and Deploying the Analysis Model
In this phase, we would experiment with multiple analysis approaches based on our understanding of the dataset after the exploration. These could range from forecasting and technical analysis to discovering seasonal trends and building visualizations that uncover time series patterns, in order to achieve the objectives of our client.
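
To make the forecasting option concrete, the sketch below fits the simplest non-trivial ARIMA form, ARIMA(1,1,0) without a constant, by differencing the prices once and estimating a single autoregressive coefficient with least squares. This is a hand-rolled illustration in Python under those assumptions. A real analysis would normally rely on a statistics library to select the orders and estimate the model, and the function names here are hypothetical.

```python
def fit_ar1_on_differences(prices):
    """Estimate phi in d[t] = phi * d[t-1], where d is the first difference."""
    diffs = [b - a for a, b in zip(prices, prices[1:])]
    previous, current = diffs[:-1], diffs[1:]
    denom = sum(p * p for p in previous)
    if denom == 0:
        return 0.0  # no variation in the differences: fall back to a random walk
    return sum(p * c for p, c in zip(previous, current)) / denom


def forecast_arima_110(prices, steps=3):
    """Forecast `steps` prices ahead by rolling the AR(1) on differences forward."""
    phi = fit_ar1_on_differences(prices)
    level = prices[-1]
    diff = prices[-1] - prices[-2]
    forecasts = []
    for _ in range(steps):
        diff = phi * diff      # predicted next change
        level = level + diff   # undo the differencing
        forecasts.append(level)
    return forecasts
```

With prices `[100.0, 108.0, 112.0, 114.0, 115.0]` each change is half the previous one, so the fitted coefficient is 0.5 and the three-step forecast is `[115.5, 115.75, 115.875]`.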

5. Model Validation
We propose a multivariate methodology of sampling data to validate our analysis model. For this, we would use the two-way approach to model validation known as “train and test”.

We would also use benchmark metrics to test our analysis models and ensure that they are satisfactory. Should the results not be satisfactory, we would go back to phase 4 (selecting and deploying the analysis model) or phase 2 (data cleaning and transformation) and rebuild the model until the results are satisfactory.
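
A minimal sketch of a "train and test" check for time series is given below. It assumes a chronological split (no shuffling, so the test set is always later than the training set), root mean squared error as the metric, and a naive last-value forecast as the benchmark. The text does not name the benchmark metrics used, so these choices and the function names are assumptions made for illustration in Python.

```python
from math import sqrt


def chronological_split(series, train_fraction=0.8):
    """Split a time-ordered series into earlier (train) and later (test) parts."""
    cut = int(len(series) * train_fraction)
    return series[:cut], series[cut:]


def rmse(actual, predicted):
    """Root mean squared error between two equal-length sequences."""
    return sqrt(sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual))


def naive_forecast(train, horizon):
    """Benchmark: repeat the last observed training value for every step."""
    return [train[-1]] * horizon


def beats_benchmark(model_forecast, train, test):
    """True if the model's RMSE on the test set is below the naive benchmark's."""
    baseline = rmse(test, naive_forecast(train, len(test)))
    return rmse(test, model_forecast) < baseline
```

A model whose test error is no better than the naive benchmark would be a signal to return to an earlier phase, as described above.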
