IS480 Team wiki: 2016T1 Charlies-Angels-Algorithm


Pseudo Codes


The following CPF calculations are a new feature and, to our knowledge, a form of calculation not seen before in the industry: we take CPF and housing loan information into account to help the user plan an investment strategy that lets him/her retire comfortably and meet his/her goals.

CPF Calculations

  • 1. Retrieve all variables from cpf.csv
  • 2. Loop through all the portfolios that the user has and calculate the dividend yield amount and realised profit for every year since the creation of each portfolio.
  • 3. Retrieve user’s CPF variables from database
  • 4. Calculate the amount needed from the current age to the death age by looping over every year in between.
    • a. For each year, multiply the target pension by the inflation rate to get that year's inflation-adjusted amount.
    • b. Sum the yearly amounts to get the total needed from the current age to the death age.
  • 5. Next, calculate the total amount of money in the ordinary and special accounts when the retirement account opens. This is done by looping from the current age to the age at which the retirement account opens.
    • a. For each age, check whether the age is below 35/45/50 and multiply by the respective percentage, as contribution rates differ by age band.
    • b. While looping, the monthly housing loan repayment is subtracted from the ordinary account.
    • c. In addition, the ordinary, special and medisave accounts are each multiplied by their respective interest rates.
    • d. As the monthly salary on which CPF contributions are calculated is capped at 6000, a check is done to make sure that contributions are never calculated on more than this amount.
    • e. In the event that medisave hits its cap, which is 49800 as of 2016, the rest of the contribution to medisave flows over to the special account. However, if the special account has hit the Full Retirement Sum (FRS), which is 161000 as of 2016, the rest of the contribution to medisave flows over to the ordinary account instead.
  • 6. The user’s bank savings are also looped from the current age to the retirement age to calculate the interest earned. The balance is multiplied by the bank’s interest rate each year.
  • 7. At age 55, the retirement account is opened.
    • a. The amount of money in the retirement account is calculated by adding the amounts in the special and ordinary accounts.
    • b. A check is done to see whether the user has hit the Basic Retirement Sum (BRS) or the FRS in the retirement account.
      • i. If the user has enough money to hit the BRS but not the FRS, any amount above the BRS is transferred to his/her bank account.
      • ii. If the user has enough money to hit the FRS, any amount above the FRS is transferred to his/her bank account.
      • iii. If the user does not have enough money to hit either the BRS or the FRS, nothing is transferred.
  • 8. Next, a loop is used to calculate the number of years that the user’s money in both CPF and the bank account can last.
    • a. This shows how many years the user’s money can last if he/she does nothing besides working until retirement.
  • 9. Lastly, loop from the age at which the retirement account opens to the payout age. This calculates the amount of money in the retirement account with interest, by multiplying the retirement account balance by the retirement account interest rate each year.
  • 10. After all the calculations, a crawl is done to obtain the STI 10-year return.
    • a. The return lets the user know how much he/she has to invest in the STI in order to achieve his/her retirement goals.
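
Parts of the steps above can be sketched in Java as follows. This is only an illustrative sketch, not the team's actual code: the class and method names are hypothetical, step 4 is assumed to compound inflation once a year, step 5e is assumed to top up the special account only as far as the FRS, and the BRS is passed in as a parameter because its value is not stated above.

```java
final class CpfSketch {
    static final double SALARY_CAP = 6000;    // monthly salary cap from step 5d
    static final double MEDISAVE_CAP = 49800; // 2016 figure from step 5e
    static final double FRS = 161000;         // 2016 figure from step 5e

    /** Step 4: sum the inflation-adjusted pension for each year from currentAge up to deathAge. */
    static double amountNeeded(int currentAge, int deathAge, double targetPension, double inflationRate) {
        double total = 0;
        double pension = targetPension;
        for (int age = currentAge; age < deathAge; age++) {
            total += pension;
            pension *= 1 + inflationRate; // assumption: inflation compounds once a year
        }
        return total;
    }

    /** Step 5d: contributions are calculated on at most 6000 of monthly salary. */
    static double cappedSalary(double monthlySalary) {
        return Math.min(monthlySalary, SALARY_CAP);
    }

    /**
     * Step 5e: credit a medisave contribution, letting the excess flow to the
     * special account and then to the ordinary account.
     * Returns {ordinary, special, medisave}.
     */
    static double[] creditMedisave(double ordinary, double special, double medisave, double contribution) {
        double toMedisave = Math.min(contribution, Math.max(0, MEDISAVE_CAP - medisave));
        double overflow = contribution - toMedisave;
        // Assumption: the special account is topped up only as far as the FRS.
        double toSpecial = Math.min(overflow, Math.max(0, FRS - special));
        double toOrdinary = overflow - toSpecial;
        return new double[] {ordinary + toOrdinary, special + toSpecial, medisave + toMedisave};
    }

    /**
     * Step 7: open the retirement account at age 55.
     * Returns {retirementAccount, transferredToBank}.
     */
    static double[] openRetirementAccount(double ordinary, double special, double brs) {
        double total = ordinary + special;
        if (total >= FRS) {
            return new double[] {FRS, total - FRS}; // 7b-ii: keep the FRS, move the rest to the bank
        }
        if (total >= brs) {
            return new double[] {brs, total - brs}; // 7b-i: keep the BRS, move the rest to the bank
        }
        return new double[] {total, 0};             // 7b-iii: nothing is transferred
    }
}
```

For example, with 100000 in the ordinary account and 80000 in the special account, `openRetirementAccount` keeps 161000 in the retirement account and transfers 19000 to the bank.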

For more information, such as the variables involved in these calculations, please view our document here: Pseudo Code for CPF Calculations

Scheduler (Data Crawling)

At a scheduled date and time, a cron job (a scheduled task on Linux) automatically executes a script file that contains a java command. The command runs the crawler class, which carries out the following steps.

  • Loop the list of symbols
  • For each symbol, access the website URL
    • If unable to access website URL, use user agent/proxy
  • Select the code pattern that we have identified to scrape
  • Process the texts that were scraped
  • If text is not what we wanted, ignore it.
  • If text is what we wanted,
    • Initialise an object class that stores all the retrieved information
    • Add the object class into an array list
    • Pass the array list to a data access object class
    • Loop the array list and add all the information into the database

This pseudo code runs for each piece of information that needs to be crawled/scraped off the website.
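
The crawl loop above could look roughly like the following Java sketch. It is an illustration under stated assumptions rather than the team's actual class: `CrawlSketch`, `Quote`, the `fetchPage` function and the pattern passed in are hypothetical, and the page fetch is supplied as a function so that the user agent/proxy retry can sit behind it.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Function;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class CrawlSketch {
    /** Object that stores the retrieved information for one symbol (hypothetical shape). */
    static final class Quote {
        final String symbol;
        final double price;

        Quote(String symbol, double price) {
            this.symbol = symbol;
            this.price = price;
        }
    }

    /**
     * Loops over the symbols, fetches each page and keeps only text that matches the pattern.
     * fetchPage returns the page text for a symbol, or null if the page cannot be reached.
     * The pattern must capture the numeric value in its first group.
     */
    static List<Quote> crawl(List<String> symbols, Function<String, String> fetchPage, Pattern pattern) {
        List<Quote> quotes = new ArrayList<>();
        for (String symbol : symbols) {
            String page = fetchPage.apply(symbol);
            if (page == null) {
                continue; // website could not be accessed
            }
            Matcher matcher = pattern.matcher(page);
            if (!matcher.find()) {
                continue; // text is not what we wanted, ignore it
            }
            quotes.add(new Quote(symbol, Double.parseDouble(matcher.group(1))));
        }
        return quotes; // the caller passes this list to a data access object for saving
    }
}
```

Returning the list instead of writing to the database inside the loop keeps the scraping logic testable without a database connection.
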
Because the tasks are scheduled through the crontab file, the following entries have to be added to it.
  1. 0 0 1 4 * XX (yearly task)
  2. */5 9-16 * * * XX (five-minute task)
  3. 0 0 1 3/3 * XX (quarterly task)
  4. 0 5 * * * XX (daily task)

^ Where XX is the path/location of the script

For more information, such as the pattern recognition that the system utilizes, please view our document here: Pseudo Code for Scheduler (Data Crawling)

Backend Development [Crawling]

In order for the Stockbook application to receive live data without accessing paid APIs, our team has written several crawling functions in Java that scrape data in real time. Even though we are able to access Yahoo’s free API for both historical and current data, the amount of information it provides is very limited, hence the need for crawling functions that scrape data off several websites.

The list of websites that our application currently scrapes from are:

As scraping data off the internet is fragile (scrapers tend to break very often) and tedious, we have written a secondary scraper that is activated should the first scraper fail. This adds to the technical complexity of the crawling functions.

Pseudo Code:

When the user triggers a function that requires real-time data, the following code runs and scrapes data off the websites stated above. Rough pseudo code is provided below:

1. Access the website.
2. Select the relevant tags based on the pattern given in this document: Pseudo Code for Backend
3. Retrieve the data needed.
4. If the data is not correct or the link is broken:
   a. Activate the secondary crawler and repeat 1-3.
5. If the data is correct and the link is not broken, return the data.

This pseudo code runs for each piece of information that needs to be scraped off the website.
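
The primary/secondary fallback in the steps above can be sketched in Java as shown below. This is a minimal sketch with hypothetical names (`FallbackScraper`, `scrape`), and it assumes that each scraper reports incorrect data as an empty `Optional` and a broken link as a runtime exception.

```java
import java.util.Optional;
import java.util.function.Supplier;

final class FallbackScraper {
    /** Tries the primary scraper first and activates the secondary one only if the primary fails. */
    static <T> Optional<T> scrape(Supplier<Optional<T>> primary, Supplier<Optional<T>> secondary) {
        Optional<T> result = attempt(primary);
        return result.isPresent() ? result : attempt(secondary);
    }

    /** Runs one scraper, treating a runtime exception (e.g. a broken link) as "no data". */
    private static <T> Optional<T> attempt(Supplier<Optional<T>> scraper) {
        try {
            return scraper.get();
        } catch (RuntimeException e) {
            return Optional.empty();
        }
    }
}
```

If both scrapers fail, the caller receives an empty `Optional` and can decide how to report the missing data.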

For more details on the pseudo code for backend crawling of live data (e.g. STI), please refer to Charlies Angels Pseudo Code for Backend Dev.

Backend Updater

In order for the alert function to work in the Stockbook application, our team has written an updater program in Java that is executed via a cron job.

The purpose of this function is discussed below, together with two pseudo codes that explain it further.

Pseudo Code for Refresh Alert Task

The refresh alert task is executed via a cron job at 5am every day. It resets any alert status from “yes” to “no”. This is because the alert function sends an email ONCE a day if an alert hits its alert price/volume/news. When an email is sent to the user, the alert status is changed to “yes”. For the user to receive another email for the same alert, the refresh alert task has to run to change the status from “yes” back to “no”.

  1. Loop through all the users in the user table.
  2. Change all alert status to “no” for all the users retrieved.

Pseudo Code for Alert Task

The alert task is executed via a cron job at 9am every day. It runs in a loop that stops only when the time passes 5pm (market close). On each iteration, it checks whether the alert price/volume/news matches the current price/volume/news. If it matches, an email is sent to the user.

  1. Retrieve and loop through all symbols in database.
  2. Retrieve all news for all the symbols retrieved.
  3. Set time to stop at 1700hrs.
  4. Loop endlessly and stop only if the time is after 1700hrs.
  5. For each loop:
    • Retrieve all stock prices for all the retrieved symbols.
    • Retrieve all the users that have their notifications turned on.
    • For all the users retrieved in the previous step, retrieve all their alerts and email addresses.
      • For each alert, check the current price/volume/news to see if it matches with the alert price/volume/news.
        1. If matches, send an email with the alert.
        2. If it does not match, ignore and keep monitoring.
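
The two tasks above can be sketched together in Java as follows. This is a hedged illustration only: `AlertSketch`, `Alert` and the method names are hypothetical, only price alerts are shown, and a price alert is assumed to "match" once the current price reaches the alert price. Sending the email is left to the caller.

```java
import java.time.LocalTime;
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

final class AlertSketch {
    static final LocalTime MARKET_CLOSE = LocalTime.of(17, 0);

    static final class Alert {
        final String email;
        final String symbol;
        final double alertPrice;
        boolean sent; // mirrors the "yes"/"no" alert status

        Alert(String email, String symbol, double alertPrice) {
            this.email = email;
            this.symbol = symbol;
            this.alertPrice = alertPrice;
        }
    }

    /** Refresh alert task: reset every alert status to "no". */
    static void refresh(List<Alert> alerts) {
        for (Alert alert : alerts) {
            alert.sent = false;
        }
    }

    /** The alert task keeps looping until the time is after 1700hrs. */
    static boolean shouldKeepRunning(LocalTime now) {
        return !now.isAfter(MARKET_CLOSE);
    }

    /** One pass of the loop: returns the alerts that need an email and marks them as sent. */
    static List<Alert> checkOnce(List<Alert> alerts, Map<String, Double> currentPrices) {
        List<Alert> toEmail = new ArrayList<>();
        for (Alert alert : alerts) {
            Double price = currentPrices.get(alert.symbol);
            if (alert.sent || price == null) {
                continue; // already emailed today, or no price available for this symbol
            }
            // Assumption: the alert matches once the price reaches the alert price.
            if (price >= alert.alertPrice) {
                alert.sent = true;
                toEmail.add(alert);
            }
        }
        return toEmail;
    }
}
```

Because `checkOnce` marks an alert as sent, repeated passes during the day return it only once until `refresh` runs again.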

Because the tasks are scheduled through the crontab file, the following entries have to be added to it.

  1. 0 5 * * * XX (refresh alert task)
  2. 0 9 * * * XX (alert task)

*where XX is the path/location of the script

For more details, please refer to the following document: Charlies Angels Pseudo Code for Updater

Formula List


During the course of development for this project, the team has been exposed to numerous formulas that are required for the calculations of certain attributes in a User's portfolio. The two highlights for us are:

  • Average Annual Return
  • Year to Date Return

The following sub-sections explain how we arrive at the two formulas:

Average Annual Return

[((Total Value Sold + Total Current Value – Total Value Bought) / Total Value Bought + 1)^(12/n) – 1] * 100
Where n represents the number of months your portfolio has existed.
This formula represents the average return on investment per year, calculated as a geometric average.
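
As a worked illustration of this formula, here is a small Java sketch. The class and method names are hypothetical, and the total return is assumed to be (sold + current – bought) / bought before it is annualised over n months.

```java
final class AnnualReturnSketch {
    /** Average annual return in percent for a portfolio that has existed for the given number of months. */
    static double averageAnnualReturnPercent(double totalSold, double totalCurrent, double totalBought, int months) {
        double totalReturn = (totalSold + totalCurrent - totalBought) / totalBought;
        // Raise the growth factor to 12/n to scale it to one year (geometric average).
        return (Math.pow(1 + totalReturn, 12.0 / months) - 1) * 100;
    }
}
```

For example, a portfolio bought for 1000 that is worth 1210 after 24 months with nothing sold has a total return of 21%, which annualises to about 10% per year because 1.1 * 1.1 = 1.21.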

Year to Date Return

(Total Value of Stocks Sold + Total Current Value – Total Value of Stocks bought on first trading day of the year + sumOfYearToDateDividend) / Total Value of Stocks bought on first trading day of the year * 100
YTD return refers to the percentage profit made by an investment since the first day of the calendar year.
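
The same calculation as a short Java sketch, with hypothetical class, method and parameter names, and assuming the whole numerator is divided by the value of stocks bought on the first trading day of the year:

```java
final class YtdReturnSketch {
    /** Year-to-date return in percent, including dividends received so far this year. */
    static double ytdReturnPercent(double totalSold, double totalCurrent,
                                   double boughtOnFirstTradingDay, double ytdDividends) {
        double profit = totalSold + totalCurrent - boughtOnFirstTradingDay + ytdDividends;
        return profit / boughtOnFirstTradingDay * 100;
    }
}
```

For example, with 200 sold, 900 in current value, 1000 bought on the first trading day and 50 in dividends, the profit is 150 and the YTD return is 15%.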

More formulas and explanations can be found in the following document: Charlies Angels Formula List

Geometric Average

Two forms of average can be considered for these calculations: the Arithmetic Average and the Geometric Average. After extensive research, and with the goal of accuracy so that users can make more informed choices, the team has decided to adopt the Geometric Average for all of its calculations.

Explanation of Geometric Average is as follows:
Suppose you have invested your savings in the stock market for five years. If your returns each year were 90%, 10%, 20%, 30% and -90%, what would your average return be during this period? Well, taking the simple arithmetic average, you would get an answer of 12%. Not too shabby, you might think.

However, when it comes to annual investment returns, the numbers are not independent of each other. If you lose a ton of money one year, you have that much less capital to generate returns during the following years, and vice versa. Because of this reality, we need to calculate the geometric average of your investment returns in order to get an accurate measurement of what your actual average annual return over the five-year period is.

To do this, we simply add one to each number (to avoid any problems with negative percentages). Then, multiply all the numbers together, and raise their product to the power of one divided by the count of the numbers in the series. And you're finished - just don't forget to subtract one from the result!

That's quite a mouthful, but on paper it's actually not that complex. Returning to our example, let's calculate the geometric average: Our returns were 90%, 10%, 20%, 30% and -90%, so we plug them into the formula as [(1.9 x 1.1 x 1.2 x 1.3 x 0.1) ^ (1/5)] - 1. This equals a geometric average annual return of -20.08%. That's a heck of a lot worse than the 12% arithmetic average we calculated earlier, and unfortunately it's also the number that represents reality in this case.
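
The example can be checked with a short Java sketch. The class and method names are hypothetical, and returns are given as fractions (0.90 for 90%).

```java
final class GeometricAverageSketch {
    /** Simple arithmetic mean of the yearly returns. */
    static double arithmeticAverage(double[] returns) {
        double sum = 0;
        for (double r : returns) {
            sum += r;
        }
        return sum / returns.length;
    }

    /** Geometric mean: multiply the growth factors (1 + r), take the n-th root, subtract one. */
    static double geometricAverage(double[] returns) {
        double product = 1.0;
        for (double r : returns) {
            product *= 1 + r;
        }
        return Math.pow(product, 1.0 / returns.length) - 1;
    }

    public static void main(String[] args) {
        double[] returns = {0.90, 0.10, 0.20, 0.30, -0.90};
        System.out.printf("arithmetic: %.2f%%%n", arithmeticAverage(returns) * 100); // 12.00%
        System.out.printf("geometric: %.2f%%%n", geometricAverage(returns) * 100);   // -20.08%
    }
}
```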