Group 4 DesignBuilt

From Visual Analytics and Applications

Group 4 Project - A Tale of Bitcoin

R Technology

We will be employing R to build an application to perform the necessary analytics.

R library packages used:

  1. dplyr
  2. prophet
  3. shiny
  4. tidyverse
  5. lubridate
  6. ggplot2
  7. data.table
  8. tibbletime
  9. tidyquant
  10. stringr
  11. kableExtra
  12. highcharter

The packages above were adopted for three processes: data manipulation, data visualization and forecasting. For data manipulation, we used dplyr, tidyverse, lubridate, data.table, tidyquant and stringr. For data visualization, we used shiny, ggplot2, tibbletime, kableExtra and highcharter. For forecasting, we mostly used prophet.

R Package: Prophet

We used Prophet, a relatively recent R package, to conduct forecasting. Unlike classical time series methods such as ARIMA and exponential smoothing, Prophet fits a decomposable additive model. When analysing historical data, Prophet decomposes the series into three main model components: trend, seasonality and holidays.

y(t) = g(t) + s(t) + h(t) + ε_t

Here g(t) is the trend function, which models non-periodic changes in the value of the time series; s(t) represents periodic changes (e.g. weekly and yearly seasonality); and h(t) represents the effects of holidays, which occur on potentially irregular schedules over one or more days. The error term represents any changes that are not accommodated by the model. This specification is similar to a generalized additive model, which has a number of practical advantages: flexibility (it easily accommodates seasonality with multiple periods, lets the analyst make different assumptions about trends, and does not require regularly spaced measurements), fast fitting, and easily interpretable parameters. Prophet also handles high-frequency data well.

R Package: highcharter

highcharter is a package that can generate many types of plots with a visually pleasing design and easy-to-use interactivity. We used it to plot the historical bitcoin price charts, the candlestick charts and the forecasting charts. It has two advantages that appeal to users. First, it can turn seemingly sophisticated time series data into a chart in a few simple steps. Second, it provides a zoom bar right under the line chart (as in our historical price chart), which lets us examine the price movement in further detail.

Application Build and Design

The build was conceptualised around how best to communicate the two motivations mentioned earlier in this paper: price patterns and trends, and comparative returns.

We found the following visualisations useful for our analytics:

1. Time Series Chart

A time series chart with the ability to zoom in and out. The bar at the bottom allows the user to select the period of interest, and the top chart rescales to fit the screen. The data used is the bitcoin data set, held as a data frame. The chart requires only the date and the daily close price. The highcharter package is used to build this chart.

Figure: Time Series Chart
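
As a rough illustration of how such a chart can be put together, the sketch below builds a zoomable line chart from a date and a daily close price. The `bitcoin` data frame, its column names and the simulated prices are assumptions made for this example, not the project's data set; `type = "stock"` selects the Highstock chart type, which includes a navigator bar under the main chart by default.

```r
library(highcharter)
library(xts)

# Hypothetical stand-in for the bitcoin data set: one row per day with a
# date and a simulated close price.
set.seed(1)
bitcoin <- data.frame(
  date = seq(as.Date("2017-01-01"), by = "day", length.out = 365)
)
bitcoin$close <- 1000 * exp(cumsum(rnorm(365, mean = 0.005, sd = 0.04)))

# highcharter can take an xts object directly, so convert the data frame.
close_xts <- xts(bitcoin$close, order.by = bitcoin$date)

# Build the stock-type chart and add the close price as a line series.
chart <- highchart(type = "stock")
chart <- hc_add_series(chart, close_xts, name = "Close")

chart  # prints the interactive widget in an interactive session
```

In a Shiny application the same object would typically be returned from `renderHighchart()` and displayed with `highchartOutput()`.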


2. Candle Stick Chart

This chart is useful for seeing how volatile the price is in a given period. For bitcoin, we are able to drill down to per-minute transactions. To keep the analysis quick, each refresh of the chart is limited to 30 days of per-minute data. Users select the starting point of the data, and the chart refreshes.

The per-minute open, high, low and close prices are required to build this chart. The highcharter package is used to build this chart.

Figure: Candlestick Input
Figure: Candlestick Chart
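
The 30-day limit described above can be sketched in base R as a simple filter on the timestamp column. The `minute_bars` data frame, its column names, the simulated prices and the `window_30d()` helper are all hypothetical and only illustrate the idea; the application's actual filtering code is not shown in this page.

```r
# Hypothetical per-minute OHLC data covering 60 days (simulated values).
set.seed(1)
n <- 60 * 24 * 60
minute_bars <- data.frame(
  time = seq(as.POSIXct("2018-01-01 00:00:00", tz = "UTC"),
             by = "1 min", length.out = n)
)
minute_bars$close <- 10000 + cumsum(rnorm(n))
minute_bars$open  <- c(10000, head(minute_bars$close, -1))
minute_bars$high  <- pmax(minute_bars$open, minute_bars$close) + abs(rnorm(n))
minute_bars$low   <- pmin(minute_bars$open, minute_bars$close) - abs(rnorm(n))

# Keep only the 30 days of bars that follow the user-selected start.
window_30d <- function(bars, start) {
  start <- as.POSIXct(start, tz = "UTC")
  end   <- start + as.difftime(30, units = "days")
  bars[bars$time >= start & bars$time < end, ]
}

selected <- window_30d(minute_bars, "2018-01-15")
nrow(selected)  # 30 days x 24 hours x 60 minutes = 43200 rows
```

Only the filtered rows need to be passed to the charting step, which keeps each refresh to a fixed amount of data.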


3. Auto Correlation Chart

This chart helps to uncover any evidence of cyclical effects. We used the acf() function, which comes with base R. The chart is then plotted using the highcharter package.

There are two input variables: the number of lags (i.e. how many periods the series is shifted when it is compared with itself) and the date range. The date range matters because the underlying data is a time series, so cyclical effects may change depending on the window being observed. For that reason it is also exposed as a variable.

The acf() function is used to compute the autocorrelation at each lag. The maximum lag can be set explicitly in the call. Sample code is as below:

acf(corr_data, lag.max = input$slider_corr, plot = FALSE)

In the example above, a Shiny slider input (input$slider_corr) passes the chosen maximum lag to the acf() function.
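
To show what the call returns when `plot = FALSE`, here is a minimal self-contained sketch. The simulated `corr_data` series (with a deliberate 7-day cycle) and the fixed `lag_max` value, which stands in for `input$slider_corr`, are assumptions made for this example.

```r
# Simulated daily series with a 7-day cycle plus noise (not project data).
set.seed(42)
day <- 1:365
corr_data <- sin(2 * pi * day / 7) + rnorm(365, sd = 0.3)

lag_max <- 30  # stands in for input$slider_corr

# With plot = FALSE, acf() returns the values instead of drawing a chart.
acf_result <- acf(corr_data, lag.max = lag_max, plot = FALSE)

# Flatten the result into a data frame that a charting package can consume.
acf_df <- data.frame(
  lag = as.vector(acf_result$lag),
  acf = as.vector(acf_result$acf)
)

# Approximate 95% bounds, the same ones plot.acf() draws by default.
ci <- qnorm(0.975) / sqrt(length(corr_data))

head(acf_df)
```

The result has one row per lag from 0 to `lag_max`. Lags whose autocorrelation falls outside `±ci` are the ones that suggest a cyclical effect.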

The highcharter package is then used to build the chart below.

Figure: Auto Correlation Chart


4. Price and Standard Deviation Facet by Year

Facet panels for multi-year comparison. The first row is the price chart. The Y-axis scale is left to auto-adjust so that patterns are more apparent.

The second row shows the standard deviation, grouped by year. As with the price chart, the Y-axis is left to auto-adjust.

We used ggplot2 to build the chart below.

Figure: Price and Standard Deviation Facet by Year
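
A minimal ggplot2 sketch of this layout follows. The page does not state how the standard deviation is computed, so this example assumes a monthly standard deviation of the daily close; the `prices` data frame and its simulated values are also assumptions. Setting `scales = "free"` in `facet_wrap()` is what lets each yearly panel pick its own axis range.

```r
library(ggplot2)

# Simulated daily close prices over three years (not project data).
set.seed(7)
prices <- data.frame(
  date = seq(as.Date("2016-01-01"), as.Date("2018-12-31"), by = "day")
)
prices$close <- 400 * exp(cumsum(rnorm(nrow(prices), mean = 0.002, sd = 0.04)))
prices$year  <- format(prices$date, "%Y")
prices$month <- format(prices$date, "%Y-%m")

# Row 1: price, one panel per year, each with its own axis range.
price_plot <- ggplot(prices, aes(x = date, y = close)) +
  geom_line() +
  facet_wrap(~ year, nrow = 1, scales = "free") +
  labs(x = NULL, y = "Close price")

# Row 2 (assumed definition): standard deviation of the close per month.
monthly_sd <- aggregate(close ~ month, data = prices, FUN = sd)
monthly_sd$date <- as.Date(paste0(monthly_sd$month, "-01"))
monthly_sd$year <- format(monthly_sd$date, "%Y")

sd_plot <- ggplot(monthly_sd, aes(x = date, y = close)) +
  geom_line() +
  facet_wrap(~ year, nrow = 1, scales = "free") +
  labs(x = NULL, y = "Std. dev. of close")

price_plot
sd_plot
```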


5. Comparatives

Simple time series charts for comparison against other commodities. Comparative values against a 5-year and a 2-year base year were computed and stored. A scale factor is added so that users can still see the trend of the comparative asset; it is needed because bitcoin's return is so large. The first chart shows an index value relative to the chosen base year. The second chart shows the index over standard deviation for either the 5-year or the 2-year period.

Figure: Index of Value by Year
Figure: Index of Standard Deviation
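
The base-year index and scale factor can be sketched as follows. All numbers are made up for illustration, and the `rebase_index()` helper, the column names and the choice of dividing by the scale factor are assumptions rather than the application's actual code.

```r
# Made-up yearly prices for two assets (illustrative values only).
comparatives <- data.frame(
  date    = as.Date(c("2014-01-01", "2015-01-01", "2016-01-01",
                      "2017-01-01", "2018-01-01")),
  bitcoin = c(800, 300, 430, 1000, 14000),
  gold    = c(1200, 1180, 1060, 1150, 1300)
)

# Express a price series relative to its value on the base date (= 100).
rebase_index <- function(values, dates, base_date, base_value = 100) {
  base_price <- values[dates == as.Date(base_date)]
  stopifnot(length(base_price) == 1)
  base_value * values / base_price
}

comparatives$bitcoin_index <- rebase_index(
  comparatives$bitcoin, comparatives$date, "2014-01-01"
)
comparatives$gold_index <- rebase_index(
  comparatives$gold, comparatives$date, "2014-01-01"
)

# Shrink the bitcoin index so the other series stays visible on one axis.
scale_factor <- 10
comparatives$bitcoin_index_scaled <- comparatives$bitcoin_index / scale_factor

comparatives
```

Without the scale factor, the bitcoin index in this toy data reaches 1750 while the gold index stays close to 100, so the gold line would appear almost flat on a shared axis.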


6. Forecasting

Another chart of interest shows the forecast values of bitcoin. We used the R package prophet to produce the forecast data points, and the chart is built with the highcharter package.

Figure: Time Series Forecasting
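
For reference, a minimal sketch of the usual prophet workflow is shown below. The `history` data frame is simulated and the 30-day horizon is an arbitrary choice for this example; prophet expects the input columns to be named `ds` (date) and `y` (value).

```r
library(prophet)

# Simulated daily series with a trend and a weekly cycle (not project data).
set.seed(123)
n_days <- 365
history <- data.frame(
  ds = seq(as.Date("2017-01-01"), by = "day", length.out = n_days)
)
history$y <- 1000 + 5 * seq_len(n_days) +
  50 * sin(2 * pi * seq_len(n_days) / 7) +
  rnorm(n_days, sd = 20)

# Fit the additive model (trend + seasonality; no holidays supplied here).
model <- prophet(history)

# Extend the date range 30 days past the history, then predict.
future   <- make_future_dataframe(model, periods = 30)
forecast <- predict(model, future)

# yhat is the point forecast; yhat_lower / yhat_upper bound the interval.
tail(forecast[, c("ds", "yhat", "yhat_lower", "yhat_upper")])
```

The `forecast` data frame covers both the historical dates and the 30 future dates, so it can be handed to a charting package to draw the fitted line and the forecast band together.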