Group10 Proposal

China Stock Data Visualization






Stock market data is seemingly endless and widely available on the web. Stock price movements depend on a complex mix of factors and are difficult to predict. Exploring patterns in stock market data with different data visualisation techniques can be of great help to investors and traders. This project aims to provide advanced visualisations of stock market data that reveal hidden patterns in market movement.


Create an interface that lets users directly explore the trends of different stocks and predict stock prices. For each investor, provide investment recommendations based on the investor's risk assessment and preferences.

Data Source

CSMAR, a comprehensive database of China stock returns, covering all companies listed on the Shanghai Stock Exchange and the Shenzhen Stock Exchange:

  • company profile
  • daily trading price/volume
  • minute trading price/volume


This project consists of three stages of analysis: descriptive analysis, predictive analysis and optimization analysis.

Exploratory Analysis

In the descriptive analysis, we will visualize individual stock time series, with the ability to compare them against other stocks and the market index, to study correlations and each stock's trend relative to its industry and the market.
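As a rough illustration of the comparison described above, the sketch below computes the correlation between a stock's daily returns and a market index's returns, and rebases both series to a common starting value so their relative trend can be plotted on one axis. It is written in plain Python rather than the R packages listed later, the function names are hypothetical, and the prices are made up for illustration (they are not CSMAR data).

```python
from math import sqrt

def simple_returns(prices):
    """Day-over-day simple returns: (p_t - p_{t-1}) / p_{t-1}."""
    return [(b - a) / a for a, b in zip(prices, prices[1:])]

def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    return cov / sqrt(vx * vy)

def rebase(prices, base=100.0):
    """Scale a price series so it starts at `base`, making series comparable."""
    return [base * p / prices[0] for p in prices]

# Made-up closing prices for illustration only (assumption, not real data).
stock = [10.0, 10.2, 10.1, 10.5, 10.4, 10.8]
index = [3000.0, 3030.0, 3015.0, 3090.0, 3075.0, 3150.0]

corr = pearson(simple_returns(stock), simple_returns(index))
# Ratio above 1 means the stock has outperformed the index since day 0.
relative = [s / i for s, i in zip(rebase(stock), rebase(index))]

print(f"return correlation: {corr:.3f}")
print("relative strength vs index:", [round(r, 3) for r in relative])
```

Correlating returns rather than raw prices is a deliberate choice here: two price series that both drift upward can look highly correlated even when their day-to-day movements are unrelated.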

Explanatory Analysis

In the predictive analysis, we would like to visualize (a) patterns in the time series data, examining the outliers; (b) the decomposition of the time series into trend, seasonal, cyclical and noise components; and (c) future stock prices, forecast using ARIMA methods, with reported accuracy and confidence intervals.
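To make the decomposition step concrete, here is a minimal sketch of a classical additive decomposition (series = trend + seasonal + remainder) using a centered moving average. This is only one possible technique and is not necessarily the method the project will use. The 5-day period (one trading week), the function names, and the synthetic data are all assumptions for illustration, and ARIMA forecasting is not covered here.

```python
def moving_average(series, window):
    """Centered moving average for an odd window; the ends are None."""
    half = window // 2
    out = [None] * len(series)
    for t in range(half, len(series) - half):
        out[t] = sum(series[t - half:t + half + 1]) / window
    return out

def decompose_additive(series, period):
    """Split a series into trend, seasonal and remainder (additive model).

    Assumes an odd `period` and a series spanning at least two full periods.
    """
    trend = moving_average(series, period)
    detrended = [None if tr is None else y - tr
                 for y, tr in zip(series, trend)]
    # Average the detrended values at each position within the cycle.
    seasonal_means = []
    for k in range(period):
        vals = [d for i, d in enumerate(detrended)
                if d is not None and i % period == k]
        seasonal_means.append(sum(vals) / len(vals))
    # Centre the seasonal effects so they sum to zero over one cycle.
    centre = sum(seasonal_means) / period
    seasonal = [seasonal_means[i % period] - centre
                for i in range(len(series))]
    remainder = [None if tr is None else y - tr - s
                 for y, tr, s in zip(series, trend, seasonal)]
    return trend, seasonal, remainder

# Synthetic series: linear trend plus a repeating 5-day pattern (made up).
pattern = [1.0, -1.0, 2.0, -2.0, 0.0]
prices = [10.0 + 0.5 * t + pattern[t % 5] for t in range(20)]

trend, seasonal, remainder = decompose_additive(prices, period=5)
print("trend:", trend[2:6])
print("seasonal cycle:", [round(s, 6) for s in seasonal[:5]])
```

Large values in the remainder are natural candidates for the outlier examination in (a), since they mark days that neither the trend nor the repeating pattern explains.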

Optimization Analysis

In the optimization analysis, we will visualize a suggested optimal investment portfolio that minimizes risk or maximizes return, based on the investor's risk tolerance.
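As a small worked example of the risk-minimizing end of this idea, the sketch below finds the minimum-variance mix of two assets using the closed-form weight w = (var_B - cov_AB) / (var_A + var_B - 2 cov_AB). It covers only two assets and ignores the investor's risk tolerance, so it is a simplification of the analysis described above. The function names and the return figures are made up for illustration.

```python
def mean(xs):
    return sum(xs) / len(xs)

def covariance(x, y):
    """Sample covariance (n - 1 denominator)."""
    mx, my = mean(x), mean(y)
    return sum((a - mx) * (b - my) for a, b in zip(x, y)) / (len(x) - 1)

def min_variance_weight(returns_a, returns_b):
    """Weight on asset A that minimizes the variance of w*A + (1 - w)*B.

    The result can fall outside [0, 1], which would imply short selling.
    """
    var_a = covariance(returns_a, returns_a)
    var_b = covariance(returns_b, returns_b)
    cov_ab = covariance(returns_a, returns_b)
    return (var_b - cov_ab) / (var_a + var_b - 2 * cov_ab)

def portfolio_variance(w, returns_a, returns_b):
    """Sample variance of the combined return series for weight w on A."""
    combined = [w * a + (1 - w) * b for a, b in zip(returns_a, returns_b)]
    return covariance(combined, combined)

# Made-up daily returns for two hypothetical stocks (assumption).
returns_a = [0.010, -0.020, 0.015, 0.005, -0.010]
returns_b = [0.002, 0.004, -0.003, 0.001, 0.003]

w_star = min_variance_weight(returns_a, returns_b)
print(f"weight on A: {w_star:.3f}, weight on B: {1 - w_star:.3f}")
print(f"portfolio variance: {portfolio_variance(w_star, returns_a, returns_b):.8f}")
```

A fuller treatment would trace out the whole efficient frontier across many stocks and pick the point matching the investor's risk tolerance, which is where a visualization becomes most useful.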

Application Libraries & Packages

Package Name | Description
ggplot2 | ggplot2 is a system for declaratively creating graphics, based on The Grammar of Graphics. You provide the data, tell ggplot2 how to map variables to aesthetics, what graphical primitives to use, and it takes care of the details.
tibbletime | tibbletime is a new package that enables the creation of time-aware tibbles. Its sole purpose is to make working with time series in the tidyverse much easier.
tidyquant | tidyquant integrates the best resources for collecting and analyzing financial data (zoo, xts, quantmod, TTR, and PerformanceAnalytics) with the tidy data infrastructure of the tidyverse, allowing for seamless interaction between each.
timetk | The timetk package enables a user to more easily work with time series objects in R. The package has tools for inspecting and manipulating the time-based index, expanding the time features for data mining and machine learning, and converting time-based objects to and from the many time series classes.
sweep | The sweep package extends the broom tools (tidy, glance, and augment) for performing forecasts and time series analysis in the "tidyverse". The package is geared towards "tidying" the forecast workflow used with Rob Hyndman's forecast package.


