Group 4 Data Prep


Latest revision as of 14:51, 30 November 2017

Group 4 Project - A Tale of Bitcoin



Data source and preparation

The following data sets were used:

  1. Per minute bitcoin price data, extracted from Kaggle: https://www.kaggle.com/mczielinski/bitcoin-historical-data
  2. Nasdaq index: https://finance.yahoo.com/quote/%5EIXIC/history?ltr=1
  3. S&P500: https://finance.yahoo.com/quote/%5EGSPC/history?p=%5EGSPC
  4. Gold Price: https://finance.yahoo.com/quote/%5EXAU/history?p=%5EXAU

The data were downloaded from the links above. The bitcoin data requires a type conversion from UNIX timestamps to the more common YYYY-MM-DD format; the function as.POSIXct in base R performs this conversion. We chose to use data from 2012 onwards because 2011 contains only one month's worth of data, so excluding it will not substantially affect our analysis.

The data range from 1st January 2012 to 20th October 2017.

The following are screenshots of the original data as imported.

Summary of data-set used
Summary.jpg
Bitcoin Data
Bitcoindataframes.jpg
Gold Price Data
Golddata.jpg
Nasdaq Index
Nasdaq.jpg
S&P 500
Sp500data.jpg
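
The timestamp conversion and the 2012 cut-off described above can be sketched in base R roughly as follows. This is a minimal illustration, not the project's actual script: the data frame `btc`, the column names `Timestamp` and `Close`, the sample values, and the choice of UTC as time zone are all assumptions made for the example.

```r
# Hypothetical sample rows standing in for the per-minute bitcoin file.
# Column names and prices are assumptions for illustration only.
btc <- data.frame(
  Timestamp = c(1325317920, 1325376000, 1508457600),  # UNIX seconds since 1970-01-01 UTC
  Close     = c(4.39, 4.58, 5700.00)                   # made-up prices
)

# Convert UNIX seconds to a date-time, then to a YYYY-MM-DD date.
# The time zone is fixed to UTC (an assumption) so the result does not
# depend on the machine's locale.
btc$DateTime <- as.POSIXct(btc$Timestamp, origin = "1970-01-01", tz = "UTC")
btc$Date     <- as.Date(btc$DateTime, tz = "UTC")

# Keep only observations from 1 January 2012 onwards.
btc_2012 <- btc[btc$Date >= as.Date("2012-01-01"), ]

print(btc_2012[, c("Date", "Close")])
```

Setting `tz` explicitly is a design choice in this sketch: without it, as.POSIXct displays times in the local time zone, which can shift rows near midnight onto a different calendar date.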