42 Introduction
R has many powerful packages for solving real-world data visualization problems. In this tutorial, we focus on how to combine the data visualization libraries we learned in class (such as “ggplot2”, “tidyverse”, and “plotly”) with some new packages (such as “quantmod”, “TSstudio”, and “fpp2”) to visualize time series.
Before going into how to do a time series analysis with R, let’s think about a basic question: What is time series analysis? Why should anyone learn it?
A time series is a set of numerical measurements of the same entity taken at equally spaced intervals over time. For example, Amazon stock is the entity, the stock price of Amazon is the numerical measurement, and the set of Amazon's daily stock prices in August 2022 is a time series.
Time series analysis is typically used to find patterns in non-stationary data and to use those patterns to predict future values. By visualizing time series with R, we can inspect the data more intuitively, which helps us find those patterns more efficiently.
Time series analysis is widely used in many fields, such as finance, economics, and retail. This tutorial focuses on how to use R to visualize the time series of stock prices.
The tutorial contains 5 sections:
1. Environment Setup
2. Raw Data Processing
3. Time series visualization: ggplot and ggplot2
4. Time series visualization: TSstudio
5. Time series visualization: Plotly
Each section contains a description of the functions used in that section and a practical example. The practical example used throughout the tutorial is the time series visualization of Google and Amazon stock from 2021-07-01 to 2022-08-31.
42.1 Section 1: Environment Setup
First and foremost, we need to set up our R environment before doing anything else.
The packages we will use in this tutorial are: “quantmod”, “TSstudio”, “xts”, “ggplot2”, “gridExtra”, “fpp2”, “tidyverse”, and “plotly”.
If any of these packages has not yet been installed or updated, please run the following code in your R console: install.packages("package-name")
Coding Time:
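A minimal sketch of one way to check for and load the packages listed above. The variable names pkgs and missing_pkgs are illustrative assumptions, and the sketch only reports missing packages rather than installing them.

```r
# Packages used in this tutorial
pkgs <- c("quantmod", "TSstudio", "xts", "ggplot2",
          "gridExtra", "fpp2", "tidyverse", "plotly")

# Find the packages that are not installed yet
installed <- vapply(pkgs, requireNamespace, logical(1), quietly = TRUE)
missing_pkgs <- pkgs[!installed]

if (length(missing_pkgs) > 0) {
  # Report what still has to be installed with install.packages()
  message("Please install first: ", paste(missing_pkgs, collapse = ", "))
} else {
  # Attach every package so its functions can be called directly
  invisible(lapply(pkgs, library, character.only = TRUE))
}
```

If the message lists any packages, install them with install.packages() and run the block again.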
42.2 Section 2: Raw Data Processing
Before any visualization, it’s always important to extract and process the raw data.
This part has 2 subtopics:
2.1:Stock Data Extraction with quantmod
2.2:Basic Stock Data transformation: Log transformation
2.1:Stock Data Extraction with quantmod
Data analysts always face one tough question: where does the raw data come from? This tutorial shows how to easily extract reliable stock data with the “quantmod” package.
According to the authors of “quantmod”, the package is designed to ‘assist the quantitative trader in the development, testing, and deployment of statistically based trading models.’
The function getSymbols() from the quantmod package is the method we use in this tutorial to extract the time series of a specific stock.
The getSymbols() function provides an interface that imports data as an xts object. By default, it imports data from Yahoo! Finance. Pass the stock code (for example ‘GOOG’) as the first argument, and use ‘from=’ and ‘to=’ to specify the date interval.
Practical Example:
We will extract the Google stock price from 2021-07-01 to 2022-08-31 using the quantmod package in R. The quantmod package contains functions to extract, chart, and analyze quantitative trading data. In this case, we extract the adjusted price in the 6th column to create the visualizations.
We also extract the Amazon stock price from 2021-07-01 to 2022-08-31 for further visualization. All stock data are extracted from Yahoo! Finance.
Coding Time:
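library(quantmod)
# start and end dates of the download
sdate <- as.Date('2021-07-01')
edate <- as.Date('2022-08-31')
# download Google prices from Yahoo! Finance as an xts object
sdata <- getSymbols('GOOG', from = sdate, to = edate, auto.assign = FALSE)
# fill missing adjusted prices with the previous trading day's value
no.na <- which(is.na(sdata[,6])) # row numbers with NA
sdata[no.na,6] <- sdata[no.na-1,6]
head(sdata)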
## GOOG.Open GOOG.High GOOG.Low GOOG.Close GOOG.Volume GOOG.Adjusted
## 2021-07-01 124.8497 126.4625 124.8497 126.3685 17120000 126.3685
## 2021-07-02 126.8395 128.8480 126.7690 128.7190 21160000 128.7190
## 2021-07-06 129.4495 129.8845 128.4090 129.7710 21350000 129.7710
## 2021-07-07 130.3410 130.6399 129.7600 130.0775 16680000 130.0775
## 2021-07-08 128.2500 130.0325 128.0400 129.1770 19780000 129.1770
## 2021-07-09 128.9445 129.8495 128.9435 129.5745 15106000 129.5745
# extract only the adjusted price in the 6th column
s_price <- sdata[,6]
# extract Amazon stock price data for the same date range
sdata_amzn <- getSymbols('AMZN', from = sdate, to = edate, auto.assign = FALSE)
# fill missing adjusted prices with the previous trading day's value
no.na_amzn <- which(is.na(sdata_amzn[,6]))
sdata_amzn[no.na_amzn,6] <- sdata_amzn[no.na_amzn-1,6]
s_price_amzn <- sdata_amzn[,6]
head(sdata_amzn)
## AMZN.Open AMZN.High AMZN.Low AMZN.Close AMZN.Volume AMZN.Adjusted
## 2021-07-01 171.7305 172.8500 170.4710 171.6485 40742000 171.6485
## 2021-07-02 172.5820 175.5860 171.8460 175.5490 63388000 175.5490
## 2021-07-06 176.5055 184.2740 176.4500 183.7870 134896000 183.7870
## 2021-07-07 185.8690 186.7100 183.9455 184.8290 106562000 184.8290
## 2021-07-08 182.1780 187.9995 181.0560 186.5705 103612000 186.5705
## 2021-07-09 186.1260 187.4000 184.6700 185.9670 74964000 185.9670
2.2:Basic Stock Data transformation: Log transformation
A log transformation is typically applied before analyzing stock price data. After the transformation, equal percentage price changes are represented by the same vertical distance. Log returns are also more symmetric than simple returns, and a log scale works better than a linear price scale for observing the relative change in price instead of the absolute change. It helps to visualize how far the price has to move to reach a sell or buy target.
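The two properties described above can be checked with a small sketch. The prices are made-up illustrative numbers, not market data, and the variable names are assumptions chosen for this example.

```r
# Two price moves of +10%, starting from different price levels
low  <- c(100, 110)
high <- c(200, 220)

# The absolute changes differ (10 vs 20) ...
abs_change <- c(diff(low), diff(high))
# ... but the log differences are identical: log(1.1) in both cases
log_change <- c(diff(log(low)), diff(log(high)))
print(abs_change)
print(log_change)

# Symmetry: a move from 100 to 125 and back to 100 gives log returns of
# equal size and opposite sign, while the simple returns are +25% and -20%
round_trip <- c(100, 125, 100)
log_ret    <- diff(log(round_trip))
simple_ret <- diff(round_trip) / head(round_trip, -1)
print(log_ret)
print(simple_ret)
```

Because the log returns of the round trip sum to zero, log returns can be added across days, which is one reason they are convenient for time series work.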
Practical Example:
Use the log transformation to transform the Google and Amazon stock prices we obtained in 2.1.
Coding Time:
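A minimal sketch of the transformation, assuming the transformed series is the log return (the difference of consecutive log prices), which is consistent with the leading NA in the output that follows. To stay runnable without a network connection, it uses the first six adjusted prices printed in 2.1; the helper log_return() and the variable names are illustrative assumptions.

```r
# Adjusted prices copied from the first six rows printed in 2.1
goog_adj <- c(126.3685, 128.7190, 129.7710, 130.0775, 129.1770, 129.5745)
amzn_adj <- c(171.6485, 175.5490, 183.7870, 184.8290, 186.5705, 185.9670)

# Log return: difference of consecutive log prices. The first day has no
# previous price, so its value is NA.
log_return <- function(p) c(NA, diff(log(p)))

goog_lr <- log_return(goog_adj)
amzn_lr <- log_return(amzn_adj)
print(round(goog_lr, 6))
print(round(amzn_lr, 6))

# On the xts objects from 2.1, the same idea would be written as
# (assumes the xts package is loaded, whose diff() keeps the leading NA):
# s_price_lr      <- diff(log(s_price))
# s_price_amzn_lr <- diff(log(s_price_amzn))
```

The results can differ from the printed output in the last decimal places because the prices in 2.1 are shown rounded to four decimals.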
## GOOG.Adjusted
## 2021-07-01 NA
## 2021-07-02 0.018429445
## 2021-07-06 0.008139641
## 2021-07-07 0.002359091
## 2021-07-08 -0.006946847
## 2021-07-09 0.003072379
## AMZN.Adjusted
## 2021-07-01 NA
## 2021-07-02 0.022469408
## 2021-07-06 0.045859311
## 2021-07-07 0.005653552
## 2021-07-08 0.009378116
## 2021-07-09 -0.003239951