2.5 Financial Data

Financial data are usually time series, indexed by date and often by time of day as well. Dates are usually converted from character data using the function as.Date(), while date-times are converted using the function as.POSIXct().

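If no time of day is supplied, as.POSIXct() sets it to midnight (00:00:00) in the local time zone. R omits a midnight time when printing, so only the date and the time zone are shown: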

as.POSIXct("03-12-2014",format="%d-%m-%Y")
## [1] "2014-12-03 +08"
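As a minimal sketch of the two conversions side by side, the following parses the same date with as.Date() and a date-time with as.POSIXct(). The time 14:30:00 and tz = "UTC" are illustrative assumptions, chosen so that the result does not depend on the local time zone.

```r
# Date only: as.Date() parses the string using the given format.
d <- as.Date("03-12-2014", format = "%d-%m-%Y")

# Date and time: tz = "UTC" is an illustrative choice, fixed here so the
# result is the same on every machine.
dt <- as.POSIXct("03-12-2014 14:30:00",
                 format = "%d-%m-%Y %H:%M:%S", tz = "UTC")

format(d)                         # "2014-12-03"
format(dt, "%Y-%m-%d %H:%M:%S")   # "2014-12-03 14:30:00"
```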

The xts package is useful for handling financial time series.

install.packages("xts")
library(xts)

Create an xts object using the xts() function:

dates<-as.Date(c("2016-01-01","2016-01-02",
                 "2016-01-03"))
prices <- c(1.1,2.2,3.3)
x <- xts(prices, order.by=dates)

Each observation carries a timestamp:

x[1]
##            [,1]
## 2016-01-01  1.1

Because the observations are ordered by time, first() and last() return the earliest and latest observations:

first(x)
##            [,1]
## 2016-01-01  1.1
last(x)
##            [,1]
## 2016-01-03  3.3
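Both functions also accept a count as a second argument. The sketch below rebuilds the same x and takes the first two and the last two observations.

```r
library(xts)

dates  <- as.Date(c("2016-01-01", "2016-01-02", "2016-01-03"))
prices <- c(1.1, 2.2, 3.3)
x <- xts(prices, order.by = dates)

head2 <- first(x, 2)   # observations for 2016-01-01 and 2016-01-02
tail2 <- last(x, 2)    # observations for 2016-01-02 and 2016-01-03
```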

To extract the timestamp we use time(), and to extract the value we use as.numeric():

time(x[1])
## [1] "2016-01-01"
as.numeric(x[1])
## [1] 1.1
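The same idea applies to the whole series. As a small sketch, index() and coredata() (from the zoo package, which is attached together with xts) return all timestamps and all values at once.

```r
library(xts)

dates  <- as.Date(c("2016-01-01", "2016-01-02", "2016-01-03"))
prices <- c(1.1, 2.2, 3.3)
x <- xts(prices, order.by = dates)

stamps <- index(x)                  # all three dates, same as time(x)
values <- as.numeric(coredata(x))   # all three prices as a plain vector
```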

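The most common time series operation is lag(). For an xts object, lag(x, k) shifts the values forward in time by k periods, so the entry at each date is the observation from k periods earlier and the first k entries are NA.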

lag1_x <- lag(x,1)
lag1_x
##            [,1]
## 2016-01-01   NA
## 2016-01-02  1.1
## 2016-01-03  2.2
lag2_x <- lag(x,2)
lag2_x
##            [,1]
## 2016-01-01   NA
## 2016-01-02   NA
## 2016-01-03  1.1
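A negative k shifts in the opposite direction and gives a lead. The sketch below rebuilds x and shifts it by one period each way for comparison.

```r
library(xts)

dates <- as.Date(c("2016-01-01", "2016-01-02", "2016-01-03"))
x <- xts(c(1.1, 2.2, 3.3), order.by = dates)

lag1_x  <- lag(x, 1)    # values: NA, 1.1, 2.2 (previous observation)
lead1_x <- lag(x, -1)   # values: 2.2, 3.3, NA (next observation)
```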

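We need lag() because the first difference cannot be calculated as x[i]-x[i-1]: arithmetic on xts objects is aligned by timestamp, and x[i] and x[i-1] have different timestamps, so the result is empty. Instead we use x-lag(x), where both series share the same timestamps.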
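The following sketch illustrates the point on the same three-day series. It compares the element-wise subtraction with x-lag(x), and checks the latter against diff(), which gives the same first difference for an xts object.

```r
library(xts)

dates <- as.Date(c("2016-01-01", "2016-01-02", "2016-01-03"))
x <- xts(c(1.1, 2.2, 3.3), order.by = dates)

# The two operands have different timestamps, so no rows match.
naive <- x[2] - x[1]

# Both operands share the index of x; the first entry is NA.
d1 <- x - lag(x)

# diff() on an xts object computes the same first difference.
d2 <- diff(x)
```

When only the numbers are needed, as.numeric(x[2]) - as.numeric(x[1]) also works, because plain numeric values carry no timestamps to align.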

To transfer the timestamps to another vector, we use reclass(). The following code copies the timestamps from x to y:

y <- c(1,0,-1)
y <- reclass(y,x)
y
##            [,1]
## 2016-01-01    1
## 2016-01-02    0
## 2016-01-03   -1
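An equivalent construction, shown here only as a sketch for comparison, passes the index of x to xts() directly. The name y2 is introduced for illustration.

```r
library(xts)

dates <- as.Date(c("2016-01-01", "2016-01-02", "2016-01-03"))
x <- xts(c(1.1, 2.2, 3.3), order.by = dates)

# Build a new xts object that reuses the timestamps of x.
y2 <- xts(c(1, 0, -1), order.by = index(x))
```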

Two xts objects with the same timestamps can be combined using cbind():

z <- cbind(x,y)
z
##              x  y
## 2016-01-01 1.1  1
## 2016-01-02 2.2  0
## 2016-01-03 3.3 -1

Sometimes entries in the data are missing, which causes problems in time series calculations (e.g. the lag operator lag()). There are usually two ways to deal with the problem:

  1. na.omit() to remove those data points, and
  2. na.approx() to fill them in by linear interpolation.

Note that na.approx() comes from the zoo package, which is attached together with xts. It interpolates along the time index, so it is only meaningful for ordered data such as time series.
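As a minimal sketch of the two approaches, the series below has a missing value on the second day. The names x_na, dropped and filled are introduced for illustration.

```r
library(xts)

dates <- as.Date(c("2016-01-01", "2016-01-02", "2016-01-03"))
x_na  <- xts(c(1.1, NA, 3.3), order.by = dates)

# Option 1: drop the observation with the missing value (two rows remain).
dropped <- na.omit(x_na)

# Option 2: interpolate linearly along the time index. The missing day lies
# midway between its neighbours, so it is filled with their average.
filled <- na.approx(x_na)
```

Which option fits depends on the calculation: na.omit() changes the spacing of the observations, while na.approx() keeps every date but introduces values that were never observed.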