Chapter 13 VEC and VAR Models
rm(list=ls()) #Removes all items in Environment!
library(tseries) # for `adf.test()`
library(dynlm) #for function `dynlm()`
library(vars) # for function `VAR()`
library(nlWaldTest) # for the `nlWaldtest()` function
library(lmtest) #for `coeftest()` and `bptest()`.
library(broom) #for `glance(`) and `tidy()`
library(PoEdata) #for PoE4 datasets
library(car) #for `hccm()` robust standard errors
library(sandwich)
library(knitr) #for `kable()`
library(forecast)
New package: vars
(Pfaff 2013).
When there is no good reason to assume a one-way causal relationship between two time series variables we may think of their relationship as one of mutual interaction. The concept of “vector,” as in vector error correction refers to a number of series in such a model.
13.1 VAR and VEC Models
Equations 1 and 1 show a generic vector autoregression model of order 1, VAR(1), which can be estimated if the series are both I(0). If they are I(1), the same equations need to be estimated in first differences.
yt=β10+β11yt−1+β12xt−1+νyt xt=β20+β21yt−1+β22xt−1+νxtIf the two variables in Equations 1 and 2 and are cointegrated, their cointegration relationship should be taken into account in the model, since it is valuable information; such a model is called vector error correction. The cointegration relationship is, remember, as shown in Equation 3, where the error term has been proven to be stationary.
yt=β0+β1xt+et13.2 Estimating a VEC Model
The simplest method is a two-step procedure. First, estimate the cointegrating relationship given in Equation 3 and created the lagged resulting residual series ˆet−1=yt−1−b0−b1xt−1. Second, estimate Equations 4 and 5 by OLS.
Δyt=α10+α11+ˆet−1+νyt Δxt=α20+α21+ˆet−1+νxtThe following example uses the dataset gdp, which includes GDP series for Australia and USA for the period since 1970:1 to 2000:4. First we determine the order of integration of the two series.
data("gdp", package="PoEdata")
gdp <- ts(gdp, start=c(1970,1), end=c(2000,4), frequency=4)
ts.plot(gdp[,"usa"],gdp[,"aus"], type="l",
lty=c(1,2), col=c(1,2))
legend("topleft", border=NULL, legend=c("USA","AUS"),
lty=c(1,2), col=c(1,2))

Figure 13.1: Australian and USA GDP series from dataset ‘gdp’
Figure 13.1 represents the two series in levels, revealing a common trend and, therefore, suggesting that the series are nonstationary.
adf.test(gdp[,"usa"])
##
## Augmented Dickey-Fuller Test
##
## data: gdp[, "usa"]
## Dickey-Fuller = -0.9083, Lag order = 4, p-value = 0.949
## alternative hypothesis: stationary
adf.test(gdp[,"aus"])
##
## Augmented Dickey-Fuller Test
##
## data: gdp[, "aus"]
## Dickey-Fuller = -0.6124, Lag order = 4, p-value = 0.975
## alternative hypothesis: stationary
adf.test(diff(gdp[,"usa"]))
##
## Augmented Dickey-Fuller Test
##
## data: diff(gdp[, "usa"])
## Dickey-Fuller = -4.293, Lag order = 4, p-value = 0.01
## alternative hypothesis: stationary
adf.test(diff(gdp[,"aus"]))
##
## Augmented Dickey-Fuller Test
##
## data: diff(gdp[, "aus"])
## Dickey-Fuller = -4.417, Lag order = 4, p-value = 0.01
## alternative hypothesis: stationary
The stationarity tests indicate that both series are I(1), Let us now test them for cointegration, using Equations 6 and 7.
aust=β1usat+et ˆet=aust−β1usatcint1.dyn <- dynlm(aus~usa-1, data=gdp)
kable(tidy(cint1.dyn), digits=3,
caption="The results of the cointegration equation 'cint1.dyn'")
term | estimate | std.error | statistic | p.value |
---|---|---|---|---|
usa | 0.985 | 0.002 | 594.787 | 0 |
ehat <- resid(cint1.dyn)
cint2.dyn <- dynlm(d(ehat)~L(ehat)-1)
summary(cint2.dyn)
##
## Time series regression with "ts" data:
## Start = 1970(2), End = 2000(4)
##
## Call:
## dynlm(formula = d(ehat) ~ L(ehat) - 1)
##
## Residuals:
## Min 1Q Median 3Q Max
## -1.4849 -0.3370 -0.0038 0.4656 1.3507
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## L(ehat) -0.1279 0.0443 -2.89 0.0046 **
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 0.598 on 122 degrees of freedom
## Multiple R-squared: 0.064, Adjusted R-squared: 0.0564
## F-statistic: 8.35 on 1 and 122 DF, p-value: 0.00457
Our test rejects the null of no cointegration, meaning that the series are cointegrated. With cointegrated series we can construct a VEC model to better understand the causal relationship between the two variables.
vecaus<- dynlm(d(aus)~L(ehat), data=gdp)
vecusa <- dynlm(d(usa)~L(ehat), data=gdp)
tidy(vecaus)
term | estimate | std.error | statistic | p.value |
---|---|---|---|---|
(Intercept) | 0.491706 | 0.057909 | 8.49094 | 0.000000 |
L(ehat) | -0.098703 | 0.047516 | -2.07727 | 0.039893 |
tidy(vecusa)
term | estimate | std.error | statistic | p.value |
---|---|---|---|---|
(Intercept) | 0.509884 | 0.046677 | 10.923715 | 0.000000 |
L(ehat) | 0.030250 | 0.038299 | 0.789837 | 0.431168 |
The coefficient on the error correction term (ˆet−1) is significant for Australia, suggesting that changes in the US economy do affect Australian economy; the error correction coefficient in the US equation is not statistically significant, suggesting that changes in Australia do not influence American economy. To interpret the sign of the error correction coefficient, one should remember that ˆet−1 measures the deviation of Australian economy from its cointegrating level of 0.985 of the US economy (see Equations 6 and 7 and the value of β1 in Table 13.1).
13.3 Estimating a VAR Model
The VAR model can be used when the variables under study are I(1) but not cointegrated. The model is the one in Equations ???, but in differences, as specified in Equations 8 and 9.
Δyt=β11Δyt−1+β12Δxt−1+νΔyt Δxt=β21Δyt−1+β22Δxt−1+νΔxtLet us look at the income-consumption relationship based on the fred detaset, where consumption and income are already in logs, and the period is 1960:1 to 2009:4. Figure 13.2 shows that the two series both have a trend.
data("fred", package="PoEdata")
fred <- ts(fred, start=c(1960,1),end=c(2009,4),frequency=4)
ts.plot(fred[,"c"],fred[,"y"], type="l",
lty=c(1,2), col=c(1,2))
legend("topleft", border=NULL, legend=c("c","y"),
lty=c(1,2), col=c(1,2))

Figure 13.2: Logs of income (y) and consumption (c), dataset ‘fred’
Are the two series cointegrated?
Acf(fred[,"c"])
Acf(fred[,"y"])
adf.test(fred[,"c"])
##
## Augmented Dickey-Fuller Test
##
## data: fred[, "c"]
## Dickey-Fuller = -2.62, Lag order = 5, p-value = 0.316
## alternative hypothesis: stationary
adf.test(fred[,"y"])
##
## Augmented Dickey-Fuller Test
##
## data: fred[, "y"]
## Dickey-Fuller = -2.291, Lag order = 5, p-value = 0.454
## alternative hypothesis: stationary
adf.test(diff(fred[,"c"]))
##
## Augmented Dickey-Fuller Test
##
## data: diff(fred[, "c"])
## Dickey-Fuller = -4.713, Lag order = 5, p-value = 0.01
## alternative hypothesis: stationary
adf.test(diff(fred[,"y"]))
##
## Augmented Dickey-Fuller Test
##
## data: diff(fred[, "y"])
## Dickey-Fuller = -5.775, Lag order = 5, p-value = 0.01
## alternative hypothesis: stationary
cointcy <- dynlm(c~y, data=fred)
ehat <- resid(cointcy)
adf.test(ehat)
##
## Augmented Dickey-Fuller Test
##
## data: ehat
## Dickey-Fuller = -2.562, Lag order = 5, p-value = 0.341
## alternative hypothesis: stationary


Figure 13.3: Correlograms for the series c and y, dataset fred
Figure 13.3 shows a long serial correlation sequence; therefore, I will let R calculate the lag order in the ADF test. As the results of the above adf and cointegration tests show, the series are both I(1) but they fail the cointegration test (the series are not cointegrated.) (Plese rememebr that the adf.test
function uses a constant and trend in the test equation; therefore, the critical values are not the same as in the textbook. However, the results of the tests should be the same most of the time.)
library(vars)
Dc <- diff(fred[,"c"])
Dy <- diff(fred[,"y"])
varmat <- as.matrix(cbind(Dc,Dy))
varfit <- VAR(varmat) # `VAR()` from package `vars`
summary(varfit)
##
## VAR Estimation Results:
## =========================
## Endogenous variables: Dc, Dy
## Deterministic variables: const
## Sample size: 198
## Log Likelihood: 1400.444
## Roots of the characteristic polynomial:
## 0.344 0.343
## Call:
## VAR(y = varmat)
##
##
## Estimation results for equation Dc:
## ===================================
## Dc = Dc.l1 + Dy.l1 + const
##
## Estimate Std. Error t value Pr(>|t|)
## Dc.l1 0.215607 0.074749 2.88 0.0044 **
## Dy.l1 0.149380 0.057734 2.59 0.0104 *
## const 0.005278 0.000757 6.97 4.8e-11 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
##
## Residual standard error: 0.00658 on 195 degrees of freedom
## Multiple R-Squared: 0.12, Adjusted R-squared: 0.111
## F-statistic: 13.4 on 2 and 195 DF, p-value: 3.66e-06
##
##
## Estimation results for equation Dy:
## ===================================
## Dy = Dc.l1 + Dy.l1 + const
##
## Estimate Std. Error t value Pr(>|t|)
## Dc.l1 0.475428 0.097326 4.88 2.2e-06 ***
## Dy.l1 -0.217168 0.075173 -2.89 0.0043 **
## const 0.006037 0.000986 6.12 5.0e-09 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
##
## Residual standard error: 0.00856 on 195 degrees of freedom
## Multiple R-Squared: 0.112, Adjusted R-squared: 0.103
## F-statistic: 12.3 on 2 and 195 DF, p-value: 9.53e-06
##
##
##
## Covariance matrix of residuals:
## Dc Dy
## Dc 0.0000432 0.0000251
## Dy 0.0000251 0.0000733
##
## Correlation matrix of residuals:
## Dc Dy
## Dc 1.000 0.446
## Dy 0.446 1.000
Function VAR()
, which is part of the package vars
(Pfaff 2013), accepts the following main arguments: y=
a matrix containing the endogenous variables in the VAR model, p=
the desired lag order (default is 1), and exogen=
a matrix of exogenous variables. (VAR
is a more powerful instrument than I imply here; please type ?VAR()
for more information.) The results of a VAR model are more useful in analysing the time response to shocks in the variables, which is the topic of the next section.
13.4 Impulse Responses and Variance Decompositions
Impulse responses are best represented in graphs showing the responses of a VAR endogenous variable in time.
impresp <- irf(varfit)
plot(impresp)


Figure 13.4: Impulse response diagrams for the series c and y, dataset fred
The interpretation of Figures 13.4 is straightforward: an impulse (shock) to Dc at time zero has large effects the next period, but the effects become smaller and smaller as the time passes. The dotted lines show the 95 percent interval estimates of these effects. The VAR function prints the values corresponding to the impulse response graphs.
plot(fevd(varfit)) # `fevd()` is in package `vars`

Figure 13.5: Forecast variance decomposition for the series c and y, dataset fred
Forecast variance decomposition estimates the contribution of a shock in each variable to the response in both variables. Figure 13.5 shows that almost 100 percent of the variance in Dc is caused by Dc itself, while only about 80 percent in the variance of Dy is caused by Dy and the rest is caused by Dc. The R function fevd()
in package vars
allows forecast variance decomposition.
References
Pfaff, Bernhard. 2013. Vars: VAR Modelling. https://CRAN.R-project.org/package=vars.