9.2 Cointegration

  • It is very common to analyze cointegration (the short-run and long-run relationship) between two or more time-series using the so-called Error Correction Model (ECM)

  • Assuming that the single RHS variable $x_t$ is exogenous, the ECM can be obtained as a reparametrized ARDL(1,1) model by the substitutions $\lambda=\beta_1+\beta_2$ and $\gamma=\beta_3-1$

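$$
\begin{aligned}
y_t &= \beta_0+\beta_1 x_t+\beta_2 x_{t-1}+\beta_3 y_{t-1}+u_t \\
\Delta y_t+y_{t-1} &= \beta_0+\beta_1(\Delta x_t+x_{t-1})+\beta_2 x_{t-1}+\beta_3 y_{t-1}+u_t \\
\Delta y_t &= \beta_0+\beta_1\Delta x_t+(\beta_1+\beta_2)x_{t-1}+(\beta_3-1)y_{t-1}+u_t \\
\Delta y_t &= \beta_0+\beta_1\Delta x_t+\lambda x_{t-1}+\gamma y_{t-1}+u_t
\end{aligned}
\tag{9.14}
$$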

       where $\Delta y_t$ and $\Delta x_t$ are the first differences of $y_t$ and $x_t$, while $y_{t-1}$ and $x_{t-1}$ are the lagged values of $y_t$ and $x_t$

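  • Equation (9.14) can be additionally rearranged in the following way

$$
\begin{aligned}
\Delta y_t &= \beta_0+\beta_1\Delta x_t+\gamma\left(y_{t-1}+\frac{\lambda}{\gamma}x_{t-1}\right)+u_t \\
\Delta y_t &= \beta_0+\underbrace{\beta_1}_{\text{short-run}}\Delta x_t-\underbrace{(1-\beta_3)}_{\text{correction}}\underbrace{\left(y_{t-1}-\underbrace{\frac{\beta_1+\beta_2}{1-\beta_3}}_{\text{long-run}}x_{t-1}\right)}_{\text{disequilibrium}}+u_t
\end{aligned}
\tag{9.15}
$$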

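  • Parameter $\gamma$ is expected to be negative, because $(1-\beta_3)\cdot 100\%$ of the short-run disequilibrium is corrected in every period. For example, $\beta_3=0.7$ gives $\gamma=-0.3$, so 30% of the disequilibrium is corrected each period. That is why parameter $\gamma$ is known as the speed of adjustment (or correction)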

  • The last equation in (9.14) provides the same information as the ARDL(1,1) model, i.e. the first equation in (9.14)

  • Short-run effect is $\beta_1$

  • Speed of adjustment is $\gamma=(\beta_3-1)=-(1-\beta_3)$

  • Long-run effect is $-\dfrac{\lambda}{\gamma}=\dfrac{\beta_1+\beta_2}{1-\beta_3}$

  • ECM can be estimated in one step as a single equation as in (9.14)

  • An alternative is the two-step estimation approach proposed by Engle and Granger
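
As a rough illustration of the one-step estimation and of the relations $\lambda=\beta_1+\beta_2$ and $\gamma=\beta_3-1$, the following sketch simulates an ARDL(1,1) process and fits both the first and the last equation in (9.14). The simulated data, the parameter values (b0=1, b1=0.5, b2=0.1, b3=0.7) and all object names are assumptions made only for this illustration, and base R lm() is used in place of dynlm() so that the block runs without additional packages.

```r
# Simulated example: all values below are assumptions, not estimates from the text
set.seed(1)
n  <- 200
x  <- cumsum(rnorm(n))   # exogenous regressor generated as a random walk
u  <- rnorm(n)           # error terms
b0 <- 1; b1 <- 0.5; b2 <- 0.1; b3 <- 0.7

# Generating y from the ARDL(1,1) model (first equation in (9.14))
y <- numeric(n)
for (t in 2:n) y[t] <- b0 + b1 * x[t] + b2 * x[t - 1] + b3 * y[t - 1] + u[t]

# Aligning current values, lagged values and first differences (observations 2..n)
yt <- y[-1]; yl <- y[-n]
xt <- x[-1]; xl <- x[-n]
dy <- yt - yl
dx <- xt - xl

ardl <- lm(yt ~ xt + xl + yl)   # ARDL(1,1) in the levels
ecm1 <- lm(dy ~ dx + xl + yl)   # one-step ECM (last equation in (9.14))

b <- coef(ardl)
e <- coef(ecm1)

short_run <- unname(e["dx"])    # short-run effect
lambda    <- unname(e["xl"])    # equals the sum of the two ARDL slopes of x
gamma     <- unname(e["yl"])    # speed of adjustment, equals ARDL lag slope minus 1
long_run  <- -lambda / gamma    # long-run effect

print(c(short_run = short_run, lambda = lambda, gamma = gamma, long_run = long_run))
```

Under the assumed parameter values the implied speed of adjustment is 0.7 - 1 = -0.3 and the implied long-run effect is (0.5 + 0.1)/(1 - 0.7) = 2. The estimates from a simulated sample should be close to these values, but they will not match them exactly.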

FIGURE 9.1: Nobel Prize winners in 2003 for their contribution in time-series econometrics

The Engle-Granger approach is a two-step procedure for detecting cointegration between two time-series

Step 1) Assuming that both time-series $y_t$ and $x_t$ are nonstationary, a static model between the two nonstationary time-series is estimated in the first step

$$
y_t=\beta_0+\beta_1 x_t+u_t,\qquad y_t\sim I(1),\quad x_t\sim I(1),\quad u_t\sim I(0) \tag{9.16}
$$

If the error terms $u_t$ of the static model are stationary, or at least integrated of a lower order than $y_t$ and $x_t$, we conclude that cointegration exists (the null hypothesis of the ADF test on the residuals is rejected).

Step 2) If cointegration exists, meaning that the two time-series share a long-term equilibrium relationship while short-term deviations from this equilibrium are corrected over time, the ECM is estimated in the second step

$$
\Delta y_t=\alpha_0+\alpha_1\Delta x_t+\gamma\hat{u}_{t-1}+e_t,\qquad \Delta y_t\sim I(0),\quad \Delta x_t\sim I(0),\quad \hat{u}_{t-1}\sim I(0) \tag{9.17}
$$

    where $\hat{u}_{t-1}$ are the lagged residuals from the static model, obtained as $\hat{u}_{t-1}=y_{t-1}-\beta_0-\beta_1 x_{t-1}$
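
The two estimation steps can be sketched on simulated data as follows. This is only a minimal illustration under stated assumptions: the data generating process (long-run effect 2, short-run effect 0.4, speed of adjustment -0.5), the sample size and all object names are hypothetical, and base R lm() is used in place of dynlm() so that the block runs without additional packages. The stationarity check of the residuals from Step 1 is not repeated here.

```r
# Simulated cointegrated pair: all values below are assumptions for illustration
set.seed(123)
n <- 400
x <- cumsum(rnorm(n))   # I(1) regressor generated as a random walk
e <- rnorm(n)           # error terms of the ECM

# y is generated from an error correction mechanism around y = 1 + 2*x
y <- numeric(n)
y[1] <- 1 + 2 * x[1]
for (t in 2:n) {
  diseq <- y[t - 1] - 1 - 2 * x[t - 1]                    # lagged disequilibrium
  y[t]  <- y[t - 1] + 0.4 * (x[t] - x[t - 1]) - 0.5 * diseq + e[t]
}

# Step 1: static model in the levels and its residuals
static_sim <- lm(y ~ x)
u_hat      <- resid(static_sim)

# Step 2: ECM with first differences and lagged residuals (observations 2..n)
dy <- diff(y)
dx <- diff(x)
ul <- u_hat[-n]         # residuals lagged by one period
ecm_sim <- lm(dy ~ dx + ul)

long_run  <- unname(coef(static_sim)["x"])   # long-run effect from the static model
short_run <- unname(coef(ecm_sim)["dx"])     # short-run effect from the ECM
gamma     <- unname(coef(ecm_sim)["ul"])     # speed of adjustment from the ECM

print(c(long_run = long_run, short_run = short_run, gamma = gamma))
```

The estimates should lie near the assumed values of the simulation, with a negative speed of adjustment, although the exact numbers depend on the simulated sample.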

  • If cointegration exists, the time-series $y_t$ and $x_t$ are related in both the short run and the long run, assuming strict exogeneity of $x_t$ (variable $x_t$ causes $y_t$ but not the other way around, which implies that the time-series $x_t$ and the error terms $u_t$ are independent). The static model then represents the equilibrium (cointegration) equation, while the ECM represents the short-run equation, which also includes the correction of short-run disequilibrium.

  • Therefore, parameter $\beta_1$ in the static model (9.16) represents the long-run effect, parameter $\alpha_1$ in the ECM represents the short-run effect, while parameter $\gamma$ represents the correction of disequilibrium (speed of adjustment)

  • However, if cointegration does not exist, the time-series $x_t$ and $y_t$ are related only in the short run. The static model is then spurious (there is no long-run relationship), and instead of the ECM the model in the first differences should be estimated

$$
\Delta y_t=\alpha_0+\alpha_1\Delta x_t+e_t \tag{9.18}
$$

  • Parameter $\alpha_1$ in the model (9.18) is the short-run effect

  • The short-run model (9.18) is a special case of the ECM (9.17) when $\gamma=0$

  • Cointegration exists only if the time-series do not drift apart from each other, as illustrated in Figure 9.2

FIGURE 9.2: Cointegrated and non-cointegrated time-series
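
The difference between the two cases can also be sketched numerically. In the block below one series is built to share the stochastic trend of x, while the other is an independent random walk. The residuals of each static model are then checked with a plain Dickey-Fuller regression without constant and without lagged differences, which is a simplification of the ADF test with type="none". All data, parameter values and names (including the helper df_stat()) are hypothetical and serve only as an illustration.

```r
# Simulated example: all values and names below are assumptions for illustration
set.seed(42)
n <- 200
x       <- cumsum(rnorm(n))            # random walk, I(1)
y_coint <- 2 + 0.8 * x + rnorm(n)      # shares the stochastic trend of x
y_indep <- cumsum(rnorm(n))            # independent random walk

# Dickey-Fuller t-statistic: regress the first difference on the lagged level,
# without a constant and without lagged differences
df_stat <- function(u) {
  du  <- diff(u)
  ul  <- u[-length(u)]
  fit <- lm(du ~ 0 + ul)
  summary(fit)$coefficients["ul", "t value"]
}

# Residuals from the two static models
u_coint <- resid(lm(y_coint ~ x))
u_indep <- resid(lm(y_indep ~ x))

t_coint <- df_stat(u_coint)   # strongly negative: residuals look stationary
t_indep <- df_stat(u_indep)   # much closer to zero: residuals keep drifting

print(c(cointegrated = t_coint, not_cointegrated = t_indep))
```

A strongly negative statistic for the first pair indicates stationary residuals, so the two series do not drift apart, while the residuals of the second pair behave like a nonstationary series.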

Exercise 42. Separate the GDP growth rate and the production volume from the data frame indicators as individual time-series objects. Display both time-series, growth and production, on a grid with 2 columns and 1 row. Using the ur.df() command from the urca package, perform the ADF test in the levels and in the first differences on growth and production to check the order of integration I(d). If both time-series are nonstationary and have the same integration order, e.g. I(1) or I(2), compute the ADF test in the levels of the residuals from the static model (for that purpose first extract the residuals from the static model using the command resid() and later compute the ADF test in the levels with type="none"). If cointegration exists, estimate the error correction model ecm using the dynlm() command from the dynlm package. Within the dynlm() command, the first differences are computed using the difference operator d() and the lagged values are computed using the lag operator L(). Present the results of the static model and the ecm in a single table using the modelsummary() command. Determine the short-run effect, the long-run effect and the speed of adjustment.
Solution

Copy the code lines below to the clipboard and paste them into an R Script file opened in RStudio. In this example the ADF test is performed several times to check the (non)stationarity of growth, production, and the residuals from the static model. Non-rejection of the ADF null hypothesis in the levels and its rejection in the first differences, for both time-series, indicates that growth and production are nonstationary in the levels but stationary in the first differences. Therefore, both time-series are integrated of the same order, I(1). However, the residuals from the static model are stationary in the levels (the ADF null hypothesis is rejected), which means they are integrated of order zero, I(0). The following table summarizes the results of the ADF tests.

Time-series    ADF test in the levels    ADF test in the first differences
growth         -1.711                    -4.768
production     -2.065                    -5.684
residuals      -2.919
Consequently, we can conclude that cointegration between growth and production exists. Therefore, we proceed with the ecm.
# Separating "growth" and "production" from the data frame "indicators" as individual time-series objects
growth=ts(indicators[,"growth"],frequency=4,start=c(2000,1)) 
production=ts(indicators[,"production"],frequency=4,start=c(2000,1)) 

# Plotting the two time-series side by side on a grid with 2 columns and 1 row
layout(matrix(c(1:2), nrow=1))
ts.plot(growth, main="GDP growth of China", xlab="",ylab="growth rate in %")
ts.plot(production, main="Production of China", xlab="", ylab="volume index")

library(urca) # loading "urca" package (required only in a new session)
# Performing ADF test in the levels and in the first differences for "growth"
# Time-series oscillates around a non-zero mean but does not exhibit any trending behavior (type="drift")
summary(ur.df(growth, type="drift", selectlags="AIC"))  # ADF test in the levels with drift
summary(ur.df(diff(growth), type="drift", selectlags="AIC"))  # ADF test in the first differences with drift

# Performing ADF test in the levels and in the first differences for "production"
summary(ur.df(production, type="drift", selectlags="AIC"))  # ADF test in the levels with drift
summary(ur.df(diff(production), type="drift", selectlags="AIC"))  # ADF test in the first differences with drift

library(dynlm) # loading "dynlm" package (required only in a new session)
# Estimating the "static" (long-run) model in the levels
static=dynlm(growth~production)

# Extracting the "residuals" from the "static" model as a single time-series object
residuals=ts(resid(static),frequency=4,start=c(2000,1))
# Performing ADF test in the levels for the "residuals" without drift and without trend (type="none")
summary(ur.df(residuals, type="none", selectlags="AIC")) 

# Estimating ECM proposed by Engle and Granger
ecm=dynlm(d(growth)~d(production)+L(residuals))

library(modelsummary) # loading "modelsummary" package (required only in a new session)
# Presenting the results of both models  ("static" and "ecm")
modelsummary(list("Static (long-run)"=static,"ECM"=ecm),stars=TRUE,fmt=4)

# Long-run effect is the second coefficient from the "static" model
coef(static)[2]

# Short-run effect is the second coefficient from the "ecm"
coef(ecm)[2]

# Speed of adjustment is the third coefficient from the "ecm"
coef(ecm)[3]

   

Exercise 43. Assume there is no cointegration between growth and production, implying that the two time-series are related only in the short run (but not in the long run). In that case the residuals from the static model would be nonstationary and integrated of order one, I(1), just like growth and production. In this scenario only the short-run model short should be estimated.
Solution

Copy the code lines below to the clipboard and paste them into an R Script file opened in RStudio. When cointegration does not exist, only the short-run model is required. The short-run model includes only the first differences of both time-series. In this example the short-run effect is 0.0498 and is not statistically significant.
# Estimating short-run model only named as "short"
short=dynlm(d(growth)~d(production))

# Presenting the results of the "short" model
modelsummary(list("Short-run model"=short),stars=TRUE,fmt=4)

# Short-run effect is the second coefficient from the "short" model
coef(short)[2]