12.2 Time Series

\[ y_t = \beta_0 + x_{t1}\beta_1 + x_{t2}\beta_2 + ... + x_{t(k-1)}\beta_{k-1} + \epsilon_t \]

Examples

  • Static Model

    • \(y_t=\beta_0 + x_{t1}\beta_1 + x_{t2}\beta_2 + x_{t3}\beta_3 + \epsilon_t\)
  • Finite Distributed Lag model

    • \(y_t=\beta_0 + pe_t\delta_0 + pe_{t-1}\delta_1 +pe_{t-2}\delta_2 + \epsilon_t\)
    • Long Run Propensity (LRP) is \(LRP = \delta_0 + \delta_1 + \delta_2\); a short estimation sketch follows these examples
  • Dynamic Model

    • \(GDP_t = \beta_0 + \beta_1GDP_{t-1} + \epsilon_t\)
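To make the finite distributed lag example concrete, here is a minimal Python sketch (not from the original notes) using statsmodels, with a simulated series standing in for \(pe_t\); the variable names and coefficient values are illustrative assumptions. It estimates the three lag coefficients and computes the LRP as their sum.

```python
# Minimal sketch: estimate a finite distributed lag model on simulated data
# and compute the Long Run Propensity as the sum of the lag coefficients.
import numpy as np
import pandas as pd
import statsmodels.api as sm

rng = np.random.default_rng(0)
T = 200
df = pd.DataFrame({"pe": 5 + 0.1 * rng.normal(size=T).cumsum()})  # simulated regressor
df["pe_l1"] = df["pe"].shift(1)
df["pe_l2"] = df["pe"].shift(2)
# assumed true coefficients: delta_0 = 0.5, delta_1 = 0.3, delta_2 = 0.1
df["y"] = 1.0 + 0.5 * df["pe"] + 0.3 * df["pe_l1"] + 0.1 * df["pe_l2"] + rng.normal(size=T)
df = df.dropna()

X = sm.add_constant(df[["pe", "pe_l1", "pe_l2"]])
fit = sm.OLS(df["y"], X).fit()
lrp = fit.params[["pe", "pe_l1", "pe_l2"]].sum()   # LRP = delta_0 + delta_1 + delta_2
print(fit.params)
print("Estimated LRP:", lrp)
print(fit.t_test("pe + pe_l1 + pe_l2 = 0"))        # test H0: LRP = 0
```

The final t_test call tests the long-run effect directly, which is often more informative than the individual lag coefficients.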

Finite Sample Properties for Time Series:

  • A1-A3: OLS is unbiased
  • A1-A4: the usual standard errors are valid and the Gauss-Markov Theorem holds (OLS is BLUE)
  • A1-A6: finite-sample Wald tests (t-test and F-test) are valid

A3 might not hold in a time series setting:

  • Spurious Time Trend - solvable
  • Strict vs Contemporaneous Exogeneity - not solvable

In time series data, there are many processes:

  • Autoregressive model of order p: AR(p)
  • Moving average model of order q: MA(q)
  • Autoregressive model of order p and moving average model of order q: ARMA(p,q)
  • Autoregressive conditional heteroskedasticity model of order p: ARCH(p)
  • Generalized autoregressive conditional heteroskedasticity model of orders p and q: GARCH(p,q)

12.2.1 Deterministic Time Trend

Both the dependent and independent variables are trending over time

Spurious Time Series Regression

Suppose \(y_t\) takes the form

\[ y_t = \alpha_0 + t\alpha_1 + v_t \]

and \(x_t\) takes the form

\[ x_t = \lambda_0 + t\lambda_1 + u_t \]

  • \(\alpha_1 \neq 0\) and \(\lambda_1 \neq 0\)
  • \(v_t\) and \(u_t\) are independent
  • there is no relationship between \(y_t\) and \(x_t\)

If we estimate the regression,

\[ y_t = \beta_0 + x_t\beta_1 + \epsilon_t \]

then the true \(\beta_1=0\), since \(y_t\) and \(x_t\) are unrelated. Nevertheless, OLS gives:

  • Inconsistent: \(plim(\hat{\beta}_1)=\frac{\alpha_1}{\lambda_1}\)
  • Invalid Inference: \(|t| \overset{d}{\to} \infty\) for \(H_0: \beta_1=0\), so we will always reject the null as \(n \to \infty\)
  • Uninformative \(R^2\): \(plim(R^2) = 1\), so the regression appears to predict perfectly as \(n \to \infty\)

We can rewrite the equation as

\[ \begin{aligned} y_t &=\beta_0 + \beta_1x_t+\epsilon_t \\ \epsilon_t &= \alpha_1t + v_t \end{aligned} \]

where \(\beta_0 = \alpha_0\) and \(\beta_1=0\). Since \(x_t\) trends in \(t\) and \(\epsilon_t\) now contains the time trend \(\alpha_1 t\), \(\epsilon_t\) is correlated with \(x_t\) and we have the usual omitted variable bias.
Even when \(y_t\) and \(x_t\) are related (\(\beta_1 \neq 0\)), if both are trending over time we still get spurious results from the simple regression of \(y_t\) on \(x_t\); the simulation below illustrates the problem.
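A small simulation (not in the original notes, parameter values are illustrative): the two series below are generated independently but both trend in \(t\), and regressing one on the other produces a large t-statistic and an \(R^2\) close to one.

```python
# Simulated spurious time-trend regression: y_t and x_t are unrelated,
# but both trend in t, so OLS of y on x looks (falsely) highly significant.
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(1)
T = 500
t = np.arange(T)
y = 1.0 + 0.05 * t + rng.normal(size=T)   # alpha_0 + alpha_1 * t + v_t
x = 2.0 + 0.03 * t + rng.normal(size=T)   # lambda_0 + lambda_1 * t + u_t

fit = sm.OLS(y, sm.add_constant(x)).fit()
print("slope:", fit.params[1])            # tends toward alpha_1 / lambda_1 = 5/3
print("t-stat:", fit.tvalues[1])          # explodes as T grows
print("R^2:", fit.rsquared)               # approaches 1 as T grows
```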

Solutions to Spurious Trend

  1. Include time trend \(t\) as an additional control

    • consistent parameter estimates and valid inference
  2. Detrend both the dependent and independent variables and then regress the detrended outcome on the detrended independent variables (i.e., regress the residuals \(\hat{v}_t\) on the residuals \(\hat{u}_t\)); see the sketch after this list

    • Detrending is the same as partialing out in the Frisch-Waugh-Lovell Theorem

      • Could allow for non-linear time trends by including \(t\), \(t^2\), and \(\exp(t)\)
      • Allow for seasonality by including indicators for relevant “seasons” (quarters, months, weeks).
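Here is a minimal sketch of both fixes on simulated trending data (statsmodels assumed, names illustrative). By the Frisch-Waugh-Lovell theorem, the slope on \(x_t\) from including the trend equals the slope from the detrended (residual-on-residual) regression.

```python
# Two equivalent fixes for a spurious time trend:
# (1) add t as a control; (2) detrend y and x, then regress residuals on residuals.
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(1)
T = 500
t = np.arange(T, dtype=float)
y = 1.0 + 0.05 * t + rng.normal(size=T)
x = 2.0 + 0.03 * t + rng.normal(size=T)

# (1) include the time trend as an additional control
fit_trend = sm.OLS(y, sm.add_constant(np.column_stack([x, t]))).fit()

# (2) detrend both series on (1, t), then regress residual on residual (FWL)
Zt = sm.add_constant(t)
v_hat = sm.OLS(y, Zt).fit().resid     # detrended y
u_hat = sm.OLS(x, Zt).fit().resid     # detrended x
fit_fwl = sm.OLS(v_hat, u_hat).fit()  # residuals are mean zero, so no constant needed

print("slope with trend control:", fit_trend.params[1])
print("slope after detrending:  ", fit_fwl.params[0])   # identical by FWL
```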

A3 (strict exogeneity) does not hold under the following specifications:

12.2.2 Feedback Effect

\[ y_t = \beta_0 + x_t\beta_1 + \epsilon_t \]

A3

\[ E(\epsilon_t|\mathbf{X})= E(\epsilon_t| x_1,x_2, ...,x_t,x_{t+1},...,x_T) \]

will not equal 0, because \(y_t\) will likely influence \(x_{t+1},...,x_T\)

  • A3 is violated because we require the error to be uncorrelated with all time observations of the independent regressors (strict exogeneity)

12.2.3 Dynamic Specification

\[ y_t = \beta_0 + y_{t-1}\beta_1 + \epsilon_t \]

\[ E(\epsilon_t|\mathbf{X})= E(\epsilon_t| y_1,y_2, ...,y_t,y_{t+1},...,y_T) \]

will not equal 0, because \(y_t\) and \(\epsilon_t\) are inherently correlated

  • A3 is violated because we require the error to be uncorrelated with all time observations of the independent regressors (strict exogeneity)
  • Dynamic Specification is not allowed under A3

12.2.4 Dynamically Complete

\[ y_t = \beta_0 + x_t\delta_0 + x_{t-1}\delta_1 + \epsilon_t \]

\[ E(\epsilon_t|\mathbf{X})= E(\epsilon_t| x_1,x_2, ...,x_t,x_{t+1},...,x_T) \]

will not equal 0 because, if we have not included enough lags, the omitted lag \(x_{t-2}\) is absorbed into \(\epsilon_t\), so \(x_{t-2}\) and \(\epsilon_t\) are correlated

  • A3 is violated because we require the error to be uncorrelated with all time observations of the independent regressors (strict exogeneity)
  • Can be corrected by including more lags (but when do we stop?)

Without A3, we can rely on a weaker assumption. In the time series setting, A3a becomes

\[ A3a: E(\mathbf{x}_t'\epsilon_t)= 0 \]

i.e., only the regressors in the current period need to be uncorrelated with the error in the current period (contemporaneous exogeneity)

  • \(\epsilon_t\) can be correlated with \(...,x_{t-2},x_{t-1},x_{t+1}, x_{t+2},...\)
  • can have a dynamic specification \(y_t = \beta_0 + y_{t-1}\beta_1 + \epsilon_t\)

Deriving Large Sample Properties for Time Series

Under A1, A2, A3a, and A5a, the OLS estimator is consistent and asymptotically normal.

12.2.5 Highly Persistent Data

If \(y_t, \mathbf{x}_t\) are not weakly dependent stationary processes

  • \(y_t\) and \(y_{t-h}\) are not approximately independent even for large \(h\)

  • A5a does not hold and OLS is not consistent and does not have a limiting distribution.

  • Examples:

    • Random walk: \(y_t = y_{t-1} + u_t\)
    • Random walk with a drift: \(y_t = \alpha + y_{t-1} + u_t\)

Solution: The first difference is a stationary process

\[ y_t - y_{t-1} = u_t \]

  • If \(u_t\) is a weakly dependent process (also called integrated of order 0), then \(y_t\) is said to be a difference-stationary process (integrated of order 1)
  • For regression, if \(\{y_t, \mathbf{x}_t \}\) are random walks (integrated of order 1), we can consistently estimate the first-difference equation (sketched below)

\[ \begin{aligned} y_t - y_{t-1} &= (\mathbf{x}_t - \mathbf{x}_{t-1})\beta + (\epsilon_t - \epsilon_{t-1}) \\ \Delta y_t &= \Delta \mathbf{x}_t\beta + \Delta \epsilon_t \end{aligned} \]
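A minimal sketch of first-difference estimation (simulated I(1) data, statsmodels assumed): the levels regression here is spurious because the error is itself a random walk, but OLS on first differences recovers the slope.

```python
# First-difference estimation when y_t and x_t are integrated of order 1.
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(2)
T = 500
x = np.cumsum(rng.normal(size=T))      # x_t: random walk, I(1)
eps = np.cumsum(rng.normal(size=T))    # eps_t: random walk error, I(1)
y = 2.0 + 0.8 * x + eps                # assumed true beta = 0.8; y_t is also I(1)

dy, dx = np.diff(y), np.diff(x)        # first differences are weakly dependent
fit_fd = sm.OLS(dy, sm.add_constant(dx)).fit()
print("beta_hat from differenced regression:", fit_fd.params[1])   # close to 0.8
```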

Unit Root Test

\[ y_t = \alpha + \rho y_{t-1} + u_t \]

tests if \(\rho=1\) (integrated of order 1)

  • Under the null \(H_0: \rho = 1\), OLS is not consistent or asymptotically normal.
  • Under the alternative \(H_a: \rho < 1\), OLS is consistent and asymptotically normal.
  • the usual t-test is not valid; we will need to use the transformed equation below to produce a valid test.

Dickey-Fuller Test \[ \Delta y_t= \alpha + \theta y_{t-1} + v_t \] where \(\theta = \rho -1\)

  • \(H_0: \theta = 0\) and \(H_a: \theta < 0\)
  • Under the null, \(\Delta y_t\) is weakly dependent but \(y_{t-1}\) is not.
  • Dickey and Fuller derived the non-normal asymptotic distribution. If you reject the null then \(y_t\) is not a random walk.

Concerns with the standard Dickey-Fuller Test
  1. Only considers a fairly simplistic dynamic relationship

\[ \Delta y_t = \alpha + \theta y_{t-1} + \gamma_1 \Delta y_{t-1} + ... + \gamma_p \Delta y_{t-p} + v_t \]

  • with one additional lag, under the null \(\Delta y_t\) is an AR(1) process and under the alternative \(y_t\) is an AR(2) process.
  • Solution: include lags of \(\Delta y_t\) as controls.
  2. Does not allow for a time trend \[ \Delta y_t = \alpha + \theta y_{t-1} + \delta t + v_t \]
  • allows \(y_t\) to have a quadratic relationship with \(t\)
  • Solution: include time trend (changes the critical values).

Augmented Dickey-Fuller Test \[ \Delta y_t = \alpha + \theta y_{t-1} + \delta t + \gamma_1 \Delta y_{t-1} + ... + \gamma_p \Delta y_{t-p} + v_t \] where \(\theta = \rho - 1\)

  • \(H_0: \theta = 0\) and \(H_a: \theta < 0\)
  • Under the null, \(\Delta y_t\) is weakly dependent but \(y_{t-1}\) is not
  • Critical values are different with the time trend; if you reject the null, then \(y_t\) is not a random walk (see the sketch below).
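As an illustration (not part of the original notes), statsmodels provides the augmented Dickey-Fuller test in statsmodels.tsa.stattools.adfuller; regression="ct" includes the constant and time trend, and autolag selects the number of lagged differences. The simulated series is an assumption for the example.

```python
# Augmented Dickey-Fuller test on a simulated random walk with drift.
import numpy as np
from statsmodels.tsa.stattools import adfuller

rng = np.random.default_rng(3)
y = np.cumsum(0.1 + rng.normal(size=500))   # y_t = 0.1 + y_{t-1} + u_t (unit root)

stat, pvalue, usedlag, nobs, crit, icbest = adfuller(y, regression="ct", autolag="AIC")
print(f"ADF statistic = {stat:.2f}, p-value = {pvalue:.3f}")
print("critical values:", crit)
# failing to reject H0 (theta = 0) is consistent with y_t having a unit root
```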
12.2.5.0.1 Newey-West Standard Errors

If A4 does not hold, we can use Newey-West standard errors (HAC: heteroskedasticity and autocorrelation consistent); a code sketch follows at the end of this subsection.

\[ \hat{B} = T^{-1} \sum_{t=1}^{T} e_t^2 \mathbf{x'_tx_t} + \sum_{h=1}^{g}(1-\frac{h}{g+1})T^{-1}\sum_{t=h+1}^{T} e_t e_{t-h}(\mathbf{x_t'x_{t-h}+ x_{t-h}'x_t}) \]

  • estimates the covariances between observations up to a distance of g periods apart

  • downweights longer lags to ensure \(\hat{B}\) is positive semi-definite (PSD)

  • How to choose g:

    • For yearly data: \(g = 1\) or 2 is likely to account for most of the correlation
    • For quarterly or monthly data: g should be larger (\(g = 4\) or 8 for quarterly and \(g = 12\) or 14 for monthly)
    • can also take integer part of \(4(T/100)^{2/9}\) or integer part of \(T^{1/4}\)
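A minimal statsmodels sketch (simulated data; the bandwidth rule is the integer part of \(4(T/100)^{2/9}\) mentioned above): fitting OLS with cov_type="HAC" and maxlags=g produces Newey-West standard errors.

```python
# OLS with Newey-West (HAC) standard errors.
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(4)
T = 200
x = rng.normal(size=T)
e = np.zeros(T)
for s in range(1, T):                          # AR(1) errors induce serial correlation
    e[s] = 0.5 * e[s - 1] + rng.normal()
y = 1.0 + 2.0 * x + e

g = int(4 * (T / 100) ** (2 / 9))              # rule-of-thumb bandwidth
fit_hac = sm.OLS(y, sm.add_constant(x)).fit(cov_type="HAC", cov_kwds={"maxlags": g})
print("bandwidth g:", g)
print(fit_hac.bse)                             # HAC (Newey-West) standard errors
```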

Testing for Serial Correlation

  1. Run OLS regression of \(y_t\) on \(\mathbf{x_t}\) and obtain residuals \(e_t\)

  2. Run OLS regression of \(e_t\) on \(\mathbf{x}_t, e_{t-1}\) and test whether coefficient on \(e_{t-1}\) is significant.

  3. Reject the null of no serial correlation if the coefficient is significant at the 5% level.

    • Test using heteroskedasticity-robust standard errors
    • can include \(e_{t-2},e_{t-3},...\) in step 2 to test for higher-order serial correlation (the t-test then becomes an F-test of joint significance)
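A minimal sketch of this residual-based test on simulated data (statsmodels assumed); the built-in Breusch-Godfrey test in statsmodels.stats.diagnostic automates the higher-order (joint) version.

```python
# Residual-based test for AR(1) serial correlation, with robust standard errors.
import numpy as np
import statsmodels.api as sm
from statsmodels.stats.diagnostic import acorr_breusch_godfrey

rng = np.random.default_rng(5)
T = 200
x = rng.normal(size=T)
e = np.zeros(T)
for s in range(1, T):
    e[s] = 0.5 * e[s - 1] + rng.normal()       # serially correlated errors
y = 1.0 + 2.0 * x + e

# Step 1: OLS of y_t on x_t, keep the residuals e_t
ols_fit = sm.OLS(y, sm.add_constant(x)).fit()
resid = ols_fit.resid

# Step 2: regress e_t on x_t and e_{t-1}; robust t-test on the lagged residual
aux_X = sm.add_constant(np.column_stack([x[1:], resid[:-1]]))
aux_fit = sm.OLS(resid[1:], aux_X).fit(cov_type="HC1")
print("t-stat on e_{t-1}:", aux_fit.tvalues[2])   # large |t| -> reject no serial correlation

# Built-in alternative: Breusch-Godfrey test for higher-order serial correlation
lm_stat, lm_pval, f_stat, f_pval = acorr_breusch_godfrey(ols_fit, nlags=2)
print("Breusch-Godfrey F-test p-value:", f_pval)
```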