14.5 Autocorrelation Tests

Autocorrelation (also known as serial correlation) occurs when the error terms ($\epsilon_t$) in a regression model are correlated across observations, violating the assumption of independence in the classical linear regression model. This issue is particularly common in time-series data, where observations are ordered over time.

Consequences of Autocorrelation:

  • OLS estimators remain unbiased but become inefficient, meaning they do not have the minimum variance among all linear unbiased estimators.
  • Standard errors are biased, leading to unreliable hypothesis tests (e.g., t-tests and F-tests).
  • Potential underestimation of standard errors, increasing the risk of Type I errors (false positives).

14.5.1 Durbin–Watson Test

The Durbin–Watson (DW) Test is the most widely used test for detecting first-order autocorrelation, where the current error term is correlated with the previous one:

$$\epsilon_t = \rho \epsilon_{t-1} + u_t$$

Where:

  • $\rho$ is the autocorrelation coefficient,

  • $u_t$ is a white noise error term.


Hypotheses

  • Null Hypothesis ($H_0$): No first-order autocorrelation ($\rho = 0$).
  • Alternative Hypothesis ($H_1$): First-order autocorrelation exists ($\rho \neq 0$).

Durbin–Watson Test Statistic

The DW statistic is calculated as:

$$DW = \frac{\sum_{t=2}^{n} (\hat{\epsilon}_t - \hat{\epsilon}_{t-1})^2}{\sum_{t=1}^{n} \hat{\epsilon}_t^2}$$

Where:

  • $\hat{\epsilon}_t$ are the residuals from the regression,

  • $n$ is the number of observations.


Decision Rule

  • The DW statistic ranges from 0 to 4:
    • DW ≈ 2: No autocorrelation.
    • DW < 2: Positive autocorrelation.
    • DW > 2: Negative autocorrelation.

For more precise interpretation:

  • Use Durbin–Watson critical values ($d_L$ and $d_U$) to form decision boundaries.

  • If the test statistic falls into an inconclusive range, consider alternative tests like the Breusch–Godfrey test.
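These cutoffs follow from a useful large-sample identity: $DW \approx 2(1 - \hat{\rho})$, where $\hat{\rho}$ is the sample first-order autocorrelation of the residuals, so $\hat{\rho} > 0$ pushes DW below 2 and $\hat{\rho} < 0$ pushes it above 2. A minimal sketch in base R (simulated series; variable names are illustrative):

```r
# DW ~ 2 * (1 - rho_hat): compute both sides on a simulated AR(1) series
set.seed(1)
e <- as.numeric(arima.sim(model = list(ar = 0.5), n = 200))

dw <- sum(diff(e)^2) / sum(e^2)                   # Durbin-Watson statistic
rho_hat <- sum(e[-1] * e[-length(e)]) / sum(e^2)  # lag-1 autocorrelation

dw                 # well below 2, signalling positive autocorrelation
2 * (1 - rho_hat)  # close to dw (they differ only by edge terms)
```

The two quantities differ only by the edge terms $(\hat{\epsilon}_1^2 + \hat{\epsilon}_n^2) / \sum \hat{\epsilon}_t^2$, which vanish as $n$ grows.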


Advantages and Limitations

  • Advantage: Simple to compute; specifically designed for detecting first-order autocorrelation.
  • Limitation: Inconclusive in some cases; invalid when lagged dependent variables are included in the model.

14.5.2 Breusch–Godfrey Test

The Breusch–Godfrey (BG) Test is a more general approach that can detect higher-order autocorrelation (e.g., second-order, third-order) and is valid even when lagged dependent variables are included in the model (Breusch 1978; Godfrey 1978).


Hypotheses

  • Null Hypothesis ($H_0$): No autocorrelation up to lag $p$.
  • Alternative Hypothesis ($H_1$): Autocorrelation exists at one or more lags up to $p$.

Procedure

  1. Estimate the original regression model and obtain residuals $\hat{\epsilon}_t$:

    $$y_t = \beta_0 + \beta_1 x_{1t} + \dots + \beta_k x_{kt} + \epsilon_t$$

  2. Run an auxiliary regression of the residuals on the original regressors plus $p$ lagged residuals:

    $$\hat{\epsilon}_t = \alpha_0 + \alpha_1 x_{1t} + \dots + \alpha_k x_{kt} + \rho_1 \hat{\epsilon}_{t-1} + \dots + \rho_p \hat{\epsilon}_{t-p} + u_t$$

  3. Calculate the test statistic:

    $$BG = n R^2_{\text{aux}}$$

    Where:

    • $n$ is the sample size,
    • $R^2_{\text{aux}}$ is the $R^2$ from the auxiliary regression.

Decision Rule:

  • Under $H_0$, the BG statistic follows a chi-squared distribution with $p$ degrees of freedom:

    $$BG \sim \chi^2_p$$

  • Reject $H_0$ if the statistic exceeds the critical chi-squared value, indicating the presence of autocorrelation.
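The three steps above can be reproduced by hand in base R. This is a sketch on simulated data (the variable names are illustrative); `lmtest::bgtest()` performs the same computation, including setting the pre-sample residuals to zero:

```r
# Breusch-Godfrey by hand (sketch; illustrative simulated data)
set.seed(123)
n <- 100
x <- rnorm(n)
e <- as.numeric(arima.sim(model = list(ar = 0.6), n = n))
y <- 1 + 0.5 * x + e

# Step 1: original regression and its residuals
res <- residuals(lm(y ~ x))

# Step 2: auxiliary regression on the regressors plus p = 2 lagged
# residuals (pre-sample residuals filled with zeros)
lag1 <- c(0, res[1:(n - 1)])
lag2 <- c(0, 0, res[1:(n - 2)])
aux <- lm(res ~ x + lag1 + lag2)

# Step 3: BG = n * R^2 of the auxiliary regression, compared to chi^2_p
bg <- n * summary(aux)$r.squared
pval <- pchisq(bg, df = 2, lower.tail = FALSE)
```

With strongly autocorrelated errors ($\rho = 0.6$), the lagged residuals explain a large share of the auxiliary-regression variance, so `bg` far exceeds the $\chi^2_2$ critical value.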


Advantages and Limitations

  • Advantage: Can detect higher-order autocorrelation; valid with lagged dependent variables.
  • Limitation: More computationally intensive than the Durbin–Watson test.

14.5.3 Ljung–Box Test (or Box–Pierce Test)

The Ljung–Box Test is a portmanteau test designed to detect autocorrelation at multiple lags simultaneously; it refines the earlier Box–Pierce statistic with a small-sample correction (Box and Pierce 1970; Ljung and Box 1978). It is commonly used in time-series analysis to check residual autocorrelation after model estimation (e.g., in ARIMA models).


Hypotheses

  • Null Hypothesis ($H_0$): No autocorrelation up to lag $h$.
  • Alternative Hypothesis ($H_1$): Autocorrelation exists at one or more lags.

Ljung–Box Test Statistic

The Ljung–Box statistic is calculated as:

$$Q = n(n+2) \sum_{k=1}^{h} \frac{\hat{\rho}_k^2}{n-k}$$

Where:

  • $n$ = Sample size,
  • $h$ = Number of lags tested,
  • $\hat{\rho}_k$ = Sample autocorrelation at lag $k$.

Decision Rule

  • Under $H_0$, the $Q$ statistic follows a chi-squared distribution with $h$ degrees of freedom:

    $$Q \sim \chi^2_h$$

  • Reject $H_0$ if $Q$ exceeds the critical value, indicating significant autocorrelation.
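The $Q$ statistic can be computed directly from the sample autocorrelations; base R's `Box.test()` implements exactly this formula. A sketch on a simulated series:

```r
# Ljung-Box Q by hand, checked against Box.test() (simulated series)
set.seed(42)
x <- as.numeric(arima.sim(model = list(ar = 0.5), n = 200))
n <- length(x)
h <- 10

rho <- acf(x, lag.max = h, plot = FALSE)$acf[-1]  # autocorrelations at lags 1..h
Q <- n * (n + 2) * sum(rho^2 / (n - 1:h))         # Ljung-Box statistic
pval <- pchisq(Q, df = h, lower.tail = FALSE)

Box.test(x, lag = h, type = "Ljung-Box")  # same Q and p-value
```

When testing residuals of a fitted ARIMA model, reduce the degrees of freedom by the number of estimated ARMA parameters via the `fitdf` argument of `Box.test()`.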


Advantages and Limitations

  • Advantage: Detects autocorrelation across multiple lags simultaneously.
  • Limitation: Less powerful for detecting specific lag structures; sensitive to model misspecification.

14.5.4 Runs Test

The Runs Test is a non-parametric test that examines the randomness of residuals. It is based on the number of runs—sequences of consecutive residuals with the same sign.


Hypotheses

  • Null Hypothesis ($H_0$): Residuals are randomly distributed (no autocorrelation).
  • Alternative Hypothesis ($H_1$): Residuals exhibit non-random patterns (indicating autocorrelation).

Procedure

  1. Classify residuals as positive or negative.

  2. Count the number of runs: A run is a sequence of consecutive positive or negative residuals.

  3. Calculate the expected number of runs under randomness:

    $$E(R) = \frac{2 n_+ n_-}{n} + 1$$

    Where:

    • $n_+$ = Number of positive residuals,
    • $n_-$ = Number of negative residuals,
    • $n = n_+ + n_-$.
  4. Compute the test statistic (Z-score):

    $$Z = \frac{R - E(R)}{\sqrt{\mathrm{Var}(R)}}$$

    Where $\mathrm{Var}(R) = \dfrac{2 n_+ n_- (2 n_+ n_- - n)}{n^2 (n-1)}$ is the variance of the number of runs under the null hypothesis.


Decision Rule

  • Under $H_0$, the $Z$-statistic follows a standard normal distribution:

    $$Z \sim N(0,1)$$

  • Reject $H_0$ if $|Z|$ exceeds the critical value from the standard normal distribution.
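The four steps can be written out in a few lines of base R. Under the null, $\mathrm{Var}(R) = \frac{2 n_+ n_- (2 n_+ n_- - n)}{n^2(n-1)}$; this sketch uses illustrative simulated residuals:

```r
# Runs test by hand (sketch; illustrative residuals)
set.seed(7)
res <- rnorm(50)

# Steps 1-2: classify residuals by sign and count runs
s <- res > 0
R <- 1 + sum(s[-1] != s[-length(s)])  # a new run starts at each sign change
n_pos <- sum(s)
n_neg <- sum(!s)
n <- n_pos + n_neg

# Step 3: expected runs and their variance under randomness
ER <- 2 * n_pos * n_neg / n + 1
VR <- 2 * n_pos * n_neg * (2 * n_pos * n_neg - n) / (n^2 * (n - 1))

# Step 4: Z-score and two-sided p-value
Z <- (R - ER) / sqrt(VR)
pval <- 2 * pnorm(-abs(Z))
```

`tseries::runs.test()` implements the same test on a two-level factor of residual signs.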


Advantages and Limitations

  • Advantage: Non-parametric; does not assume normality or linearity.
  • Limitation: Less powerful than parametric tests; primarily useful as a supplementary diagnostic.

14.5.5 Summary of Autocorrelation Tests

| Test | Type | Key Statistic | Detects | When to Use |
|------|------|---------------|---------|-------------|
| Durbin–Watson | Parametric | DW | First-order autocorrelation | Simple linear models without lagged dependent variables |
| Breusch–Godfrey | Parametric (general) | $\chi^2$ | Higher-order autocorrelation | Models with lagged dependent variables |
| Ljung–Box | Portmanteau (global test) | $\chi^2$ | Autocorrelation at multiple lags | Time-series models (e.g., ARIMA) |
| Runs Test | Non-parametric | $Z$-statistic | Non-random patterns in residuals | Supplementary diagnostic for randomness |

Detecting autocorrelation is crucial for ensuring the efficiency and reliability of regression models, especially in time-series analysis. While the Durbin–Watson Test is suitable for detecting first-order autocorrelation, the Breusch–Godfrey Test and Ljung–Box Test offer more flexibility for higher-order and multi-lag dependencies. Non-parametric tests like the Runs Test serve as useful supplementary diagnostics.

# Install and load necessary libraries
# install.packages("lmtest")  # For Durbin–Watson and Breusch–Godfrey Tests
# install.packages("tseries") # For Runs Test
# install.packages("forecast")# For Ljung–Box Test

library(lmtest)
library(tseries)
library(forecast)

# Simulated time-series dataset
set.seed(123)
n <- 100
time <- 1:n
x1 <- rnorm(n, mean = 50, sd = 10)
x2 <- rnorm(n, mean = 30, sd = 5)

# Introducing autocorrelation in errors
epsilon <- arima.sim(model = list(ar = 0.6), n = n)  # AR(1) process with ρ = 0.6
y <- 5 + 0.4 * x1 - 0.3 * x2 + epsilon

# Original regression model
model <- lm(y ~ x1 + x2)

# ----------------------------------------------------------------------
# 1. Durbin–Watson Test
# ----------------------------------------------------------------------
# Null Hypothesis: No first-order autocorrelation
dw_test <- dwtest(model)
print(dw_test)
#> 
#>  Durbin-Watson test
#> 
#> data:  model
#> DW = 0.77291, p-value = 3.323e-10
#> alternative hypothesis: true autocorrelation is greater than 0

# ----------------------------------------------------------------------
# 2. Breusch–Godfrey Test
# ----------------------------------------------------------------------
# Null Hypothesis: No autocorrelation up to lag 2
bg_test <- bgtest(model, order = 2)  # Testing for autocorrelation up to lag 2
print(bg_test)
#> 
#>  Breusch-Godfrey test for serial correlation of order up to 2
#> 
#> data:  model
#> LM test = 40.314, df = 2, p-value = 1.762e-09

# ----------------------------------------------------------------------
# 3. Ljung–Box Test
# ----------------------------------------------------------------------
# Null Hypothesis: No autocorrelation up to lag 10
ljung_box_test <- Box.test(residuals(model), lag = 10, type = "Ljung-Box")
print(ljung_box_test)
#> 
#>  Box-Ljung test
#> 
#> data:  residuals(model)
#> X-squared = 50.123, df = 10, p-value = 2.534e-07

# ----------------------------------------------------------------------
# 4. Runs Test (Non-parametric)
# ----------------------------------------------------------------------
# Null Hypothesis: Residuals are randomly distributed
runs_test <- runs.test(as.factor(sign(residuals(model))))
print(runs_test)
#> 
#>  Runs Test
#> 
#> data:  as.factor(sign(residuals(model)))
#> Standard Normal = -4.2214, p-value = 2.428e-05
#> alternative hypothesis: two.sided

Interpretation of the Results

  1. Durbin–Watson Test

    • Null Hypothesis ($H_0$): No first-order autocorrelation ($\rho = 0$).

    • Alternative Hypothesis ($H_1$): First-order autocorrelation exists ($\rho \neq 0$).

    • Decision Rule:

      • Reject $H_0$ if DW $< 1.5$ (positive autocorrelation) or DW $> 2.5$ (negative autocorrelation).
      • Fail to reject $H_0$ if DW $\approx 2$, suggesting no significant autocorrelation.
  2. Breusch–Godfrey Test

    • Null Hypothesis ($H_0$): No autocorrelation up to lag $p$ (here, $p = 2$).

    • Alternative Hypothesis ($H_1$): Autocorrelation exists at one or more lags.

    • Decision Rule:

      • Reject $H_0$ if the p-value $< 0.05$, indicating significant autocorrelation.
      • Fail to reject $H_0$ if the p-value $\geq 0.05$, suggesting no evidence of autocorrelation.
  3. Ljung–Box Test

    • Null Hypothesis ($H_0$): No autocorrelation up to lag $h$ (here, $h = 10$).

    • Alternative Hypothesis ($H_1$): Autocorrelation exists at one or more lags.

    • Decision Rule:

      • Reject $H_0$ if the p-value $< 0.05$, indicating significant autocorrelation.
      • Fail to reject $H_0$ if the p-value $\geq 0.05$, suggesting no evidence of autocorrelation.
  4. Runs Test (Non-parametric)

    • Null Hypothesis ($H_0$): Residuals are randomly distributed (no autocorrelation).

    • Alternative Hypothesis ($H_1$): Residuals exhibit non-random patterns (indicating autocorrelation).

    • Decision Rule:

      • Reject $H_0$ if the p-value $< 0.05$, indicating non-randomness and potential autocorrelation.
      • Fail to reject $H_0$ if the p-value $\geq 0.05$, suggesting randomness in residuals.

References

Box, George E. P., and David A. Pierce. 1970. “Distribution of Residual Autocorrelations in Autoregressive-Integrated Moving Average Time Series Models.” Journal of the American Statistical Association 65 (332): 1509–26.
Breusch, Trevor S. 1978. “Testing for Autocorrelation in Dynamic Linear Models.” Australian Economic Papers 17 (31): 334–55.
Godfrey, Leslie G. 1978. “Testing Against General Autoregressive and Moving Average Error Models When the Regressors Include Lagged Dependent Variables.” Econometrica: Journal of the Econometric Society, 1293–1301.
Ljung, Greta M., and George E. P. Box. 1978. “On a Measure of Lack of Fit in Time Series Models.” Biometrika 65 (2): 297–303.