14.5 Autocorrelation Tests

Autocorrelation (also known as serial correlation) occurs when the error terms ($\epsilon_t$) in a regression model are correlated across observations, violating the assumption of independence in the classical linear regression model. This issue is particularly common in time-series data, where observations are ordered over time.

Consequences of Autocorrelation:

  • OLS estimators remain unbiased but become inefficient, meaning they do not have the minimum variance among all linear unbiased estimators.
  • Standard errors are biased, leading to unreliable hypothesis tests (e.g., t-tests and F-tests).
  • Potential underestimation of standard errors, increasing the risk of Type I errors (false positives).

14.5.1 Durbin–Watson Test

The Durbin–Watson (DW) Test is the most widely used test for detecting first-order autocorrelation, where the current error term is correlated with the previous one:

$$\epsilon_t = \rho \epsilon_{t-1} + u_t$$

Where:

  • $\rho$ is the autocorrelation coefficient,

  • $u_t$ is a white noise error term.


Hypotheses

  • Null Hypothesis ($H_0$): No first-order autocorrelation ($\rho = 0$).
  • Alternative Hypothesis ($H_1$): First-order autocorrelation exists ($\rho \neq 0$).

Durbin–Watson Test Statistic

The DW statistic is calculated as:

$$DW = \frac{\sum_{t=2}^{n} (\hat{\epsilon}_t - \hat{\epsilon}_{t-1})^2}{\sum_{t=1}^{n} \hat{\epsilon}_t^2}$$

Where:

  • $\hat{\epsilon}_t$ are the residuals from the regression,

  • $n$ is the number of observations.


Decision Rule

  • The DW statistic ranges from 0 to 4:
    • DW ≈ 2: No autocorrelation.
    • DW < 2: Positive autocorrelation.
    • DW > 2: Negative autocorrelation.

For more precise interpretation:

  • Use Durbin–Watson critical values ($d_L$ and $d_U$) to form decision boundaries.

  • If the test statistic falls into an inconclusive range, consider alternative tests like the Breusch–Godfrey test.
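These cutoffs follow from a useful large-sample identity: $DW \approx 2(1 - \hat{\rho})$, where $\hat{\rho}$ is the sample first-order autocorrelation of the residuals, so $\hat{\rho} > 0$ pushes DW below 2 and $\hat{\rho} < 0$ pushes it above 2. A minimal sketch in base R (simulated series; variable names are illustrative):

```r
# DW ~ 2 * (1 - rho_hat): compute both sides on a simulated AR(1) series
set.seed(1)
e <- as.numeric(arima.sim(model = list(ar = 0.5), n = 200))

dw <- sum(diff(e)^2) / sum(e^2)                   # Durbin-Watson statistic
rho_hat <- sum(e[-1] * e[-length(e)]) / sum(e^2)  # lag-1 autocorrelation

dw                 # well below 2, signalling positive autocorrelation
2 * (1 - rho_hat)  # close to dw (they differ only by edge terms)
```

The two quantities differ only by the edge terms $(\hat{\epsilon}_1^2 + \hat{\epsilon}_n^2) / \sum \hat{\epsilon}_t^2$, which vanish as $n$ grows.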


Advantages and Limitations

  • Advantage: Simple to compute; specifically designed for detecting first-order autocorrelation.
  • Limitation: Inconclusive in some cases; invalid when lagged dependent variables are included in the model.

14.5.2 Breusch–Godfrey Test

The Breusch–Godfrey (BG) Test is a more general approach that can detect higher-order autocorrelation (e.g., second-order, third-order) and is valid even when lagged dependent variables are included in the model (Breusch 1978; Godfrey 1978).


Hypotheses

  • Null Hypothesis ($H_0$): No autocorrelation up to lag $p$.
  • Alternative Hypothesis ($H_1$): Autocorrelation exists at one or more lags up to $p$.

Procedure

  1. Estimate the original regression model and obtain residuals $\hat{\epsilon}_t$:

    $$y_t = \beta_0 + \beta_1 x_{1t} + \dots + \beta_k x_{kt} + \epsilon_t$$

  2. Run an auxiliary regression of the residuals on the original regressors plus $p$ lagged residuals:

    $$\hat{\epsilon}_t = \alpha_0 + \alpha_1 x_{1t} + \dots + \alpha_k x_{kt} + \rho_1 \hat{\epsilon}_{t-1} + \dots + \rho_p \hat{\epsilon}_{t-p} + u_t$$

  3. Calculate the test statistic:

    $$BG = n R^2_{\text{aux}}$$

    Where:

    • $n$ is the sample size,
    • $R^2_{\text{aux}}$ is the $R^2$ from the auxiliary regression.

Decision Rule:

  • Under $H_0$, the BG statistic follows a chi-squared distribution with $p$ degrees of freedom:

    $$BG \sim \chi^2_p$$

  • Reject $H_0$ if the statistic exceeds the critical chi-squared value, indicating the presence of autocorrelation.
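The three steps above can be reproduced by hand in base R. This is a sketch on simulated data (the variable names are illustrative); `lmtest::bgtest()` performs the same computation, including setting the pre-sample residuals to zero:

```r
# Breusch-Godfrey by hand (sketch; illustrative simulated data)
set.seed(123)
n <- 100
x <- rnorm(n)
e <- as.numeric(arima.sim(model = list(ar = 0.6), n = n))
y <- 1 + 0.5 * x + e

# Step 1: original regression and its residuals
res <- residuals(lm(y ~ x))

# Step 2: auxiliary regression on the regressors plus p = 2 lagged
# residuals (pre-sample residuals filled with zeros)
lag1 <- c(0, res[1:(n - 1)])
lag2 <- c(0, 0, res[1:(n - 2)])
aux <- lm(res ~ x + lag1 + lag2)

# Step 3: BG = n * R^2 of the auxiliary regression, compared to chi^2_p
bg <- n * summary(aux)$r.squared
pval <- pchisq(bg, df = 2, lower.tail = FALSE)
```

With strongly autocorrelated errors ($\rho = 0.6$), the lagged residuals explain a large share of the auxiliary-regression variance, so `bg` far exceeds the $\chi^2_2$ critical value.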


Advantages and Limitations

  • Advantage: Can detect higher-order autocorrelation; valid with lagged dependent variables.
  • Limitation: More computationally intensive than the Durbin–Watson test.

14.5.3 Ljung–Box Test (or Box–Pierce Test)

The Ljung–Box Test is a portmanteau test designed to detect autocorrelation at multiple lags simultaneously; it refines the earlier Box–Pierce statistic with a small-sample correction (Box and Pierce 1970; Ljung and Box 1978). It is commonly used in time-series analysis to check residual autocorrelation after model estimation (e.g., in ARIMA models).


Hypotheses

  • Null Hypothesis ($H_0$): No autocorrelation up to lag $h$.
  • Alternative Hypothesis ($H_1$): Autocorrelation exists at one or more lags.

Ljung–Box Test Statistic

The Ljung–Box statistic is calculated as:

$$Q = n(n+2) \sum_{k=1}^{h} \frac{\hat{\rho}_k^2}{n-k}$$

Where:

  • $n$ = Sample size,
  • $h$ = Number of lags tested,
  • $\hat{\rho}_k$ = Sample autocorrelation at lag $k$.

Decision Rule

  • Under $H_0$, the $Q$ statistic follows a chi-squared distribution with $h$ degrees of freedom:

    $$Q \sim \chi^2_h$$

  • Reject $H_0$ if $Q$ exceeds the critical value, indicating significant autocorrelation.
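The $Q$ statistic can be computed directly from the sample autocorrelations; base R's `Box.test()` implements exactly this formula. A sketch on a simulated series:

```r
# Ljung-Box Q by hand, checked against Box.test() (simulated series)
set.seed(42)
x <- as.numeric(arima.sim(model = list(ar = 0.5), n = 200))
n <- length(x)
h <- 10

rho <- acf(x, lag.max = h, plot = FALSE)$acf[-1]  # autocorrelations at lags 1..h
Q <- n * (n + 2) * sum(rho^2 / (n - 1:h))         # Ljung-Box statistic
pval <- pchisq(Q, df = h, lower.tail = FALSE)

Box.test(x, lag = h, type = "Ljung-Box")  # same Q and p-value
```

When testing residuals of a fitted ARIMA model, reduce the degrees of freedom by the number of estimated ARMA parameters via the `fitdf` argument of `Box.test()`.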


Advantages and Limitations

  • Advantage: Detects autocorrelation across multiple lags simultaneously.
  • Limitation: Less powerful for detecting specific lag structures; sensitive to model misspecification.

14.5.4 Runs Test

The Runs Test is a non-parametric test that examines the randomness of residuals. It is based on the number of runs—sequences of consecutive residuals with the same sign.


Hypotheses

  • Null Hypothesis ($H_0$): Residuals are randomly distributed (no autocorrelation).
  • Alternative Hypothesis ($H_1$): Residuals exhibit non-random patterns (indicating autocorrelation).

Procedure

  1. Classify residuals as positive or negative.

  2. Count the number of runs: A run is a sequence of consecutive positive or negative residuals.

  3. Calculate the expected number of runs under randomness:

    $$E(R) = \frac{2 n_+ n_-}{n} + 1$$

    Where:

    • $n_+$ = Number of positive residuals,
    • $n_-$ = Number of negative residuals,
    • $n = n_+ + n_-$.
  4. Compute the test statistic (Z-score):

    $$Z = \frac{R - E(R)}{\sqrt{\mathrm{Var}(R)}}$$

    Where $\mathrm{Var}(R) = \dfrac{2 n_+ n_- (2 n_+ n_- - n)}{n^2 (n-1)}$ is the variance of the number of runs under the null hypothesis.


Decision Rule

  • Under $H_0$, the $Z$-statistic follows a standard normal distribution:

    $$Z \sim N(0,1)$$

  • Reject $H_0$ if $|Z|$ exceeds the critical value from the standard normal distribution.
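The four steps can be written out in a few lines of base R. Under the null, $\mathrm{Var}(R) = \frac{2 n_+ n_- (2 n_+ n_- - n)}{n^2(n-1)}$; this sketch uses illustrative simulated residuals:

```r
# Runs test by hand (sketch; illustrative residuals)
set.seed(7)
res <- rnorm(50)

# Steps 1-2: classify residuals by sign and count runs
s <- res > 0
R <- 1 + sum(s[-1] != s[-length(s)])  # a new run starts at each sign change
n_pos <- sum(s)
n_neg <- sum(!s)
n <- n_pos + n_neg

# Step 3: expected runs and their variance under randomness
ER <- 2 * n_pos * n_neg / n + 1
VR <- 2 * n_pos * n_neg * (2 * n_pos * n_neg - n) / (n^2 * (n - 1))

# Step 4: Z-score and two-sided p-value
Z <- (R - ER) / sqrt(VR)
pval <- 2 * pnorm(-abs(Z))
```

`tseries::runs.test()` implements the same test on a two-level factor of residual signs.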


Advantages and Limitations

  • Advantage: Non-parametric; does not assume normality or linearity.
  • Limitation: Less powerful than parametric tests; primarily useful as a supplementary diagnostic.

14.5.5 Summary of Autocorrelation Tests

| Test | Type | Key Statistic | Detects | When to Use |
|------|------|---------------|---------|-------------|
| Durbin–Watson | Parametric | DW | First-order autocorrelation | Simple linear models without lagged dependent variables |
| Breusch–Godfrey | Parametric (general) | $\chi^2$ | Higher-order autocorrelation | Models with lagged dependent variables |
| Ljung–Box | Portmanteau (global test) | $\chi^2$ | Autocorrelation at multiple lags | Time-series models (e.g., ARIMA) |
| Runs Test | Non-parametric | $Z$-statistic | Non-random patterns in residuals | Supplementary diagnostic for randomness |

Detecting autocorrelation is crucial for ensuring the efficiency and reliability of regression models, especially in time-series analysis. While the Durbin–Watson Test is suitable for detecting first-order autocorrelation, the Breusch–Godfrey Test and Ljung–Box Test offer more flexibility for higher-order and multi-lag dependencies. Non-parametric tests like the Runs Test serve as useful supplementary diagnostics.

# Install and load necessary libraries
# install.packages("lmtest")  # For Durbin–Watson and Breusch–Godfrey Tests
# install.packages("tseries") # For Runs Test
# install.packages("forecast")# For Ljung–Box Test

library(lmtest)
library(tseries)
library(forecast)

# Simulated time-series dataset
set.seed(123)
n <- 100
time <- 1:n
x1 <- rnorm(n, mean = 50, sd = 10)
x2 <- rnorm(n, mean = 30, sd = 5)

# Introducing autocorrelation in errors
epsilon <- arima.sim(model = list(ar = 0.6), n = n)  # AR(1) process with ρ = 0.6
y <- 5 + 0.4 * x1 - 0.3 * x2 + epsilon

# Original regression model
model <- lm(y ~ x1 + x2)

# ----------------------------------------------------------------------
# 1. Durbin–Watson Test
# ----------------------------------------------------------------------
# Null Hypothesis: No first-order autocorrelation
dw_test <- dwtest(model)
print(dw_test)
#> 
#>  Durbin-Watson test
#> 
#> data:  model
#> DW = 0.77291, p-value = 3.323e-10
#> alternative hypothesis: true autocorrelation is greater than 0

# ----------------------------------------------------------------------
# 2. Breusch–Godfrey Test
# ----------------------------------------------------------------------
# Null Hypothesis: No autocorrelation up to lag 2
bg_test <- bgtest(model, order = 2)  # Testing for autocorrelation up to lag 2
print(bg_test)
#> 
#>  Breusch-Godfrey test for serial correlation of order up to 2
#> 
#> data:  model
#> LM test = 40.314, df = 2, p-value = 1.762e-09

# ----------------------------------------------------------------------
# 3. Ljung–Box Test
# ----------------------------------------------------------------------
# Null Hypothesis: No autocorrelation up to lag 10
ljung_box_test <- Box.test(residuals(model), lag = 10, type = "Ljung-Box")
print(ljung_box_test)
#> 
#>  Box-Ljung test
#> 
#> data:  residuals(model)
#> X-squared = 50.123, df = 10, p-value = 2.534e-07

# ----------------------------------------------------------------------
# 4. Runs Test (Non-parametric)
# ----------------------------------------------------------------------
# Null Hypothesis: Residuals are randomly distributed
runs_test <- runs.test(as.factor(sign(residuals(model))))
print(runs_test)
#> 
#>  Runs Test
#> 
#> data:  as.factor(sign(residuals(model)))
#> Standard Normal = -4.2214, p-value = 2.428e-05
#> alternative hypothesis: two.sided

Interpretation of the Results

  1. Durbin–Watson Test

    • Null Hypothesis ($H_0$): No first-order autocorrelation ($\rho = 0$).

    • Alternative Hypothesis ($H_1$): First-order autocorrelation exists ($\rho \neq 0$).

    • Decision Rule:

      • Reject $H_0$ if DW $< 1.5$ (positive autocorrelation) or DW $> 2.5$ (negative autocorrelation).
      • Fail to reject $H_0$ if DW $\approx 2$, suggesting no significant autocorrelation.
  2. Breusch–Godfrey Test

    • Null Hypothesis ($H_0$): No autocorrelation up to lag $p$ (here, $p = 2$).

    • Alternative Hypothesis ($H_1$): Autocorrelation exists at one or more lags.

    • Decision Rule:

      • Reject $H_0$ if the p-value $< 0.05$, indicating significant autocorrelation.
      • Fail to reject $H_0$ if the p-value $\geq 0.05$, suggesting no evidence of autocorrelation.
  3. Ljung–Box Test

    • Null Hypothesis ($H_0$): No autocorrelation up to lag $h$ (here, $h = 10$).

    • Alternative Hypothesis ($H_1$): Autocorrelation exists at one or more lags.

    • Decision Rule:

      • Reject $H_0$ if the p-value $< 0.05$, indicating significant autocorrelation.
      • Fail to reject $H_0$ if the p-value $\geq 0.05$, suggesting no evidence of autocorrelation.
  4. Runs Test (Non-parametric)

    • Null Hypothesis ($H_0$): Residuals are randomly distributed (no autocorrelation).

    • Alternative Hypothesis ($H_1$): Residuals exhibit non-random patterns (indicating autocorrelation).

    • Decision Rule:

      • Reject $H_0$ if the p-value $< 0.05$, indicating non-randomness and potential autocorrelation.
      • Fail to reject $H_0$ if the p-value $\geq 0.05$, suggesting randomness in residuals.

References

Box, George E. P., and David A. Pierce. 1970. “Distribution of Residual Autocorrelations in Autoregressive-Integrated Moving Average Time Series Models.” Journal of the American Statistical Association 65 (332): 1509–26.
Breusch, Trevor S. 1978. “Testing for Autocorrelation in Dynamic Linear Models.” Australian Economic Papers 17 (31): 334–55.
Godfrey, Leslie G. 1978. “Testing Against General Autoregressive and Moving Average Error Models When the Regressors Include Lagged Dependent Variables.” Econometrica: Journal of the Econometric Society, 1293–1301.
Ljung, Greta M., and George E. P. Box. 1978. “On a Measure of Lack of Fit in Time Series Models.” Biometrika 65 (2): 297–303.