8.3 Inference in Linear Mixed Models

8.3.1 Inference for Fixed Effects ($\beta$)

The goal is to test hypotheses about the fixed-effects parameters $\beta$. Three tests are commonly used: the Wald test, the F-test, and the likelihood ratio test.

8.3.1.1 Wald Test

The Wald test assesses whether certain linear combinations of fixed effects are equal to specified values.

Given:

$\hat{\beta}(\theta) = \left( \mathbf{X}' \mathbf{V}^{-1}(\theta) \mathbf{X} \right)^{-1} \mathbf{X}' \mathbf{V}^{-1}(\theta) \mathbf{Y},$

and its variance:

$\text{Var}(\hat{\beta}(\theta)) = \left( \mathbf{X}' \mathbf{V}^{-1}(\theta) \mathbf{X} \right)^{-1}.$

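In practice, $\theta$ is unknown, so we substitute its estimate $\hat{\theta}$ to obtain $\hat{\beta} = \hat{\beta}(\hat{\theta})$ and $\hat{\mathbf{V}} = \mathbf{V}(\hat{\theta})$.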

Hypotheses:

$H_0: \mathbf{A \beta} = \mathbf{d}$

where:
- $\mathbf{A}$ is a contrast matrix specifying linear combinations of $\beta$.
- $\mathbf{d}$ is a constant vector representing the null hypothesis values.

Wald Test Statistic:

$W = (\mathbf{A} \hat{\beta} - \mathbf{d})' \left[ \mathbf{A} \left( \mathbf{X}' \hat{\mathbf{V}}^{-1} \mathbf{X} \right)^{-1} \mathbf{A}' \right]^{-1} (\mathbf{A} \hat{\beta} - \mathbf{d}).$

Distribution under $H_0$ (asymptotically):

$W \sim \chi^2_{\text{rank}(\mathbf{A})}.$
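
The Wald statistic can be computed directly from its matrix form. The following is a minimal sketch, assuming NumPy and SciPy are available and that $\hat{\mathbf{V}}$ has already been obtained from a fitted model. The function name `wald_test`, the simulated data, and the variance values plugged into `V` are illustrative assumptions, not part of the original text.

```python
import numpy as np
from scipy import stats


def wald_test(X, V, y, A, d):
    """Wald test of H0: A beta = d, treating V = V(theta-hat) as known."""
    X, V, y = (np.asarray(a, dtype=float) for a in (X, V, y))
    A = np.atleast_2d(np.asarray(A, dtype=float))
    d = np.atleast_1d(np.asarray(d, dtype=float))
    # (X' V^{-1} X)^{-1}: estimated covariance matrix of beta-hat
    cov_beta = np.linalg.inv(X.T @ np.linalg.solve(V, X))
    # Generalized least squares estimate of beta
    beta_hat = cov_beta @ (X.T @ np.linalg.solve(V, y))
    diff = A @ beta_hat - d
    W = float(diff @ np.linalg.solve(A @ cov_beta @ A.T, diff))
    df = int(np.linalg.matrix_rank(A))
    return beta_hat, W, df, float(stats.chi2.sf(W, df))


# Hypothetical random-intercept data: 6 groups with 4 observations each
rng = np.random.default_rng(0)
n_groups, n_per = 6, 4
groups = np.repeat(np.arange(n_groups), n_per)
x = rng.normal(size=n_groups * n_per)
X = np.column_stack([np.ones_like(x), x])
Z = (groups[:, None] == np.arange(n_groups)[None, :]).astype(float)
# Assumed plug-in variances: random-intercept variance 1.0, residual variance 0.5
V = 1.0 * Z @ Z.T + 0.5 * np.eye(len(x))
y = X @ np.array([1.0, 2.0]) + Z @ rng.normal(size=n_groups) \
    + rng.normal(scale=np.sqrt(0.5), size=len(x))

# Test H0: slope = 0
beta_hat, W, df, p = wald_test(X, V, y, A=[[0.0, 1.0]], d=[0.0])
print(f"W = {W:.3f}, df = {df}, p = {p:.4g}")
```

Because `V` is treated as fixed, the sketch inherits the limitation described next: it does not account for the uncertainty in $\hat{\theta}$.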

Caution with Wald Test:

- Underestimation of Variance: The Wald test treats $\hat{\theta}$ as if it were the true $\theta$, ignoring the variability from estimating it. Standard errors are therefore underestimated, and Type I error rates can be inflated.
- Small Sample Issues: The $\chi^2$ approximation is less reliable in small samples or when variance components are near the boundary of the parameter space (e.g., variances close to zero).

8.3.1.2 F-Test

An alternative to the Wald test, the F-test adjusts for the estimation of $\sigma^2$ and provides better performance in small samples.

Assume:

$\text{Var}(\mathbf{Y}) = \sigma^2 \mathbf{V}(\theta).$

The F-statistic is:

$F^* = \frac{(\mathbf{A} \hat{\beta} - \mathbf{d})' \left[ \mathbf{A} \left( \mathbf{X}' \hat{\mathbf{V}}^{-1} \mathbf{X} \right)^{-1} \mathbf{A}' \right]^{-1} (\mathbf{A} \hat{\beta} - \mathbf{d})}{\hat{\sigma}^2 \text{rank}(\mathbf{A})}.$

Distribution under $H_0$ (approximately):

$F^* \sim F_{\text{rank}(\mathbf{A}), \text{df}_{\text{denominator}}}.$

Approximating Denominator Degrees of Freedom:

- Satterthwaite approximation
- Kenward-Roger approximation (provides bias-corrected standard errors)
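
The F-statistic above differs from the Wald statistic only in the scaling by $\hat{\sigma}^2 \, \text{rank}(\mathbf{A})$ and in the reference distribution. The sketch below assumes NumPy and SciPy. It does not implement the Satterthwaite or Kenward-Roger approximation: the denominator degrees of freedom are passed in as the argument `df_denom`, which is assumed to come from mixed-model software. The function name `f_test` and the data are hypothetical. The example uses the special case $\mathbf{V} = \mathbf{I}$, where the denominator degrees of freedom are exactly $n - p$.

```python
import numpy as np
from scipy import stats


def f_test(X, V, y, A, d, sigma2_hat, df_denom):
    """Approximate F-test of H0: A beta = d when Var(Y) = sigma^2 V.

    df_denom is NOT computed here; it is assumed to be supplied by a
    Satterthwaite or Kenward-Roger routine (or to equal n - p when V = I).
    """
    X, V, y = (np.asarray(a, dtype=float) for a in (X, V, y))
    A = np.atleast_2d(np.asarray(A, dtype=float))
    d = np.atleast_1d(np.asarray(d, dtype=float))
    XtVinvX_inv = np.linalg.inv(X.T @ np.linalg.solve(V, X))
    beta_hat = XtVinvX_inv @ (X.T @ np.linalg.solve(V, y))
    diff = A @ beta_hat - d
    quad = float(diff @ np.linalg.solve(A @ XtVinvX_inv @ A.T, diff))
    q = int(np.linalg.matrix_rank(A))
    F = quad / (sigma2_hat * q)
    return F, q, float(stats.f.sf(F, q, df_denom))


# Special case V = I (ordinary linear model) with hypothetical data
rng = np.random.default_rng(1)
n = 30
X = np.column_stack([np.ones(n), rng.normal(size=n)])
y = X @ np.array([0.5, 1.0]) + rng.normal(size=n)
V = np.eye(n)
beta_ols = np.linalg.lstsq(X, y, rcond=None)[0]
resid = y - X @ beta_ols
sigma2_hat = float(resid @ resid) / (n - X.shape[1])

F, q, p_value = f_test(X, V, y, A=[[0.0, 1.0]], d=[0.0],
                       sigma2_hat=sigma2_hat, df_denom=n - X.shape[1])
print(f"F = {F:.3f} on ({q}, {n - X.shape[1]}) df, p = {p_value:.4g}")
```

With a single contrast, the resulting `F` equals the square of the usual t-statistic for the slope, which gives a quick check on the implementation.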

F-Test Advantages:

- More accurate in small samples than the Wald test.
- Adjusts for variance estimation, reducing bias in hypothesis testing.

Wald Test vs. F-Test:

Comparison of Wald Test and F-Test Under Common Evaluation Criteria

| Criterion | Wald Test | F-Test |
|---|---|---|
| Small Sample Performance | Poor (can inflate Type I error) | Better control of Type I error |
| Variance Estimation | Ignores variability in $\hat{\theta}$ | Adjusts using $\hat{\sigma}^2$ |
| Single-Coefficient Case | Reduces to a z-test (for a single $\beta$, $W = z^2$) | Reduces to a t-test (when rank($\mathbf{A}$) = 1, $F^* = t^2$) |

8.3.1.3 Likelihood Ratio Test

The Likelihood Ratio Test (LRT) compares the fit of nested models:

Null Hypothesis:

$H_0: \beta \in \Theta_{\beta,0}$

where $\Theta_{\beta,0}$ is a subset of the full parameter space $\Theta_{\beta}$.

Test Statistic:

$-2 \log \lambda = -2 \log \left( \frac{\hat{L}_{ML,0}}{\hat{L}_{ML}} \right),$

where:
- $\hat{L}_{ML,0}$ = Maximized likelihood under $H_0$ (restricted model)
- $\hat{L}_{ML}$ = Maximized likelihood under the alternative (full model)

Distribution under $H_0$ (asymptotically):

$-2 \log \lambda \sim \chi^2_{df}$

where $df = \dim(\Theta_{\beta}) - \dim(\Theta_{\beta,0})$ (the difference in the number of parameters).
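
Given the two maximized log-likelihoods, the test reduces to a few lines of arithmetic. The sketch below assumes SciPy is available and that both models were fitted by ML to the same data. The function name `likelihood_ratio_test` and the log-likelihood values are hypothetical and serve only as illustration.

```python
from scipy import stats


def likelihood_ratio_test(loglik_null, loglik_full, df):
    """LRT from the maximized ML log-likelihoods of two nested models."""
    stat = -2.0 * (loglik_null - loglik_full)
    # Tiny negative values can arise from optimizer tolerance; clip at zero
    stat = max(stat, 0.0)
    return stat, float(stats.chi2.sf(stat, df))


# Hypothetical ML log-likelihoods; the full model has 2 extra fixed effects
stat, p_value = likelihood_ratio_test(loglik_null=-120.4,
                                      loglik_full=-117.1, df=2)
print(f"-2 log(lambda) = {stat:.2f}, p = {p_value:.4f}")
```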

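Important Notes:

- When comparing models with different fixed effects, the LRT must be based on ML fits, not REML. The REML likelihood depends on the fixed-effects design matrix $\mathbf{X}$, so REML likelihoods from models with different fixed effects are not comparable.
- A REML-based LRT can be used to compare models that differ only in their random effects (variance components), provided both models have the same fixed effects.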

8.3.2 Inference for Variance Components ($\theta$)

For ML and REML estimators, asymptotically:

$\hat{\theta} \sim N(\theta, I(\theta)^{-1}),$

where $I(\theta)$ is the Fisher information matrix.

This normal approximation works well in large samples and underlies Wald-type tests and confidence intervals.

8.3.2.1 Wald Test for Variance Components

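The Wald test for a single variance component $\theta$ follows the same structure as for fixed effects. To test $H_0: \theta = \theta_0$:

Test Statistic:

$W = \frac{(\hat{\theta} - \theta_0)^2}{\widehat{\text{Var}}(\hat{\theta})}.$

Distribution under $H_0$ (asymptotically, for $\theta_0$ in the interior of the parameter space):

$W \sim \chi^2_1.$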
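
As a worked illustration, the sketch below evaluates this statistic using only the Python standard library. The $\chi^2_1$ tail probability is obtained from the identity $P(\chi^2_1 > w) = \text{erfc}(\sqrt{w/2})$. The function name `wald_variance_component` and the numbers are hypothetical. The null value is deliberately chosen away from zero, since the $\chi^2_1$ reference is not appropriate on the boundary.

```python
import math


def wald_variance_component(theta_hat, theta_0, se):
    """Wald test of H0: theta = theta_0 for a single variance parameter."""
    z = (theta_hat - theta_0) / se
    W = z * z
    # P(chi2_1 > W) = P(|Z| > sqrt(W)) = erfc(sqrt(W / 2))
    p_value = math.erfc(math.sqrt(W / 2.0))
    return W, p_value


# Hypothetical estimate 0.8 with standard error 0.15, tested against 0.5
W, p_value = wald_variance_component(theta_hat=0.8, theta_0=0.5, se=0.15)
print(f"W = {W:.3f}, p = {p_value:.4f}")
```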

Limitations of Wald Test for Variance Components:

- Boundary Issues: The normal approximation fails when the true variance component is at or near zero (the boundary of the parameter space).
- The test is less reliable for variance parameters, which are bounded below by zero, than for covariance parameters, which can take either sign.

8.3.2.2 Likelihood Ratio Test for Variance Components

LRT can also be applied to variance components:

Test Statistic:

$-2 \log \lambda = -2 \log \left( \frac{\hat{L}_{REML,0}}{\hat{L}_{REML}} \right).$

Distribution under $H_0$:

- Not always $\chi^2$-distributed: when the null value of a variance component lies on the boundary of the parameter space (e.g., testing whether a random-effect variance equals zero), the usual $\chi^2$ reference distribution is typically conservative.
- May require mixture distributions or adjusted critical values.
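
One commonly used mixture applies to the simplest boundary case: testing whether a single random-effect variance is zero when no other random effects are being tested. The reference distribution often used there is an equal mixture of $\chi^2_0$ (a point mass at zero) and $\chi^2_1$, so the p-value is half the naive $\chi^2_1$ p-value. The sketch below uses only the Python standard library and assumes this specific setting. It is not a general solution for several variance components, and the function names are hypothetical.

```python
import math


def chi2_1_sf(x):
    """Upper tail probability of a chi-square with 1 df: erfc(sqrt(x / 2))."""
    return math.erfc(math.sqrt(x / 2.0))


def lrt_boundary_pvalue(lrt_stat):
    """p-value under the assumed 0.5 * chi2_0 + 0.5 * chi2_1 mixture."""
    if lrt_stat <= 0.0:
        return 1.0  # all of the mixture's mass lies at or above zero
    return 0.5 * chi2_1_sf(lrt_stat)


# Hypothetical LRT statistic for H0: random-intercept variance = 0
lrt_stat = 2.705543
print(f"naive chi2_1 p-value: {chi2_1_sf(lrt_stat):.4f}")
print(f"mixture p-value:      {lrt_boundary_pvalue(lrt_stat):.4f}")
```

In this setting the naive $\chi^2_1$ p-value is twice the mixture p-value, which illustrates why ignoring the boundary makes the test conservative.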

Comparison of Statistical Tests for Fixed and Random Effects in Linear Mixed Models

| Test | Best For | Strengths | Limitations |
|---|---|---|---|
| Wald Test | Fixed effects ($\beta$) | Simple, widely used | Underestimates variance, biased in small samples |
| F-Test | Fixed effects ($\beta$) | Better in small samples, adjusts df | Requires approximation for degrees of freedom |
| LRT (ML) | Fixed effects, nested models | Powerful, widely used | Not valid for REML with fixed effects |
| LRT (REML) | Variance components | Robust for random effects | Boundary issues when variances are near zero |
| Wald (Variance) | Variance components ($\theta$) | Simple extension of Wald test | Fails near parameter space boundaries |