Chapter 20 REML Estimation of Variance Components

Consider the GLM: $y = X\beta + \epsilon, \text{ where } \epsilon \sim N(0, \Sigma)$ and $$\Sigma$$ is an $$n\times n$$ positive definite variance matrix which depends on an unknown parameter vector $$\gamma$$.

Residual/Restricted Maximum Likelihood (REML): an approach that produces unbiased estimators for these special cases and produces less biased estimates than ML in general.

REML Method
1. Find $$n - \text{rank}(X) = n - r$$ linearly independent vectors $$a_1, \ldots, a_{n-r}$$ such that $$a_i'X = 0'$$ for all $$i = 1, \ldots, n-r$$.

2. Find the MLE of $$\gamma$$ using $$w_1 \equiv a_1'y, \ldots, w_{n-r} \equiv a_{n-r}'y$$ as data. $A = [a_1, \ldots, a_{n-r}] \quad w = \begin{bmatrix} w_1 \\ \vdots \\ w_{n-r} \end{bmatrix} = \begin{bmatrix} a_1'y \\ \vdots \\ a_{n-r}'y \end{bmatrix} = A'y$

• If $$a'X = 0'$$, $$a'y$$ is known as error contrast. Thus, $$w_1, \ldots, w_{n-r}$$ comprise a set of $$n-r$$ error constrasts

• There exist $$n-r$$ linearly independent rows of $$I - P_X$$ to get $$a_1, \ldots, a_{n-r}$$.

• The error constrasts $$w_1, \ldots, w_{n-r}$$ will be a subset of the elements of the residual vector $$(I-P_X)y = y - \hat y$$

Note that $$w = A'y = A'\epsilon \sim N(0, A'\Sigma A)$$ and the distribution of $$w$$ depends only on $$\gamma$$.

The log likelihood function in this case is $\ell(\gamma\mid w) = -\frac{1}{2}\log |A'\Sigma A| - \frac{1}{2}w'(A'\Sigma A)^{-1}w - \frac{n-r}{2}\log(2\pi)$ An MLE of $$\gamma$$ can be found in the general case using numerical methods to obtain the REML estimate of $$\gamma$$.

In 611, we show that every set of $$n-r$$ linearly independent error contrasts yields the same REML estimator of $$\gamma$$.

In any case, once a REML estimate of $$\gamma$$ has been obtained, the BLUE of an estimate $$C\beta$$ can be approximated by $$C\hat\beta_{\hat\Sigma} = C(X'\hat \Sigma^{-1} X)^{-}X' \hat\Sigma^{-1} y$$, where $$\hat\Sigma$$ is $$\Sigma$$ with $$\hat\gamma$$ (the REML estimate of $$\gamma$$) in place of $$\gamma$$.

Example in Chapter 19
lme(SeedlingWeight ̃Genotype,random= ̃1|Tray,data=d)

Linear mixed-effects model fit by REML
Data: d
Log-restricted-likelihood: -123.5705
Fixed: SeedlingWeight  ̃ Genotype
(Intercept)   Genotype2
15.288838   -3.550201
Random effects:
Formula:  ̃1 | Tray
(Intercept) Residual
StdDev:    3.414856  1.88223
Number of Observations: 56
Number of Groups: 8

lmer(SeedlingWeight ̃Genotype+(1|Tray),data=d)

Linear mixed model fit by REML [’lmerMod’]
Formula: SeedlingWeight  ̃ Genotype + (1 | Tray)
Data: d
REML criterion at convergence: 247.1411
Random effects:
Groups   Name        Std.Dev.
Tray   (Intercept)   3.415
Residual             1.882
Number of obs: 56, groups: Tray, 8
Fixed Effects:
(Intercept)    Genotype2
15.29        -3.55

In SAS, REML is also the default for SAS proc mixed.