Chapter 20 REML Estimation of Variance Components

Consider the GLM: $y = X\beta + \epsilon, \text{ where } \epsilon \sim N(0, \Sigma)$ and $\Sigma$ is an $n\times n$ positive definite variance matrix which depends on an unknown parameter vector $\gamma$ .

Residual/Restricted Maximum Likelihood (REML): an approach that produces unbiased estimators for these special cases and produces less biased estimates than ML in general.

REML Method

Find $n - \text{rank}(X) = n - r$ linearly independent vectors $a_1, \ldots, a_{n-r}$ such that $a_i'X = 0'$ for all $i = 1, \ldots, n-r$ .
Find the MLE of $\gamma$ using $w_1 \equiv a_1'y, \ldots, w_{n-r} \equiv a_{n-r}'y$ as data. $A = [a_1, \ldots, a_{n-r}] \quad w = \begin{bmatrix} w_1 \\ \vdots \\ w_{n-r} \end{bmatrix} = \begin{bmatrix} a_1'y \\ \vdots \\ a_{n-r}'y \end{bmatrix} = A'y$
- If $a'X = 0'$ , $a'y$ is known as error contrast. Thus, $w_1, \ldots, w_{n-r}$ comprise a set of $n-r$ error constrasts
- There exist $n-r$ linearly independent rows of $I - P_X$ to get $a_1, \ldots, a_{n-r}$ .
- The error constrasts $w_1, \ldots, w_{n-r}$ will be a subset of the elements of the residual vector $(I-P_X)y = y - \hat y$

Note that $w = A'y = A'\epsilon \sim N(0, A'\Sigma A)$ and the distribution of $w$ depends only on $\gamma$ .

The log likelihood function in this case is $\ell(\gamma\mid w) = -\frac{1}{2}\log |A'\Sigma A| - \frac{1}{2}w'(A'\Sigma A)^{-1}w - \frac{n-r}{2}\log(2\pi)$ An MLE of $\gamma$ can be found in the general case using numerical methods to obtain the REML estimate of $\gamma$ .

In 611, we show that every set of $n-r$ linearly independent error contrasts yields the same REML estimator of $\gamma$ .

In any case, once a REML estimate of $\gamma$ has been obtained, the BLUE of an estimate $C\beta$ can be approximated by $C\hat\beta_{\hat\Sigma} = C(X'\hat \Sigma^{-1} X)^{-}X' \hat\Sigma^{-1} y$ , where $\hat\Sigma$ is $\Sigma$ with $\hat\gamma$ (the REML estimate of $\gamma$ ) in place of $\gamma$ .

Example in Chapter 19

lme(SeedlingWeight ̃Genotype,random= ̃1|Tray,data=d)

Linear mixed-effects model fit by REML
  Data: d
  Log-restricted-likelihood: -123.5705
  Fixed: SeedlingWeight  ̃ Genotype
(Intercept)   Genotype2
  15.288838   -3.550201
Random effects:
 Formula:  ̃1 | Tray
        (Intercept) Residual
StdDev:    3.414856  1.88223
Number of Observations: 56
Number of Groups: 8


lmer(SeedlingWeight ̃Genotype+(1|Tray),data=d)

Linear mixed model fit by REML [’lmerMod’]
Formula: SeedlingWeight  ̃ Genotype + (1 | Tray)
   Data: d
REML criterion at convergence: 247.1411
Random effects:
 Groups   Name        Std.Dev.
 Tray   (Intercept)   3.415
 Residual             1.882
Number of obs: 56, groups: Tray, 8
Fixed Effects:
(Intercept)    Genotype2
      15.29        -3.55

In SAS, REML is also the default for SAS proc mixed.