Chapter 20 REML Estimation of Variance Components
Consider the GLM: \[ y = X\beta + \epsilon, \text{ where } \epsilon \sim N(0, \Sigma) \] and \(\Sigma\) is an \(n\times n\) positive definite variance matrix which depends on an unknown parameter vector \(\gamma\).
Residual/Restricted Maximum Likelihood (REML): an approach that produces unbiased estimators for these special cases and produces less biased estimates than ML in general.
REML Method
Find \(n - \text{rank}(X) = n - r\) linearly independent vectors \(a_1, \ldots, a_{n-r}\) such that \(a_i'X = 0'\) for all \(i = 1, \ldots, n-r\).
Find the MLE of \(\gamma\) using \(w_1 \equiv a_1'y, \ldots, w_{n-r} \equiv a_{n-r}'y\) as data. \[ A = [a_1, \ldots, a_{n-r}] \quad w = \begin{bmatrix} w_1 \\ \vdots \\ w_{n-r} \end{bmatrix} = \begin{bmatrix} a_1'y \\ \vdots \\ a_{n-r}'y \end{bmatrix} = A'y \]
If \(a'X = 0'\), \(a'y\) is known as error contrast. Thus, \(w_1, \ldots, w_{n-r}\) comprise a set of \(n-r\) error constrasts
There exist \(n-r\) linearly independent rows of \(I - P_X\) to get \(a_1, \ldots, a_{n-r}\).
The error constrasts \(w_1, \ldots, w_{n-r}\) will be a subset of the elements of the residual vector \((I-P_X)y = y - \hat y\)
Note that \(w = A'y = A'\epsilon \sim N(0, A'\Sigma A)\) and the distribution of \(w\) depends only on \(\gamma\).
The log likelihood function in this case is \[ \ell(\gamma\mid w) = -\frac{1}{2}\log |A'\Sigma A| - \frac{1}{2}w'(A'\Sigma A)^{-1}w - \frac{n-r}{2}\log(2\pi) \] An MLE of \(\gamma\) can be found in the general case using numerical methods to obtain the REML estimate of \(\gamma\).
In 611, we show that every set of \(n-r\) linearly independent error contrasts yields the same REML estimator of \(\gamma\).
In any case, once a REML estimate of \(\gamma\) has been obtained, the BLUE of an estimate \(C\beta\) can be approximated by \(C\hat\beta_{\hat\Sigma} = C(X'\hat \Sigma^{-1} X)^{-}X' \hat\Sigma^{-1} y\), where \(\hat\Sigma\) is \(\Sigma\) with \(\hat\gamma\) (the REML estimate of \(\gamma\)) in place of \(\gamma\).
Example in Chapter 19
lme(SeedlingWeight ̃Genotype,random= ̃1|Tray,data=d)
Linear mixed-effects model fit by REML
Data: d
Log-restricted-likelihood: -123.5705
Fixed: SeedlingWeight ̃ Genotype
(Intercept) Genotype2
15.288838 -3.550201
Random effects:
Formula: ̃1 | Tray
(Intercept) Residual
StdDev: 3.414856 1.88223
Number of Observations: 56
Number of Groups: 8
lmer(SeedlingWeight ̃Genotype+(1|Tray),data=d)
Linear mixed model fit by REML [’lmerMod’]
Formula: SeedlingWeight ̃ Genotype + (1 | Tray)
Data: d
REML criterion at convergence: 247.1411
Random effects:
Groups Name Std.Dev.
Tray (Intercept) 3.415
Residual 1.882
Number of obs: 56, groups: Tray, 8
Fixed Effects:
(Intercept) Genotype2
15.29 -3.55In SAS, REML is also the default for SAS proc mixed.