Chapter 20 REML Estimation of Variance Components
Consider the GLM: \[ y = X\beta + \epsilon, \text{ where } \epsilon \sim N(0, \Sigma) \] and \(\Sigma\) is an \(n\times n\) positive definite variance matrix which depends on an unknown parameter vector \(\gamma\).
Residual/Restricted Maximum Likelihood (REML): an approach that produces unbiased estimators for these special cases and produces less biased estimates than ML in general.
REML Method
Find \(n - \text{rank}(X) = n - r\) linearly independent vectors \(a_1, \ldots, a_{n-r}\) such that \(a_i'X = 0'\) for all \(i = 1, \ldots, n-r\).
Find the MLE of \(\gamma\) using \(w_1 \equiv a_1'y, \ldots, w_{n-r} \equiv a_{n-r}'y\) as data. \[ A = [a_1, \ldots, a_{n-r}] \quad w = \begin{bmatrix} w_1 \\ \vdots \\ w_{n-r} \end{bmatrix} = \begin{bmatrix} a_1'y \\ \vdots \\ a_{n-r}'y \end{bmatrix} = A'y \]
If \(a'X = 0'\), \(a'y\) is known as error contrast. Thus, \(w_1, \ldots, w_{n-r}\) comprise a set of \(n-r\) error constrasts
There exist \(n-r\) linearly independent rows of \(I - P_X\) to get \(a_1, \ldots, a_{n-r}\).
The error constrasts \(w_1, \ldots, w_{n-r}\) will be a subset of the elements of the residual vector \((I-P_X)y = y - \hat y\)
Note that \(w = A'y = A'\epsilon \sim N(0, A'\Sigma A)\) and the distribution of \(w\) depends only on \(\gamma\).
The log likelihood function in this case is \[ \ell(\gamma\mid w) = -\frac{1}{2}\log |A'\Sigma A| - \frac{1}{2}w'(A'\Sigma A)^{-1}w - \frac{n-r}{2}\log(2\pi) \] An MLE of \(\gamma\) can be found in the general case using numerical methods to obtain the REML estimate of \(\gamma\).
In 611, we show that every set of \(n-r\) linearly independent error contrasts yields the same REML estimator of \(\gamma\).
In any case, once a REML estimate of \(\gamma\) has been obtained, the BLUE of an estimate \(C\beta\) can be approximated by \(C\hat\beta_{\hat\Sigma} = C(X'\hat \Sigma^{-1} X)^{-}X' \hat\Sigma^{-1} y\), where \(\hat\Sigma\) is \(\Sigma\) with \(\hat\gamma\) (the REML estimate of \(\gamma\)) in place of \(\gamma\).
Example in Chapter 19
lme(SeedlingWeight ̃Genotype,random= ̃1|Tray,data=d)
-effects model fit by REML
Linear mixed: d
Data-restricted-likelihood: -123.5705
Log: SeedlingWeight ̃ Genotype
Fixed
(Intercept) Genotype215.288838 -3.550201
:
Random effects: ̃1 | Tray
Formula
(Intercept) Residual: 3.414856 1.88223
StdDev: 56
Number of Observations: 8
Number of Groups
lmer(SeedlingWeight ̃Genotype+(1|Tray),data=d)
Linear mixed model fit by REML [’lmerMod’]: SeedlingWeight ̃ Genotype + (1 | Tray)
Formula: d
Data: 247.1411
REML criterion at convergence:
Random effects
Groups Name Std.Dev.Tray (Intercept) 3.415
1.882
Residual : 56, groups: Tray, 8
Number of obs:
Fixed Effects
(Intercept) Genotype215.29 -3.55
In SAS, REML is also the default for SAS proc mixed
.