9.4 Marginal Properties of GLMMs
9.4.1 Marginal Mean of y_i
The marginal mean is obtained by integrating over the distribution of the random effects:
E(y_i) = E_{\boldsymbol{\alpha}}(E(y_i \mid \boldsymbol{\alpha})) = E_{\boldsymbol{\alpha}}(\mu_i) = E\left(g^{-1}(\mathbf{x}_i' \boldsymbol{\beta} + \mathbf{z}_i' \boldsymbol{\alpha})\right)
Since g^{-1}(\cdot) is nonlinear, this expectation generally cannot be simplified further without specific distributional assumptions about \boldsymbol{\alpha}; in particular, E(y_i) \neq g^{-1}(\mathbf{x}_i' \boldsymbol{\beta}) in general.
9.4.1.1 Special Case: Log Link Function
For a log-link function, g(\mu) = \log(\mu), the inverse link is the exponential function:
E(y_i) = E\left(\exp(\mathbf{x}_i' \boldsymbol{\beta} + \mathbf{z}_i' \boldsymbol{\alpha})\right)
Since \mathbf{x}_i' \boldsymbol{\beta} is nonrandom, the exponential factors, and the remaining expectation is a moment-generating function (MGF):
E(y_i) = \exp(\mathbf{x}_i' \boldsymbol{\beta}) \cdot E\left(\exp(\mathbf{z}_i' \boldsymbol{\alpha})\right)
Here, E(\exp(\mathbf{z}_i' \boldsymbol{\alpha})) is the MGF of \boldsymbol{\alpha} evaluated at \mathbf{z}_i.
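For example, if the random effects are assumed multivariate normal, \boldsymbol{\alpha} \sim N(\mathbf{0}, \mathbf{D}), the normal MGF gives E\left(\exp(\mathbf{z}_i' \boldsymbol{\alpha})\right) = \exp\left(\tfrac{1}{2} \mathbf{z}_i' \mathbf{D} \mathbf{z}_i\right), so that
E(y_i) = \exp\left(\mathbf{x}_i' \boldsymbol{\beta} + \tfrac{1}{2} \mathbf{z}_i' \mathbf{D} \mathbf{z}_i\right)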
9.4.2 Marginal Variance of y_i
The variance decomposition formula applies:
\begin{aligned} \operatorname{Var}(y_i) &= \operatorname{Var}_{\boldsymbol{\alpha}}\left(E(y_i \mid \boldsymbol{\alpha})\right) + E_{\boldsymbol{\alpha}}\left(\operatorname{Var}(y_i \mid \boldsymbol{\alpha})\right) \\ &= \operatorname{Var}(\mu_i) + E\left(a(\phi) V(\mu_i)\right) \end{aligned}
Expressed explicitly:
\operatorname{Var}(y_i) = \operatorname{Var}\left(g^{-1}(\mathbf{x}_i' \boldsymbol{\beta} + \mathbf{z}_i' \boldsymbol{\alpha})\right) + E\left(a(\phi) V\left(g^{-1}(\mathbf{x}_i' \boldsymbol{\beta} + \mathbf{z}_i' \boldsymbol{\alpha})\right)\right)
Without specific assumptions about g(\cdot) and the distribution of \boldsymbol{\alpha}, this is the most general form.
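As a sanity check on these moment formulas, the following is a minimal simulation sketch for the special case of a Poisson conditional distribution with a log link and a normal random intercept; the values of \mathbf{x}_i' \boldsymbol{\beta} and \sigma_{\alpha} are illustrative assumptions, not taken from the text.

```python
import numpy as np

rng = np.random.default_rng(0)

# Illustrative values (assumptions for this sketch)
eta_fixed = 0.5       # x_i' beta, the fixed-effect part of the linear predictor
sigma_alpha = 0.8     # sd of the normal random intercept alpha_i

# Simulate the marginal distribution of y_i:
# y_i | alpha_i ~ Poisson(exp(eta_fixed + alpha_i)),  alpha_i ~ N(0, sigma_alpha^2)
n_sim = 1_000_000
alpha = rng.normal(0.0, sigma_alpha, size=n_sim)
mu = np.exp(eta_fixed + alpha)
y = rng.poisson(mu)

# Closed forms implied by the normal MGF (Poisson case: a(phi) V(mu) = mu):
# E(y_i)   = exp(eta_fixed + sigma_alpha^2 / 2)
# Var(y_i) = E(y_i) + E(y_i)^2 * (exp(sigma_alpha^2) - 1)
m = np.exp(eta_fixed + sigma_alpha**2 / 2)
v = m + m**2 * (np.exp(sigma_alpha**2) - 1)

print("marginal mean:     simulated %.3f  closed form %.3f" % (y.mean(), m))
print("marginal variance: simulated %.3f  closed form %.3f" % (y.var(), v))
```

Because a(\phi) V(\mu) = \mu for the Poisson family, the closed-form variance is the marginal mean plus an extra term coming from \operatorname{Var}(\mu_i), i.e., the random effects induce overdispersion relative to the Poisson.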
9.4.3 Marginal Covariance of \mathbf{y}
Random effects induce correlation between observations within the same cluster. For i \neq j, the covariance between y_i and y_j is:
\begin{aligned} \operatorname{Cov}(y_i, y_j) &= \operatorname{Cov}_{\boldsymbol{\alpha}}\left(E(y_i \mid \boldsymbol{\alpha}), E(y_j \mid \boldsymbol{\alpha})\right) + E_{\boldsymbol{\alpha}}\left(\operatorname{Cov}(y_i, y_j \mid \boldsymbol{\alpha})\right) \\ &= \operatorname{Cov}(\mu_i, \mu_j) + E(0) \\ &= \operatorname{Cov}\left(g^{-1}(\mathbf{x}_i' \boldsymbol{\beta} + \mathbf{z}_i' \boldsymbol{\alpha}), g^{-1}(\mathbf{x}_j' \boldsymbol{\beta} + \mathbf{z}_j' \boldsymbol{\alpha})\right) \end{aligned}
The second term vanishes because y_i and y_j are assumed conditionally independent given \boldsymbol{\alpha}, so any marginal dependence between them is induced entirely by the shared random effects. This dependence structure is a hallmark of mixed models.
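For instance, under the log link with normal random effects \boldsymbol{\alpha} \sim N(\mathbf{0}, \mathbf{D}), the MGF result from Section 9.4.1.1 gives the closed form:
\operatorname{Cov}(y_i, y_j) = E(y_i)\, E(y_j) \left(\exp(\mathbf{z}_i' \mathbf{D} \mathbf{z}_j) - 1\right)
which vanishes when \mathbf{z}_i' \mathbf{D} \mathbf{z}_j = 0, for example when the two observations belong to different clusters and share no random effects.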
Example: Repeated Measurements with a Poisson GLMM
Consider repeated count measurements for subjects:
- Let y_{ij} be the j-th count for subject i.
- Assume y_{ij} \mid \alpha_i \sim \text{Poisson}(\mu_{ij}), conditionally independent across j given \alpha_i.
The model is specified as:
\log(\mu_{ij}) = \mathbf{x}_{ij}' \boldsymbol{\beta} + \alpha_i
where:
- \alpha_i \sim \text{i.i.d. } N(0, \sigma^2_{\alpha}) represents subject-specific random effects,
- This is a log-link GLMM with random intercepts for subjects.
The inclusion of \alpha_i accounts for subject-level heterogeneity, capturing unobserved variability across individuals.
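To make the induced dependence concrete, here is a minimal simulation sketch of this random-intercept Poisson model; the values of \beta_0, \beta_1, \sigma_{\alpha}, and the covariate are illustrative assumptions, not values from the text. It compares the empirical within-subject covariance with the closed form E(y_{i1}) E(y_{i2}) (\exp(\sigma^2_{\alpha}) - 1) implied by the normal MGF.

```python
import numpy as np

rng = np.random.default_rng(1)

# Illustrative values (assumptions for this sketch)
beta0, beta1 = 0.2, 0.3     # fixed effects: intercept and one covariate
sigma_alpha = 0.7           # sd of the subject-specific random intercept
n_subjects, n_reps = 200_000, 2

# One covariate value per repeated measurement (same schedule for every subject)
x = np.array([0.0, 1.0])

# Simulate y_ij | alpha_i ~ Poisson(exp(beta0 + beta1 * x_j + alpha_i))
alpha = rng.normal(0.0, sigma_alpha, size=(n_subjects, 1))
mu = np.exp(beta0 + beta1 * x + alpha)      # shape (n_subjects, n_reps)
y = rng.poisson(mu)

# Within-subject covariance induced by the shared random intercept:
# Cov(y_i1, y_i2) = E(y_i1) * E(y_i2) * (exp(sigma_alpha^2) - 1)
m = np.exp(beta0 + beta1 * x + sigma_alpha**2 / 2)   # marginal means E(y_ij)
cov_theory = m[0] * m[1] * (np.exp(sigma_alpha**2) - 1)
cov_sim = np.cov(y[:, 0], y[:, 1])[0, 1]

print("within-subject covariance: simulated %.3f  closed form %.3f" % (cov_sim, cov_theory))
print("empirical within-subject correlation: %.3f" % np.corrcoef(y[:, 0], y[:, 1])[0, 1])
```

Although the counts are conditionally independent Poisson given \alpha_i, the shared random intercept makes repeated measurements on the same subject positively correlated marginally, exactly as the covariance formula in Section 9.4.3 predicts.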