9.4 Marginal Properties of GLMMs
9.4.1 Marginal Mean of \(y_i\)
The marginal mean is obtained by integrating over the distribution of the random effects:
\[ E(y_i) = E_{\boldsymbol{\alpha}}(E(y_i \mid \boldsymbol{\alpha})) = E_{\boldsymbol{\alpha}}(\mu_i) = E\left(g^{-1}(\mathbf{x}_i' \boldsymbol{\beta} + \mathbf{z}_i' \boldsymbol{\alpha})\right) \]
Because \(g^{-1}(\cdot)\) is nonlinear, the expectation cannot be moved inside the inverse link: in general \(E(y_i) \neq g^{-1}(\mathbf{x}_i' \boldsymbol{\beta})\), so no further simplification is possible without specific distributional assumptions on \(\boldsymbol{\alpha}\).
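Numerically, the marginal mean can be approximated by Monte Carlo integration over the random-effects distribution. The sketch below is a minimal illustration, assuming a logit link, a scalar normal random intercept, and made-up values for \(\mathbf{x}_i' \boldsymbol{\beta}\) and \(\sigma_{\alpha}\); it contrasts \(E(y_i)\) with the naive plug-in value \(g^{-1}(\mathbf{x}_i' \boldsymbol{\beta})\).

```python
import numpy as np

rng = np.random.default_rng(0)

# Assumed (illustrative) quantities: fixed-effect part of the linear
# predictor and the standard deviation of a normal random intercept.
xb = 0.5
sigma_alpha = 1.2

def inv_logit(eta):
    """Inverse logit link g^{-1}(eta) = 1 / (1 + exp(-eta))."""
    return 1.0 / (1.0 + np.exp(-eta))

# Monte Carlo approximation of E[ g^{-1}(x'beta + alpha) ]
alpha = rng.normal(0.0, sigma_alpha, size=1_000_000)
marginal_mean = inv_logit(xb + alpha).mean()

print(f"E(y_i) by Monte Carlo: {marginal_mean:.4f}")
print(f"Naive g^-1(x'beta):    {inv_logit(xb):.4f}")  # differs: g^-1 is nonlinear
```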
9.4.1.1 Special Case: Log Link Function
For a log-link function, \(g(\mu) = \log(\mu)\), the inverse link is the exponential function:
\[ E(y_i) = E\left(\exp(\mathbf{x}_i' \boldsymbol{\beta} + \mathbf{z}_i' \boldsymbol{\alpha})\right) \]
Because \(\mathbf{x}_i' \boldsymbol{\beta}\) is fixed (nonrandom), it factors out of the expectation:
\[ E(y_i) = \exp(\mathbf{x}_i' \boldsymbol{\beta}) \cdot E\left(\exp(\mathbf{z}_i' \boldsymbol{\alpha})\right) \]
The remaining factor, \(E(\exp(\mathbf{z}_i' \boldsymbol{\alpha}))\), is the moment-generating function (MGF) of \(\boldsymbol{\alpha}\) evaluated at \(\mathbf{z}_i\).
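If, in addition, the random effects are assumed to be normally distributed, say \(\boldsymbol{\alpha} \sim N(\mathbf{0}, \mathbf{D})\) with covariance matrix \(\mathbf{D}\), the MGF has the closed form \(E(\exp(\mathbf{t}' \boldsymbol{\alpha})) = \exp\left(\tfrac{1}{2} \mathbf{t}' \mathbf{D} \mathbf{t}\right)\), and the marginal mean simplifies to
\[ E(y_i) = \exp\left(\mathbf{x}_i' \boldsymbol{\beta} + \tfrac{1}{2} \mathbf{z}_i' \mathbf{D} \mathbf{z}_i\right), \]
which exceeds \(\exp(\mathbf{x}_i' \boldsymbol{\beta})\) whenever the random effects have positive variance.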
9.4.2 Marginal Variance of \(y_i\)
The variance decomposition formula applies:
\[ \begin{aligned} \operatorname{Var}(y_i) &= \operatorname{Var}_{\boldsymbol{\alpha}}\left(E(y_i \mid \boldsymbol{\alpha})\right) + E_{\boldsymbol{\alpha}}\left(\operatorname{Var}(y_i \mid \boldsymbol{\alpha})\right) \\ &= \operatorname{Var}(\mu_i) + E\left(a(\phi) V(\mu_i)\right) \end{aligned} \]
Expressed explicitly:
\[ \operatorname{Var}(y_i) = \operatorname{Var}\left(g^{-1}(\mathbf{x}_i' \boldsymbol{\beta} + \mathbf{z}_i' \boldsymbol{\alpha})\right) + E\left(a(\phi) V\left(g^{-1}(\mathbf{x}_i' \boldsymbol{\beta} + \mathbf{z}_i' \boldsymbol{\alpha})\right)\right) \]
Without specific assumptions about \(g(\cdot)\) and the distribution of \(\boldsymbol{\alpha}\), this is the most general form.
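As a concrete case, assume a Poisson response with log link and a single normal random intercept, \(\alpha_i \sim N(0, \sigma^2_{\alpha})\) (the model used in the example below). Then \(a(\phi) V(\mu_i) = \mu_i\) and \(\mu_i = \exp(\mathbf{x}_i' \boldsymbol{\beta} + \alpha_i)\), so writing \(m_i = E(y_i) = \exp(\mathbf{x}_i' \boldsymbol{\beta} + \sigma^2_{\alpha}/2)\),
\[ \operatorname{Var}(y_i) = \operatorname{Var}(\mu_i) + E(\mu_i) = m_i^2\left(e^{\sigma^2_{\alpha}} - 1\right) + m_i . \]
The marginal variance exceeds the marginal mean, so the random effect induces overdispersion relative to a pure Poisson model.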
9.4.3 Marginal Covariance of \(\mathbf{y}\)
Random effects induce correlation between observations within the same cluster. The covariance between \(y_i\) and \(y_j\) is:
\[ \begin{aligned} \operatorname{Cov}(y_i, y_j) &= \operatorname{Cov}_{\boldsymbol{\alpha}}\left(E(y_i \mid \boldsymbol{\alpha}), E(y_j \mid \boldsymbol{\alpha})\right) + E_{\boldsymbol{\alpha}}\left(\operatorname{Cov}(y_i, y_j \mid \boldsymbol{\alpha})\right) \\ &= \operatorname{Cov}(\mu_i, \mu_j) + E(0) \\ &= \operatorname{Cov}\left(g^{-1}(\mathbf{x}_i' \boldsymbol{\beta} + \mathbf{z}_i' \boldsymbol{\alpha}), g^{-1}(\mathbf{x}_j' \boldsymbol{\beta} + \mathbf{z}_j' \boldsymbol{\alpha})\right) \end{aligned} \]
The second term vanishes because \(y_i\) and \(y_j\) are assumed conditionally independent given \(\boldsymbol{\alpha}\); the marginal dependence that remains is induced entirely by the shared random effects and is a hallmark of mixed models.
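To make the induced dependence concrete, suppose \(y_i\) and \(y_j\) share a single normal random intercept under a log link, \(\mu_i = \exp(\mathbf{x}_i' \boldsymbol{\beta} + \alpha)\) with \(\alpha \sim N(0, \sigma^2_{\alpha})\) (the setting of the example below). With \(m_i = \exp(\mathbf{x}_i' \boldsymbol{\beta} + \sigma^2_{\alpha}/2)\) as above,
\[ \operatorname{Cov}(y_i, y_j) = \operatorname{Cov}(\mu_i, \mu_j) = e^{(\mathbf{x}_i + \mathbf{x}_j)' \boldsymbol{\beta}} \left(e^{2\sigma^2_{\alpha}} - e^{\sigma^2_{\alpha}}\right) = m_i m_j \left(e^{\sigma^2_{\alpha}} - 1\right), \]
which is strictly positive whenever \(\sigma^2_{\alpha} > 0\).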
Example: Repeated Measurements with a Poisson GLMM
Consider repeated count measurements for subjects:
- Let \(y_{ij}\) be the \(j\)-th count for subject \(i\).
- Assume the counts are conditionally independent given the random effect: \(y_{ij} \mid \alpha_i \sim \text{Poisson}(\mu_{ij})\).
The model is specified as:
\[ \log(\mu_{ij}) = \mathbf{x}_{ij}' \boldsymbol{\beta} + \alpha_i \]
where:
- \(\alpha_i \sim \text{i.i.d. } N(0, \sigma^2_{\alpha})\) represents subject-specific random effects.

This is a log-link GLMM with a random intercept for each subject.
The random intercept \(\alpha_i\) captures unobserved subject-level heterogeneity and, by the results above, induces both overdispersion in each \(y_{ij}\) and positive correlation among repeated measurements on the same subject.
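A short simulation can make these marginal properties visible. The sketch below uses arbitrary illustrative values for \(\boldsymbol{\beta}\), \(\sigma_{\alpha}\), and the design (not estimates from any real data) and checks the simulated marginal mean, variance, and within-subject covariance against the closed-form expressions derived above.

```python
import numpy as np

rng = np.random.default_rng(42)

# Illustrative (assumed) parameter values and design: two repeated
# measurements per subject with covariate values x = 0 and x = 1.
n_subjects = 200_000
beta0, beta1 = 0.3, 0.6
sigma_alpha = 0.8
x = np.array([0.0, 1.0])

# Subject-specific random intercepts, shared across a subject's measurements.
alpha = rng.normal(0.0, sigma_alpha, size=(n_subjects, 1))

# Conditional means and Poisson counts, shape (n_subjects, 2).
mu = np.exp(beta0 + beta1 * x + alpha)
y = rng.poisson(mu)

# Closed-form marginal moments for comparison.
m = np.exp(beta0 + beta1 * x + sigma_alpha**2 / 2)
var_closed = m + m**2 * (np.exp(sigma_alpha**2) - 1)
cov_closed = m[0] * m[1] * (np.exp(sigma_alpha**2) - 1)

print("marginal mean:       sim", y.mean(axis=0), "  closed form", m)
print("marginal variance:   sim", y.var(axis=0), "  closed form", var_closed)
print("within-subject cov:  sim", np.cov(y[:, 0], y[:, 1])[0, 1], "  closed form", cov_closed)
```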