9.2 Generalized Linear Mixed Models

Generalized linear mixed models (GLMMs) extend generalized linear models (GLMs) by incorporating random effects, which allows them to model data with hierarchical or clustered structure.

The conditional distribution of \(y_i\) given the random effects \(\boldsymbol{\alpha}\) is:

\[ y_i \mid \boldsymbol{\alpha} \sim \text{independent } f(y_i \mid \boldsymbol{\alpha}) \]

where \(f(y_i \mid \boldsymbol{\alpha})\) belongs to the exponential family of distributions:

\[ f(y_i \mid \boldsymbol{\alpha}) = \exp \left( \frac{y_i \theta_i - b(\theta_i)}{a(\phi)} - c(y_i, \phi) \right) \]

  • \(\theta_i\) is the canonical parameter,
  • \(\phi\) is a dispersion parameter, and \(a(\phi)\) is a known function of it,
  • \(b(\theta_i)\) and \(c(y_i, \phi)\) are specific functions defining the exponential family.
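
As a worked instance of this form (an illustration, not an example taken from the text above), consider a conditionally Poisson response with mean \(\mu_i\). Setting \(\theta_i = \log \mu_i\), \(b(\theta_i) = e^{\theta_i}\), \(a(\phi) = 1\), and \(c(y_i, \phi) = \log y_i!\) recovers the Poisson mass function under the sign convention used here:

\[ f(y_i \mid \boldsymbol{\alpha}) = \exp \left( \frac{y_i \theta_i - e^{\theta_i}}{1} - \log y_i! \right) = \frac{\mu_i^{y_i} e^{-\mu_i}}{y_i!} \]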

The conditional mean of \(y_i\) is related to \(\theta_i\) by:

\[ \mu_i = \frac{\partial b(\theta_i)}{\partial \theta_i} \]
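
The identity above can be checked numerically. The sketch below is only an illustration: it assumes two standard cases, the Poisson with \(b(\theta) = e^{\theta}\) and the Bernoulli with \(b(\theta) = \log(1 + e^{\theta})\), and compares a finite-difference derivative of \(b\) with the known mean. The function names and the value of `theta` are hypothetical choices made for this example.

```python
import math


def numerical_derivative(b, theta, h=1e-6):
    """Central finite-difference approximation of db/dtheta."""
    return (b(theta + h) - b(theta - h)) / (2 * h)


def b_poisson(theta):
    # Cumulant function of the Poisson family: b(theta) = exp(theta)
    return math.exp(theta)


def b_bernoulli(theta):
    # Cumulant function of the Bernoulli family: b(theta) = log(1 + exp(theta))
    return math.log1p(math.exp(theta))


theta = 0.3  # arbitrary illustrative value of the canonical parameter

# Known closed-form means expressed in terms of theta
poisson_mean = math.exp(theta)                 # mu = exp(theta)
bernoulli_mean = 1.0 / (1.0 + math.exp(-theta))  # mu = logistic(theta)

# Means recovered as the derivative of b
poisson_mean_from_b = numerical_derivative(b_poisson, theta)
bernoulli_mean_from_b = numerical_derivative(b_bernoulli, theta)

print(poisson_mean, poisson_mean_from_b)
print(bernoulli_mean, bernoulli_mean_from_b)
```

In both cases the two printed values should agree up to finite-difference error, which is what \(\mu_i = \partial b(\theta_i) / \partial \theta_i\) predicts.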

Applying a link function \(g(\cdot)\), we relate the mean response to both fixed and random effects:

\[ \begin{aligned} E(y_i \mid \boldsymbol{\alpha}) &= \mu_i \\ g(\mu_i) &= \mathbf{x}_i' \boldsymbol{\beta} + \mathbf{z}_i' \boldsymbol{\alpha} \end{aligned} \]

  • \(g(\cdot)\) is a known link function,
  • \(\mathbf{x}_i\) and \(\mathbf{z}_i\) are the covariate vectors for observation \(i\) (the \(i\)th rows of the fixed- and random-effects design matrices, respectively),
  • \(\boldsymbol{\beta}\) represents fixed effects, and \(\boldsymbol{\alpha}\) represents random effects.

We also specify the distribution of the random effects:

\[ \boldsymbol{\alpha} \sim f(\boldsymbol{\alpha}) \]

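This distribution is often assumed to be multivariate normal, but it can be chosen (subjectively) based on the context. Large-sample results such as the law of large numbers apply to the estimation of the fixed effects; they do not by themselves justify normality of the random effects.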
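
To make the three ingredients concrete (conditional distribution, link function, and random-effects distribution), the following minimal sketch simulates data from one particular GLMM. Everything specific in it is an assumption made for illustration: a Bernoulli response with logit link, a single normally distributed random intercept per cluster (so \(\mathbf{z}_i' \boldsymbol{\alpha}\) reduces to that cluster's intercept), one covariate, and the hypothetical values of `beta` and `sigma_alpha`.

```python
import math
import random


def simulate_random_intercept_logit(n_clusters, n_per_cluster, beta, sigma_alpha, seed=0):
    """Simulate binary responses from a random-intercept logistic GLMM.

    Returns a list of tuples (cluster, x, alpha, mu, y).
    """
    rng = random.Random(seed)
    rows = []
    for cluster in range(n_clusters):
        # Random effect: alpha ~ N(0, sigma_alpha^2), shared within the cluster
        alpha = rng.gauss(0.0, sigma_alpha)
        for _ in range(n_per_cluster):
            x = rng.uniform(-1.0, 1.0)  # illustrative covariate
            # Linear predictor: g(mu) = x' beta + z' alpha, with z = 1
            eta = beta[0] + beta[1] * x + alpha
            # Inverse logit link gives the conditional mean
            mu = 1.0 / (1.0 + math.exp(-eta))
            # Conditional distribution: y | alpha ~ Bernoulli(mu)
            y = 1 if rng.random() < mu else 0
            rows.append((cluster, x, alpha, mu, y))
    return rows


data = simulate_random_intercept_logit(
    n_clusters=5, n_per_cluster=4, beta=(-0.5, 1.0), sigma_alpha=1.0
)
for row in data[:4]:
    print(row)
```

Observations in the same cluster share one draw of the random intercept, which is what induces within-cluster correlation; conditional on that draw, the responses are generated independently.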