9.2 Generalized Linear Mixed Models

GLMMs extend GLMs by incorporating random effects, which allows for modeling data with hierarchical or clustered structures.

The conditional distribution of y_i given the random effects \boldsymbol{\alpha} is:

y_i \mid \boldsymbol{\alpha}_i \sim \text{independent } f(y_i \mid \boldsymbol{\alpha})

where f(y_i \mid \boldsymbol{\alpha}) belongs to the exponential family of distributions:

f(y_i \mid \boldsymbol{\alpha}) = \exp \left( \frac{y_i \theta_i - b(\theta_i)}{a(\phi)} - c(y_i, \phi) \right)

  • \theta_i is the canonical parameter,
  • \phi is the dispersion parameter, and a(\phi) is a known function of it,
  • b(\theta_i) and c(y_i, \phi) are specific functions defining the exponential family.
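
As a concrete instance of this form, the sketch below checks numerically that the Poisson distribution fits it. The choices are assumptions made for illustration and are not stated in the text above: \theta = \log \lambda, b(\theta) = e^{\theta}, a(\phi) = 1, and c(y, \phi) = \log(y!), which is positive here because the density is written with a minus sign in front of c. The function names are hypothetical.

```python
import math

def expfam_density(y, theta, b, c, a_phi=1.0):
    """Evaluate exp((y*theta - b(theta)) / a(phi) - c(y)), the form used in this section."""
    return math.exp((y * theta - b(theta)) / a_phi - c(y))

def poisson_pmf(y, lam):
    """Textbook Poisson pmf: lam^y * exp(-lam) / y!."""
    return lam ** y * math.exp(-lam) / math.factorial(y)

lam = 2.5                      # illustrative Poisson rate (assumed value)
theta = math.log(lam)          # canonical parameter for the Poisson

def b(t):
    return math.exp(t)         # b(theta) = exp(theta)

def c(y):
    return math.lgamma(y + 1)  # c(y, phi) = log(y!), entering with a minus sign

# Pair the exponential-family evaluation with the direct pmf for y = 0..5.
pairs = [(expfam_density(y, theta, b, c), poisson_pmf(y, lam)) for y in range(6)]
for y, (p_expfam, p_direct) in enumerate(pairs):
    print(y, p_expfam, p_direct)
```

The two columns agree up to floating-point error, which suggests the parameterization above is one valid way to write the Poisson in this form.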

The conditional mean of y_i is related to \theta_i by:

\mu_i = \frac{\partial b(\theta_i)}{\partial \theta_i}
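
The relation between the mean and b(\theta_i) can be checked numerically. The sketch below uses a central finite difference, assuming the standard cumulant functions b(\theta) = e^{\theta} for the Poisson and b(\theta) = \log(1 + e^{\theta}) for the Bernoulli. Neither is given in the text above, and the helper name is hypothetical.

```python
import math

def numerical_derivative(f, x, h=1e-6):
    """Central finite-difference approximation to f'(x)."""
    return (f(x + h) - f(x - h)) / (2 * h)

theta = 0.7  # arbitrary value of the canonical parameter (assumed)

# Poisson: b(theta) = exp(theta), so the mean should be exp(theta).
poisson_mu = numerical_derivative(math.exp, theta)

# Bernoulli: b(theta) = log(1 + exp(theta)), so the mean should be the
# inverse logit 1 / (1 + exp(-theta)).
bernoulli_mu = numerical_derivative(lambda t: math.log1p(math.exp(t)), theta)

print(poisson_mu, math.exp(theta))
print(bernoulli_mu, 1.0 / (1.0 + math.exp(-theta)))
```

In both cases the numerical derivative of b matches the known mean, in line with \mu_i = \partial b(\theta_i) / \partial \theta_i.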

Applying a link function g(\cdot), we relate the mean response to both fixed and random effects:

\begin{aligned} E(y_i \mid \boldsymbol{\alpha}) &= \mu_i \\ g(\mu_i) &= \mathbf{x}_i' \boldsymbol{\beta} + \mathbf{z}_i' \boldsymbol{\alpha} \end{aligned}

  • g(\cdot) is a known link function,
  • \mathbf{x}_i and \mathbf{z}_i are the covariate vectors for observation i, that is, the i-th rows of the design matrices for the fixed and random effects, respectively,
  • \boldsymbol{\beta} represents fixed effects, and \boldsymbol{\alpha} represents random effects.
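
To make the linear predictor concrete, here is a minimal simulation sketch of a random-intercept logistic model, g(\mu_i) = \text{logit}(\mu_i). Everything specific in it is an assumption chosen for illustration: the group sizes, the values of \boldsymbol{\beta}, the logit link, and normally distributed random intercepts with standard deviation 0.8.

```python
import numpy as np

rng = np.random.default_rng(0)

n_groups, n_per_group = 5, 4
n = n_groups * n_per_group
group = np.repeat(np.arange(n_groups), n_per_group)  # cluster label per observation

# Fixed-effects design matrix: intercept plus one covariate (rows are x_i').
X = np.column_stack([np.ones(n), rng.normal(size=n)])

# Random-effects design matrix: one indicator column per group (rows are z_i').
Z = np.zeros((n, n_groups))
Z[np.arange(n), group] = 1.0

beta = np.array([-0.5, 1.0])                  # hypothetical fixed effects
alpha = rng.normal(scale=0.8, size=n_groups)  # hypothetical random intercepts

eta = X @ beta + Z @ alpha        # linear predictor g(mu_i) = x_i' beta + z_i' alpha
mu = 1.0 / (1.0 + np.exp(-eta))   # inverse logit gives the conditional mean
y = rng.binomial(1, mu)           # y_i | alpha drawn independently as Bernoulli(mu_i)

print(np.column_stack([group, np.round(mu, 3), y]))
```

Because Z contains group indicators, Z @ alpha adds the same intercept shift to every observation in a cluster, and that shared shift is what induces within-cluster correlation.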

We also specify the distribution of the random effects:

\boldsymbol{\alpha} \sim f(\boldsymbol{\alpha})

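This distribution is often assumed to be multivariate normal, but it can be chosen (subjectively) based on the context. The normality assumption is a modeling convention; large-sample arguments such as the law of large numbers apply to the estimation of the fixed effects, not to the random effects themselves.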
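
As an illustration of the multivariate normal choice, the sketch below draws a random intercept and a random slope per cluster from a zero-mean normal distribution. The covariance matrix G and the number of clusters are hypothetical values picked for the example, not quantities from the text.

```python
import numpy as np

rng = np.random.default_rng(42)

# Hypothetical covariance of (random intercept, random slope).
G = np.array([[1.0, 0.3],
              [0.3, 0.5]])

n_clusters = 50_000  # large only so the sample covariance is close to G

# One row of alpha per cluster: alpha_j ~ N(0, G).
alpha = rng.multivariate_normal(mean=np.zeros(2), cov=G, size=n_clusters)

sample_cov = np.cov(alpha, rowvar=False)
print(np.round(sample_cov, 3))
```

With this many draws the sample covariance should sit close to G. Replacing the sampler with another distribution, for example a heavier-tailed one, is one way to explore how sensitive a model is to the normality assumption.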