Module 2 Cheat Sheet

Overview

Generalized Linear Mixed Models (GLMMs) are used for correlated or nested data (e.g., longitudinal or clustered data) when outcomes are non-normal.
GLMMs combine three ingredients:
- Fixed effects, which remain constant across all units.
- Random effects, which are allowed to vary across units.
- Non-normal outcome data.
Multilevel/Hierarchical Models
- Reparametrization of random effects in a nested structure.

1. Generalized Linear Models (Module 2A)

Overview

GLMs extend linear regression to outcomes in the exponential family (e.g., Bernoulli, Binomial, Poisson).
Three parts: random component (distribution of $Y$), systematic component ($\eta_i = x_i' \beta$), link $g(\mu_i)=\eta_i$.
No explicit $\sigma^2$ noise term in the basic GLM form.

Exponential Family Form

The exponential family density has the form:

\[ f_{\theta}(y) = \exp\left[ \frac{y\theta - b(\theta)}{a(\phi)} + c(y, \phi) \right] \]

where $a, b, c$ are known functions, and $\phi$ is a known scale parameter. The ONLY parameter is $\theta$.
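
As a worked instance of this form, the Poisson and Bernoulli densities can be rewritten so that $\theta$, $b(\theta)$, $a(\phi)$ and $c(y,\phi)$ can be read off directly. The symbols $\mu$ (Poisson mean) and $p$ (Bernoulli success probability) are notation chosen here for illustration.

\[ \text{Poisson: } f(y) = \frac{\mu^y e^{-\mu}}{y!} = \exp\left[ y\log\mu - \mu - \log y! \right], \quad \theta = \log\mu,\ b(\theta) = e^{\theta},\ a(\phi) = 1,\ c(y,\phi) = -\log y! \]

\[ \text{Bernoulli: } f(y) = p^y (1-p)^{1-y} = \exp\left[ y\log\frac{p}{1-p} + \log(1-p) \right], \quad \theta = \log\frac{p}{1-p},\ b(\theta) = \log(1+e^{\theta}),\ a(\phi) = 1,\ c(y,\phi) = 0 \]

In both cases $b'(\theta)$ recovers the mean ($e^{\theta} = \mu$ and $e^{\theta}/(1+e^{\theta}) = p$), and the canonical parameter $\theta$ corresponds to the log and logit links, respectively.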

Core model

$E(Y_i|X_i)=\mu_i,\quad g(\mu_i)=x_i'\beta$.
Examples: Bernoulli–logit, Binomial–logit, Poisson–log.

Logistic regression (Bernoulli)

Link: $g(p)=\log\!\left(\frac{p}{1-p}\right)$.
Interpret $\beta_j$: change in log-odds per 1-unit increase in $X_j$. For a binary exposure $x_i$, $e^{\beta_j} = \frac{p(y_i=1|x_i=1)\,/\,p(y_i=0|x_i=1)}{p(y_i=1|x_i=0)\,/\,p(y_i=0|x_i=0)} = \frac{\text{odds(Exposed)}}{\text{odds(Unexposed)}}$ is the odds ratio (OR).

Inference

Wald test: $Z=\hat\beta_j/\text{SE}(\hat\beta_j)$ is used to evaluate significance of individual regression coefficients.
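LRT (nested models): $\lambda=-2[l(\text{reduced})-l(\text{full})]\sim\chi^2_d$ approximately under the reduced model, where $d$ is the number of parameters dropped from the full model; commonly used to compare model fits.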
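
A minimal sketch of both tests in plain Python, assuming the coefficient, standard error and log-likelihoods have already been extracted from fitted models. All numeric inputs are hypothetical, and the LRT helper covers only the case where the models differ by one parameter (so the reference distribution is $\chi^2_1$).

```python
import math

def wald_test(beta_hat, se):
    """Return the Wald z statistic and its two-sided normal p-value."""
    z = beta_hat / se
    p_value = math.erfc(abs(z) / math.sqrt(2.0))  # equals 2 * (1 - Phi(|z|))
    return z, p_value

def lrt_one_df(loglik_reduced, loglik_full):
    """Return the LRT statistic and its chi-square p-value for 1 degree of freedom."""
    lam = max(-2.0 * (loglik_reduced - loglik_full), 0.0)
    p_value = math.erfc(math.sqrt(lam / 2.0))  # P(chi2_1 > lam)
    return lam, p_value

# Hypothetical inputs, for illustration only.
z, p_wald = wald_test(beta_hat=0.41, se=0.12)
lam, p_lrt = lrt_one_df(loglik_reduced=-512.3, loglik_full=-506.9)
print(f"Wald z = {z:.2f}, p = {p_wald:.4f}")
print(f"LRT lambda = {lam:.2f}, p = {p_lrt:.4f}")
```

For models that differ by more than one parameter, the p-value would need the general chi-square survival function from a statistics library instead of the one-degree-of-freedom shortcut used here.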

PTB example (logit)

Significant: male (OR ≈ 1.07), tobacco (OR ≈ 1.51).
Baseline probability (female, age 25, non-smoker): $\approx 0.08$.
Obtain CIs for ORs by exponentiating the endpoints of the coefficient CIs (do not exponentiate the SEs).
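
The arithmetic behind these summaries can be sketched as follows. The baseline probability and the two ORs are the values quoted in this example; the standard error is hypothetical, and combining the ORs by multiplication assumes a model with no interaction terms.

```python
import math

def prob_from_odds(odds):
    """Convert odds back to a probability."""
    return odds / (1.0 + odds)

baseline_prob = 0.08              # female, age 25, non-smoker
or_male, or_tobacco = 1.07, 1.51  # odds ratios quoted in the example

baseline_odds = baseline_prob / (1.0 - baseline_prob)
# Odds ratios multiply on the odds scale (assumes no interaction terms).
odds_male_smoker = baseline_odds * or_male * or_tobacco
prob_male_smoker = prob_from_odds(odds_male_smoker)
print(f"P(outcome | male, smoker, age 25) = {prob_male_smoker:.3f}")

def or_confidence_interval(beta_hat, se, z=1.96):
    """Exponentiate the endpoints of the coefficient CI to get the OR CI."""
    lower, upper = beta_hat - z * se, beta_hat + z * se
    return math.exp(lower), math.exp(upper)

# Hypothetical standard error; the coefficient is log(1.51), the tobacco OR.
ci_low, ci_high = or_confidence_interval(beta_hat=math.log(1.51), se=0.05)
print(f"OR 95% CI: ({ci_low:.2f}, {ci_high:.2f})")
```

Note that the resulting interval is not symmetric around the OR, which is expected because the symmetry holds on the log-odds scale.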

Assumptions

Correct link; independent observations; correct mean–variance relationship for the chosen family (e.g., $E(Y)=p, \text{Var}(Y)=p(1-p)$ for Bernoulli).

2. Poisson Regression (Module 2B)

When to use

Counts $y_i\in\{0,1,2,\dots\}$ without a fixed number of trials.
Poisson assumes mean = variance: $E(y_i)=\text{Var}(y_i)=\theta_i$.
Commonly used for rare diseases to assess rates.

Model & link

$y_i\sim \text{Poisson}(\theta_i),\quad \log(\theta_i)=x_i'\beta$.
$e^{\beta_j} = \exp(\beta_j)$ = incidence rate ratio (IRR).

Offsets (rates)

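Model rates via $\log(\theta_i)=\log(N_i) + x_i'\beta$ with offset $\log(N_i)$; equivalently, $\log(\theta_i/N_i)=x_i'\beta$, so the coefficients act on the rate $\theta_i/N_i$.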
The offset captures the “population-at-risk” or the denominator of the rate.
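
A minimal sketch of how the offset enters the linear predictor, using hypothetical coefficients and population sizes. It checks that, even when the two groups have different populations at risk, the ratio of rates equals $e^{\beta_1}$.

```python
import math

def expected_count(population, x, beta):
    """Expected count theta_i, from log(theta_i) = log(N_i) + x_i' beta."""
    linear_predictor = sum(x_k * b_k for x_k, b_k in zip(x, beta))
    return math.exp(math.log(population) + linear_predictor)

# Hypothetical coefficients: intercept and one binary exposure.
beta = [-7.0, 0.3]
unexposed = expected_count(population=50_000, x=[1.0, 0.0], beta=beta)
exposed = expected_count(population=20_000, x=[1.0, 1.0], beta=beta)

# Dividing by the population at risk turns expected counts into rates.
rate_unexposed = unexposed / 50_000
rate_exposed = exposed / 20_000
print(f"expected counts: {unexposed:.1f} (unexposed), {exposed:.1f} (exposed)")
print(f"rate ratio = {rate_exposed / rate_unexposed:.3f}, exp(beta_1) = {math.exp(beta[1]):.3f}")
```

The offset has its coefficient fixed at 1, which is why it is added to the linear predictor rather than estimated.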

Interpretation workflow

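One-unit increase in $X_j$ (other covariates held fixed): IRR $= e^{\beta_j} = \theta_i^*/\theta_i$, where $\theta_i^*$ is the expected count (or rate) after the increase and $\theta_i$ the value before it.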

Binomial vs Poisson (quick distinction)

Binomial/Logit: bounded successes out of $n_i$.
Poisson/Log: unbounded event counts (often with exposure/rate).

Assumptions

Independent counts; correct link; mean=variance (watch for overdispersion).

3. Generalized Linear Mixed Models (Module 2C)

Motivation

Outcomes from exponential family with clustering (repeated measures, groups).
GLMs assume independence → can understate SEs. GLMMs add random effects to model within-cluster correlation.

GLMM form

$y_{ij}\sim \text{Dist}(p_{ij}),\quad g(p_{ij})=\beta_0 + x_{ij}'\beta + \theta_i, \quad \theta_i\sim N(0,\tau^2)$, for observation $j$ in cluster $i$.

Likelihood sketch

$L(\beta,\tau^2|y) = \prod_i \int \big(\prod_j [y_{ij}|\theta_i]\big)[\theta_i]\,d\theta_i$ (requires numerical approximation: Laplace/adaptive quadrature).
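
To make the integral concrete, here is a minimal sketch that approximates one cluster's contribution to the likelihood for a random-intercept logistic model. It uses a plain midpoint rule over $\theta_i$, a deliberately simple stand-in for Laplace or adaptive quadrature and not what fitting software does. The data, fixed-effect linear predictors and $\tau$ are hypothetical.

```python
import math

def inv_logit(eta):
    return 1.0 / (1.0 + math.exp(-eta))

def normal_pdf(x, sd):
    return math.exp(-0.5 * (x / sd) ** 2) / (sd * math.sqrt(2.0 * math.pi))

def cluster_marginal_likelihood(y, eta_fixed, tau, n_grid=2001, width=8.0):
    """Approximate the integral of prod_j [y_ij | theta] * [theta] over theta.

    y         : list of 0/1 outcomes for one cluster
    eta_fixed : fixed-effect linear predictors (beta_0 + x_ij' beta), one per outcome
    tau       : standard deviation of the random intercept
    """
    lo, hi = -width * tau, width * tau
    step = (hi - lo) / n_grid
    total = 0.0
    for k in range(n_grid):
        theta = lo + (k + 0.5) * step  # midpoint of the k-th slice
        conditional = 1.0
        for y_ij, eta in zip(y, eta_fixed):
            p = inv_logit(eta + theta)
            conditional *= p if y_ij == 1 else (1.0 - p)
        total += conditional * normal_pdf(theta, tau) * step
    return total

# Hypothetical cluster with three binary outcomes.
y_cluster = [1, 0, 1]
eta_cluster = [0.2, -0.1, 0.4]
lik = cluster_marginal_likelihood(y_cluster, eta_cluster, tau=1.0)
print(f"marginal likelihood contribution: {lik:.5f}")
```

The full likelihood would multiply such contributions across clusters (or sum their logs) and be maximized over $\beta$ and $\tau^2$.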

Crossover example (logit)

Compare GLM (independence) vs GLMM (random intercept for subject).
GLMM estimates are often larger in magnitude, with larger SEs; $\tau^2$ captures subject heterogeneity.
Predicted probabilities differ because GLM coefficients have a population-averaged interpretation, while GLMM coefficients are subject-specific (conditional on the random effect).
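
The gap between the two interpretations can be illustrated numerically. The sketch below, with a hypothetical linear predictor and random-intercept SD, compares the probability for a subject with $\theta_i = 0$ against the probability averaged over $\theta_i\sim N(0,\tau^2)$, using a simple midpoint rule for the average.

```python
import math

def inv_logit(eta):
    return 1.0 / (1.0 + math.exp(-eta))

def marginal_probability(eta, tau, n_grid=4001, width=8.0):
    """Average inv_logit(eta + theta) over theta ~ N(0, tau^2) with a midpoint rule."""
    lo = -width * tau
    step = 2.0 * width * tau / n_grid
    total = 0.0
    for k in range(n_grid):
        theta = lo + (k + 0.5) * step
        density = math.exp(-0.5 * (theta / tau) ** 2) / (tau * math.sqrt(2.0 * math.pi))
        total += inv_logit(eta + theta) * density * step
    return total

eta, tau = 1.0, 2.0  # hypothetical linear predictor and random-intercept SD
subject_specific = inv_logit(eta)                     # subject with theta_i = 0
population_averaged = marginal_probability(eta, tau)  # averaged over subjects
print(f"subject-specific:    {subject_specific:.3f}")
print(f"population-averaged: {population_averaged:.3f}")
```

The averaged probability is pulled toward 0.5, which is consistent with GLMM coefficients tending to be larger in magnitude than their GLM counterparts.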

Poisson GLMM example (Ohio lung cancer)

Random-intercept Poisson with offset(log population).
Interpret fixed effects as IRRs; random-effect SD reflects county heterogeneity.
Practical note: rescale continuous predictors when faced with identifiability warnings.

Model comparison (review)

Nested: LRT; AIC/BIC for relative fit (same data, same likelihood type).

Overdispersion note

If Poisson mean $\neq$ variance, quasi-Poisson can adjust SEs via a scale parameter.
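
One common way to estimate that scale parameter is the Pearson chi-square statistic divided by the residual degrees of freedom; the sketch below assumes that estimator. The counts, fitted means and Poisson SE are hypothetical and do not come from an actual fit.

```python
import math

def pearson_dispersion(y, mu, n_params):
    """Pearson chi-square divided by residual degrees of freedom (n - p)."""
    chi2 = sum((y_i - mu_i) ** 2 / mu_i for y_i, mu_i in zip(y, mu))
    return chi2 / (len(y) - n_params)

# Hypothetical observed counts and fitted Poisson means from a 2-parameter model.
y = [0, 3, 1, 9, 2, 12, 4, 0]
mu = [1.5, 2.0, 2.5, 4.0, 3.0, 6.0, 4.5, 1.0]

phi_hat = pearson_dispersion(y, mu, n_params=2)
poisson_se = 0.10                           # hypothetical SE from the Poisson fit
quasi_se = poisson_se * math.sqrt(phi_hat)  # SE inflated by sqrt(dispersion)
print(f"dispersion = {phi_hat:.2f}, quasi-Poisson SE = {quasi_se:.3f}")
```

A dispersion estimate near 1 is consistent with the Poisson assumption; values well above 1 indicate overdispersion. Point estimates are unchanged under quasi-Poisson, only the SEs are rescaled.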

4. Multilevel/Hierarchical Models (Module 2D)

Motivation

Nested data (years within students within schools) → multiple correlation layers.
Need random effects at each level.

Three-level LMM

$y_{ijk} = X_{ijk}\beta + u_i + u_{ij} + \epsilon_{ijk}$, where $u_i\sim N(0,\nu^2)$ (school), $u_{ij}\sim N(0,\tau^2)$ (student within school), and $\epsilon_{ijk}\sim N(0,\sigma^2)$ (residual).

Pooling (partial pooling)

Across schools: shrink toward overall mean with weight $w_i=\frac{n_i}{n_i+\sigma^2/\nu^2}$.
Across students (within schools): $w_{ij}=\frac{n_{ij}}{n_{ij}+\sigma^2/\tau^2}$.
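
A minimal sketch of the school-level weight in action, assuming the partially pooled estimate is the weighted average $w_i\,\bar{y}_i + (1-w_i)\,\bar{y}$ of the school mean and the overall mean. The variance components and means are hypothetical.

```python
def pooling_weight(n, sigma2, group_var):
    """Weight n / (n + sigma^2 / group_var) placed on the group's own mean."""
    return n / (n + sigma2 / group_var)

def partially_pooled_mean(group_mean, overall_mean, weight):
    """Weighted average of the group mean and the overall mean (assumed form)."""
    return weight * group_mean + (1.0 - weight) * overall_mean

# Hypothetical variance components and means.
sigma2, nu2 = 40.0, 10.0
overall_mean = 50.0
for n_i, school_mean in [(5, 60.0), (50, 60.0), (500, 60.0)]:
    w_i = pooling_weight(n_i, sigma2, nu2)
    estimate = partially_pooled_mean(school_mean, overall_mean, w_i)
    print(f"n_i = {n_i:3d}: w_i = {w_i:.3f}, pooled estimate = {estimate:.2f}")
```

Small schools are shrunk strongly toward the overall mean, while large schools keep estimates close to their own mean. The same function applies to the student-level weight by passing $\tau^2$ as `group_var`.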

Variance decomposition & ICCs

Total variance: $\nu^2+\tau^2+\sigma^2$.

\[ \text{ICC}_{\text{school}}=\frac{\nu^2}{\nu^2+\tau^2+\sigma^2},\quad \text{ICC}_{\text{student}}=\frac{\nu^2+\tau^2}{\nu^2+\tau^2+\sigma^2}. \]
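
These two ratios are straightforward to compute once the variance components are available. The sketch below uses hypothetical values for $\nu^2$, $\tau^2$ and $\sigma^2$.

```python
def iccs(nu2, tau2, sigma2):
    """Return school-level and student-level ICCs from the three variance components."""
    total = nu2 + tau2 + sigma2
    return {
        "school": nu2 / total,            # same school, different students
        "student": (nu2 + tau2) / total,  # same student within the same school
    }

# Hypothetical variance components (school, student, residual).
result = iccs(nu2=10.0, tau2=30.0, sigma2=60.0)
print(f"ICC_school = {result['school']:.2f}, ICC_student = {result['student']:.2f}")
```

The student-level ICC is never smaller than the school-level ICC, because two measurements on the same student share both the school and the student random effects.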

Interpretation (school math example)

Report fixed effects (e.g., year trend) and random-effect SDs at school, student, and residual levels.
Compute ICCs from fitted variance components.

Fitting

Random intercepts: (1 | school/child); summarize fixed effects; extract VarCorr; compute ICCs.