Module 2 Cheat Sheet
Overview
- Generalized Linear Mixed Models (GLMMs) are used for correlated or nested data (e.g., longitudinal or clustered data) when outcomes are non-normal.
- GLMMs combine fixed and random effects:
- Fixed effects remain constant across all units
- Random effects are allowed to vary across units.
- Non-normal outcome data
- Multilevel/Hierarchical Models
- Reparametrization of random effects in a nested structure.
1. Generalized Linear Models (Module 2A)
Overview
- GLMs extend linear regression to outcomes in the exponential family (e.g., Bernoulli, Binomial, Poisson).
- Three parts: random component (distribution of \(Y\)), systematic component (\(\eta_i = x_i' \beta\)), link \(g(\mu_i)=\eta_i\).
- No explicit \(\sigma^2\) noise term in the basic GLM form.
Exponential Family Form
- The exponential family form is given by:
\[ f_{\theta}(y) = \exp\left[\{y\theta - b(\theta)\} / a(\phi) + c(y, \phi)\right] \]
where \(a, b, c\) are known functions, and \(\phi\) is a known scale parameter. The ONLY parameter is \(\theta\).
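As a quick check of the form above, the sketch below writes the Poisson pmf both in its usual form and in exponential family form, taking \(\theta=\log\mu\), \(b(\theta)=e^{\theta}\), \(a(\phi)=1\), and \(c(y,\phi)=-\log(y!)\). The function names and the mean value 2.5 are illustrative assumptions, not part of the module.

```python
import math

def poisson_pmf(y, mu):
    """Poisson probability mass function in its usual form."""
    return math.exp(-mu) * mu ** y / math.factorial(y)

def exp_family_pmf(y, mu):
    """Same pmf written as exp[{y*theta - b(theta)} / a(phi) + c(y, phi)]."""
    theta = math.log(mu)        # canonical parameter
    b = math.exp(theta)         # b(theta) = e^theta
    a = 1.0                     # a(phi) = 1 for the Poisson
    c = -math.lgamma(y + 1)     # c(y, phi) = -log(y!)
    return math.exp((y * theta - b) / a + c)

# Illustrative mean (an assumption); the two columns should agree.
for y in range(5):
    print(y, poisson_pmf(y, 2.5), exp_family_pmf(y, 2.5))
```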
Core model
- \(E(Y_i|X_i)=\mu_i,\quad g(\mu_i)=x_i'\beta\).
- Examples: Bernoulli–logit, Binomial–logit, Poisson–log.
Logistic regression (Bernoulli)
- Link: \(g(p)=\log\!\left(\frac{p}{1-p}\right)\).
- Interpret \(\beta_j\): change in log-odds per 1-unit increase in \(X_j\). For a binary exposure \(x_i\), \(e^{\beta_j} = \frac{p(y_i=1\mid x_i=1)/p(y_i=0\mid x_i=1)}{p(y_i=1\mid x_i=0)/p(y_i=0\mid x_i=0)} = \frac{\text{odds(Exposed)}}{\text{odds(Unexposed)}}\) is the odds ratio (OR).
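The link and the odds-ratio interpretation can be verified numerically. This is a minimal sketch with hypothetical coefficients (`beta0`, `beta1` are made up for illustration): it converts the linear predictor to probabilities and confirms that the ratio of odds equals \(e^{\beta_1}\).

```python
import math

def expit(eta):
    """Inverse logit: maps a log-odds value to a probability."""
    return 1.0 / (1.0 + math.exp(-eta))

def odds(p):
    """Odds corresponding to probability p."""
    return p / (1.0 - p)

# Hypothetical intercept and exposure coefficient (assumptions).
beta0, beta1 = -2.0, 0.4

p_unexposed = expit(beta0)           # x = 0
p_exposed = expit(beta0 + beta1)     # x = 1

odds_ratio = odds(p_exposed) / odds(p_unexposed)
print(odds_ratio, math.exp(beta1))   # the two values should match
```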
Inference
- Wald test: \(Z=\hat\beta_j/\text{SE}(\hat\beta_j)\) is used to evaluate significance of individual regression coefficients.
- LRT (nested models): \(\lambda=-2[l(\text{reduced})-l(\text{full})]\sim\chi^2_q\), where \(q\) is the number of additional parameters in the full model; commonly used to compare model fits.
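Both tests reduce to a few lines of arithmetic. The sketch below uses only the standard library; the LRT helper assumes the models differ by a single parameter (1 degree of freedom), and the function names are illustrative assumptions.

```python
import math

def wald_p_value(beta_hat, se):
    """Wald Z statistic and its two-sided normal p-value."""
    z = beta_hat / se
    # 2 * (1 - Phi(|z|)) written with the complementary error function
    return z, math.erfc(abs(z) / math.sqrt(2.0))

def lrt_p_value_df1(loglik_reduced, loglik_full):
    """LRT statistic and p-value, assuming the models differ by ONE parameter."""
    lam = -2.0 * (loglik_reduced - loglik_full)
    # For a chi-square with 1 df: P(X > lam) = erfc(sqrt(lam / 2))
    return lam, math.erfc(math.sqrt(max(lam, 0.0) / 2.0))
```

For models that differ by more than one parameter, a chi-square survival function with the matching degrees of freedom would be needed instead.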
PTB example (logit)
- Significant: male (OR ≈ 1.07), tobacco (OR ≈ 1.51).
- Baseline probability (female, age 25, non-smoker): \(\approx 0.08\).
- CIs on ORs by exponentiating the coefficient CIs (don’t exponentiate SEs).
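The CI rule above can be sketched as follows. The coefficient is back-calculated from the reported tobacco OR of about 1.51, but the standard error is a hypothetical value chosen only for illustration.

```python
import math

def or_with_ci(beta_hat, se, z=1.96):
    """Odds ratio with a Wald CI: build the CI on the log-odds scale, then exponentiate."""
    lower = beta_hat - z * se
    upper = beta_hat + z * se
    return math.exp(beta_hat), math.exp(lower), math.exp(upper)

beta_tobacco = math.log(1.51)   # coefficient implied by OR of about 1.51
se_tobacco = 0.05               # HYPOTHETICAL standard error (assumption)

or_hat, or_low, or_high = or_with_ci(beta_tobacco, se_tobacco)
print(or_hat, or_low, or_high)
```

Note that the interval is symmetric around the coefficient on the log scale but not around the OR itself.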
Assumptions
- Correct link; independent observations; correct mean–variance for chosen family (e.g., \(E(Y)=p, \text{Var}(Y)=p(1-p)\) for Bernoulli).
2. Poisson Regression (Module 2B)
When to use
- Counts \(y_i\in\{0,1,2,\dots\}\) without a fixed number of trials.
- Poisson assumes mean = variance: \(E(y_i)=\text{Var}(y_i)=\theta_i\).
- Commonly used for rare diseases to assess rates.
Model & link
- \(y_i\sim \text{Poisson}(\theta_i),\quad \log(\theta_i)=x_i'\beta\).
- \(e^{\beta_j} = \exp(\beta_j)\) = incidence rate ratio (IRR).
Offsets (rates)
- Model rates via \(\log(\theta_i)=\log(N_i) + x_i'\beta\) with offset \(\log(N_i)\), so that the rate is \(\theta_i/N_i = \exp(x_i'\beta)\).
- The offset captures the “population-at-risk” or the denominator of the rate.
Interpretation workflow
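- One-unit increase in \(X_j\): IRR \(= e^{\beta_j} = \theta_i^*/\theta_i\), where \(\theta_i^*\) is the expected count after the increase with all other covariates held fixed.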
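The offset and IRR statements above can be checked with a small calculation. This is a sketch under assumed values: the coefficients and the population of 50,000 are hypothetical.

```python
import math

def expected_count(population, x, beta):
    """Poisson mean theta_i with offset log(N_i): log(theta_i) = log(N_i) + x_i' beta."""
    eta = math.log(population) + sum(xj * bj for xj, bj in zip(x, beta))
    return math.exp(eta)

# Hypothetical coefficients: intercept and one covariate (assumptions).
beta = [-6.0, 0.3]

theta = expected_count(50_000, [1.0, 2.0], beta)        # covariate = 2
theta_star = expected_count(50_000, [1.0, 3.0], beta)   # covariate = 3

rate = theta / 50_000       # events per person; does not depend on N_i
irr = theta_star / theta    # ratio of expected counts after a one-unit increase
print(rate, irr, math.exp(beta[1]))
```

Because the offset enters with a coefficient fixed at 1, doubling the population doubles the expected count but leaves the rate and the IRR unchanged.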
Binomial vs Poisson (quick distinction)
- Binomial/Logit: bounded successes out of \(n_i\).
- Poisson/Log: unbounded event counts (often with exposure/rate).
Assumptions
- Independent counts; correct link; mean=variance (watch for overdispersion).
3. Generalized Linear Mixed Models (Module 2C)
Motivation
- Outcomes from exponential family with clustering (repeated measures, groups).
- GLMs assume independence → can understate SEs. GLMMs add random effects to model within-cluster correlation.
GLMM form
- \(y_{ij}\sim \text{Dist}(p_{ij}),\quad g(p_{ij})=\beta_0 + x_{ij}'\beta + \theta_i, \quad \theta_i\sim N(0,\tau^2)\).
Likelihood sketch
- \(L(\beta,\tau^2|y) = \prod_i \int \big(\prod_j [y_{ij}|\theta_i]\big)[\theta_i]\,d\theta_i\) (requires numerical approximation: Laplace/adaptive quadrature).
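To make the integral concrete, the sketch below evaluates one cluster's contribution for a random-intercept logistic model. It uses a plain midpoint rule on a fixed grid, which is a simpler stand-in for the Laplace or adaptive quadrature methods named above; the function names, grid size, and integration width are illustrative assumptions.

```python
import math

def expit(eta):
    """Inverse logit."""
    return 1.0 / (1.0 + math.exp(-eta))

def normal_pdf(t, tau):
    """Density of the random intercept, theta_i ~ N(0, tau^2)."""
    return math.exp(-0.5 * (t / tau) ** 2) / (tau * math.sqrt(2.0 * math.pi))

def cluster_likelihood(y, eta_fixed, tau, n_grid=2000, width=8.0):
    """Integrate prod_j [y_ij | theta_i] * [theta_i] over theta_i (midpoint rule).

    y         : list of 0/1 outcomes for one cluster
    eta_fixed : fixed-effect linear predictor for each observation
    """
    lo, hi = -width * tau, width * tau
    step = (hi - lo) / n_grid
    total = 0.0
    for k in range(n_grid):
        t = lo + (k + 0.5) * step               # grid midpoint for theta_i
        cond = 1.0
        for y_ij, eta in zip(y, eta_fixed):     # conditional Bernoulli likelihood
            p = expit(eta + t)
            cond *= p if y_ij == 1 else 1.0 - p
        total += cond * normal_pdf(t, tau) * step
    return total

def log_likelihood(clusters, tau):
    """Sum of log cluster contributions; clusters is a list of (y, eta_fixed) pairs."""
    return sum(math.log(cluster_likelihood(y, eta, tau)) for y, eta in clusters)
```

A fitting routine would maximize `log_likelihood` over both the fixed effects (which determine `eta_fixed`) and `tau`; that optimization step is not shown here.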
Crossover example (logit)
- Compare GLMs (independence) vs GLMM (random intercept for subject).
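- GLMM coefficient estimates are often larger in magnitude, with larger SEs; \(\tau^2\) captures subject heterogeneity.
- Predicted probabilities differ because GLM coefficients have a population-averaged interpretation, while GLMM coefficients are subject-specific (conditional on the random effect).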
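The gap between the two interpretations can be illustrated by averaging the subject-specific probability over the random-intercept distribution. This is a sketch with hypothetical values for the linear predictor and \(\tau\), using a simple midpoint rule for the integral.

```python
import math

def expit(eta):
    """Inverse logit."""
    return 1.0 / (1.0 + math.exp(-eta))

def marginal_probability(eta, tau, n_grid=2000, width=8.0):
    """Average expit(eta + theta) over theta ~ N(0, tau^2) with a midpoint rule."""
    lo = -width * tau
    step = 2.0 * width * tau / n_grid
    total = 0.0
    for k in range(n_grid):
        t = lo + (k + 0.5) * step
        density = math.exp(-0.5 * (t / tau) ** 2) / (tau * math.sqrt(2.0 * math.pi))
        total += expit(eta + t) * density * step
    return total

# Hypothetical fixed-effect linear predictor and random-intercept SD (assumptions).
eta, tau = 1.0, 1.5

subject_specific = expit(eta)                       # a subject with theta_i = 0
population_averaged = marginal_probability(eta, tau)  # averaged over all subjects
print(subject_specific, population_averaged)
```

The averaged probability is pulled toward 0.5 relative to the subject-specific one, which is consistent with GLMM coefficients tending to be larger in magnitude than their GLM counterparts.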
Poisson GLMM example (Ohio lung cancer)
- Random-intercept Poisson with offset(log population).
- Interpret fixed effects as IRRs; random-effect SD reflects county heterogeneity.
- Practical note: rescale continuous predictors when faced with identifiability warnings.
Model comparison (review)
- Nested: LRT; AIC/BIC for relative fit (same data, same likelihood type).
Overdispersion note
- If Poisson mean \(\neq\) variance, quasi-Poisson can adjust SEs via a scale parameter.
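One common way to estimate the scale parameter is the Pearson chi-square statistic divided by the residual degrees of freedom. The sketch below assumes that estimator; the observed counts, fitted means, and standard error are hypothetical numbers chosen for illustration.

```python
import math

def pearson_dispersion(y, mu, n_params):
    """Pearson chi-square divided by residual df; values well above 1 suggest overdispersion."""
    chi2 = sum((yi - mi) ** 2 / mi for yi, mi in zip(y, mu))
    return chi2 / (len(y) - n_params)

# HYPOTHETICAL observed counts and fitted Poisson means from a 2-parameter model.
y = [0, 3, 1, 8, 2, 12]
mu = [2.0, 2.0, 3.0, 4.0, 4.0, 5.0]

phi_hat = pearson_dispersion(y, mu, n_params=2)

# Quasi-Poisson keeps the coefficient estimates and inflates each SE by sqrt(phi_hat).
se_poisson = 0.10                       # hypothetical Poisson-model SE
se_quasi = se_poisson * math.sqrt(phi_hat)
print(phi_hat, se_quasi)
```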
4. Multilevel/Hierarchical Models (Module 2D)
Motivation
- Nested data (years within students within schools) → multiple correlation layers.
- Need random effects at each level.
Three-level LMM
- \(y_{ijk} = X_{ijk}\beta + u_i + u_{ij} + \epsilon_{ijk}\), with \(u_i\sim N(0,\nu^2)\) (school), \(u_{ij}\sim N(0,\tau^2)\) (student within school), \(\epsilon_{ijk}\sim N(0,\sigma^2)\).
Pooling (partial pooling)
- Across schools: shrink toward overall mean with weight \(w_i=\frac{n_i}{n_i+\sigma^2/\nu^2}\).
- Across students (within schools): \(w_{ij}=\frac{n_{ij}}{n_{ij}+\sigma^2/\tau^2}\).
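The pooling weights above are simple to compute. This sketch assumes the usual partial-pooling form in which the weight multiplies the group's own mean and the remainder goes to the overall mean; the variance components, sample sizes, and means are hypothetical.

```python
def pooling_weight(n, sigma2, group_var):
    """Weight n / (n + sigma^2 / group_var) placed on the group's own mean."""
    return n / (n + sigma2 / group_var)

def partially_pooled_mean(group_mean, overall_mean, weight):
    """Shrink the group mean toward the overall mean (assumed weighted-average form)."""
    return weight * group_mean + (1.0 - weight) * overall_mean

# HYPOTHETICAL variance components: residual sigma^2 and school-level nu^2.
sigma2, nu2 = 4.0, 1.0

w_small = pooling_weight(10, sigma2, nu2)    # small school: more shrinkage
w_large = pooling_weight(200, sigma2, nu2)   # large school: little shrinkage

print(w_small, partially_pooled_mean(60.0, 50.0, w_small))
print(w_large, partially_pooled_mean(60.0, 50.0, w_large))
```

The same function gives the student-level weight by passing \(\tau^2\) as `group_var` and \(n_{ij}\) as `n`.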
Variance decomposition & ICCs
- Total variance: \(\nu^2+\tau^2+\sigma^2\).
\[ \text{ICC}_{\text{school}}=\frac{\nu^2}{\nu^2+\tau^2+\sigma^2},\quad \text{ICC}_{\text{student}}=\frac{\nu^2+\tau^2}{\nu^2+\tau^2+\sigma^2}. \]
Interpretation (school math example)
- Report fixed effects (e.g., year trend) and random-effect SDs at school, student, and residual levels.
- Compute ICCs from fitted variance components.
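Given fitted variance components, the two ICCs follow directly from the formulas above. The values passed in below are hypothetical, not results from the school math example.

```python
def iccs(nu2, tau2, sigma2):
    """School-level and student-level ICCs from three-level variance components."""
    total = nu2 + tau2 + sigma2
    return {
        "school": nu2 / total,             # share of variance between schools
        "student": (nu2 + tau2) / total,   # share shared by repeated measures on one student
    }

# HYPOTHETICAL variance components (school, student-within-school, residual).
result = iccs(nu2=1.0, tau2=2.0, sigma2=1.0)
print(result)
```

Fitted software output often reports standard deviations rather than variances, in which case each SD would need to be squared before being passed in.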
Fitting
- Random intercepts: (1 | school/child); summarize fixed effects; extract VarCorr; compute ICCs.