7.5 Negative Binomial Regression
When modeling count data, Poisson regression assumes that the mean and variance are equal:
Var(Yi)=E(Yi)=μi However, in many real-world datasets, the variance exceeds the mean—a phenomenon known as overdispersion. When overdispersion is present, the Poisson model underestimates the variance, leading to:
Inflated test statistics (small p-values).
Overconfident predictions.
Poor model fit.
7.5.1 Negative Binomial Distribution
To address overdispersion, Negative Binomial (NB) regression introduces an extra dispersion parameter θ to allow variance to be greater than the mean: Var(Yi)=μi+θμ2i where:
μi=exp(x′iθ) is the expected count.
θ is the dispersion parameter.
When θ→0, the NB model reduces to the Poisson model.
Thus, Negative Binomial regression is a generalization of Poisson regression that accounts for overdispersion.
7.5.2 Application: Negative Binomial Regression
We apply Negative Binomial regression to the bioChemists
dataset to model the number of research articles (Num_Article
) as a function of several predictors.
7.5.2.1 Fitting the Negative Binomial Model
# Load necessary package
library(MASS)
# Fit Negative Binomial model
NegBinom_Mod <- MASS::glm.nb(Num_Article ~ ., data = bioChemists)
# Model summary
summary(NegBinom_Mod)
#>
#> Call:
#> MASS::glm.nb(formula = Num_Article ~ ., data = bioChemists, init.theta = 2.264387695,
#> link = log)
#>
#> Deviance Residuals:
#> Min 1Q Median 3Q Max
#> -2.1678 -1.3617 -0.2806 0.4476 3.4524
#>
#> Coefficients:
#> Estimate Std. Error z value Pr(>|z|)
#> (Intercept) 0.256144 0.137348 1.865 0.062191 .
#> SexWomen -0.216418 0.072636 -2.979 0.002887 **
#> MarriedMarried 0.150489 0.082097 1.833 0.066791 .
#> Num_Kid5 -0.176415 0.052813 -3.340 0.000837 ***
#> PhD_Quality 0.015271 0.035873 0.426 0.670326
#> Num_MentArticle 0.029082 0.003214 9.048 < 2e-16 ***
#> ---
#> Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
#>
#> (Dispersion parameter for Negative Binomial(2.2644) family taken to be 1)
#>
#> Null deviance: 1109.0 on 914 degrees of freedom
#> Residual deviance: 1004.3 on 909 degrees of freedom
#> AIC: 3135.9
#>
#> Number of Fisher Scoring iterations: 1
#>
#>
#> Theta: 2.264
#> Std. Err.: 0.271
#>
#> 2 x log-likelihood: -3121.917
Interpretation:
The coefficients are on the log scale.
The dispersion parameter θ (also called size parameter in some contexts) is estimated as 2.264 with a standard error of 0.271. Check Over-Dispersion for more detail.
Since θ is significantly different from 1, this confirms overdispersion, validating the choice of the Negative Binomial model over Poisson regression.
7.5.2.2 Model Comparison: Poisson vs. Negative Binomial
7.5.2.2.1 Checking Overdispersion in Poisson Model
Before using NB regression, we confirm overdispersion by computing:
ˆϕ=deviancedegrees of freedom
# Overdispersion check for Poisson model
Poisson_Mod$deviance / Poisson_Mod$df.residual
#> [1] 1.797988
If ˆϕ>1, overdispersion is present.
A large value suggests that Poisson regression underestimates variance.
7.5.2.2.2 Likelihood Ratio Test: Poisson vs. Negative Binomial
We compare the Poisson and Negative Binomial models using a likelihood ratio test, where: G2=2×(logLNB−logLPoisson) with df=1
# Likelihood ratio test between Poisson and Negative Binomial
pchisq(2 * (logLik(NegBinom_Mod) - logLik(Poisson_Mod)),
df = 1,
lower.tail = FALSE)
#> 'log Lik.' 4.391728e-41 (df=7)
Small p-value (< 0.05) → Negative Binomial model is significantly better.
Large p-value (> 0.05) → Poisson model is adequate.
Since overdispersion is confirmed, the Negative Binomial model is preferred.
7.5.2.3 Model Diagnostics and Evaluation
7.5.2.4 Predictions and Rate Ratios
In Negative Binomial regression, exponentiating the coefficients gives rate ratios:
# Convert coefficients to rate ratios
data.frame(`Odds Ratios` = exp(coef(NegBinom_Mod)))
#> Odds.Ratios
#> (Intercept) 1.2919388
#> SexWomen 0.8053982
#> MarriedMarried 1.1624030
#> Num_Kid5 0.8382698
#> PhD_Quality 1.0153884
#> Num_MentArticle 1.0295094
A rate ratio of:
> 1 → Increases expected article count.
< 1 → Decreases expected article count.
= 1 → No effect.
For example:
If
PhD_Quality
has an exponentiated coefficient of 1.5, individuals from higher-quality PhD programs are expected to publish 50% more articles.If
Sex
has an exponentiated coefficient of 0.8, females publish 20% fewer articles than males, all else equal.
7.5.2.5 Alternative Approach: Zero-Inflated Models
If a dataset has excess zeros (many individuals publish no articles), Zero-Inflated Negative Binomial (ZINB) models may be required.
P(Yi=0)=p+(1−p)f(Yi=0|μ,θ)
where:
p is the probability of always being a zero (e.g., inactive researchers).
f(Yi) follows the Negative Binomial distribution.
7.5.3 Fitting a Zero-Inflated Negative Binomial Model
# Load package for zero-inflated models
library(pscl)
# Fit ZINB model
ZINB_Mod <- zeroinfl(Num_Article ~ ., data = bioChemists, dist = "negbin")
# Model summary
summary(ZINB_Mod)
#>
#> Call:
#> zeroinfl(formula = Num_Article ~ ., data = bioChemists, dist = "negbin")
#>
#> Pearson residuals:
#> Min 1Q Median 3Q Max
#> -1.2942 -0.7601 -0.2909 0.4448 6.4155
#>
#> Count model coefficients (negbin with log link):
#> Estimate Std. Error z value Pr(>|z|)
#> (Intercept) 0.4167466 0.1435964 2.902 0.00371 **
#> SexWomen -0.1955076 0.0755926 -2.586 0.00970 **
#> MarriedMarried 0.0975826 0.0844520 1.155 0.24789
#> Num_Kid5 -0.1517321 0.0542061 -2.799 0.00512 **
#> PhD_Quality -0.0006998 0.0362697 -0.019 0.98461
#> Num_MentArticle 0.0247862 0.0034927 7.097 1.28e-12 ***
#> Log(theta) 0.9763577 0.1354696 7.207 5.71e-13 ***
#>
#> Zero-inflation model coefficients (binomial with logit link):
#> Estimate Std. Error z value Pr(>|z|)
#> (Intercept) -0.19161 1.32280 -0.145 0.88483
#> SexWomen 0.63587 0.84890 0.749 0.45382
#> MarriedMarried -1.49944 0.93866 -1.597 0.11017
#> Num_Kid5 0.62841 0.44277 1.419 0.15583
#> PhD_Quality -0.03773 0.30801 -0.123 0.90250
#> Num_MentArticle -0.88227 0.31622 -2.790 0.00527 **
#> ---
#> Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
#>
#> Theta = 2.6548
#> Number of iterations in BFGS optimization: 27
#> Log-likelihood: -1550 on 13 Df
This model accounts for:
Structural zero inflation.
Overdispersion.
ZINB is often preferred when many observations are zero. However, since ZINB does not fall under the GLM framework, we will discuss it further in Nonlinear and Generalized Linear Mixed Models.
Why ZINB is Not a GLM?
Unlike GLMs, which assume a single response distribution from the exponential family, ZINB is a mixture model with two components:
Count model – A negative binomial regression for the main count process.
Inflation model – A logistic regression for excess zeros.
Because ZINB combines two distinct processes rather than using a single exponential family distribution, it does not fit within the standard GLM framework.
What ZINB Belongs To
ZINB is part of finite mixture models and is sometimes considered within generalized linear mixed models (GLMMs) or semi-parametric models.