6.21 Log-binomial regression to estimate a risk ratio or prevalence ratio
Logistic regression is a special case of a family of models known as generalized linear models. Each member of this family has an assumed distribution for the outcome and a link function that connects the mean outcome to a linear combination of predictors \(\beta_0 + \beta_1 X_1 + \beta_2 X_2 + \ldots + \beta_K X_K\) (the linear predictor). In logistic regression, the outcome is assumed to have a binomial distribution and the link function is the logit function \(\ln(p/(1-p))\). Linear regression is also a special case, with a normal distribution and an identity link function (the mean is assumed to be equal to the linear predictor).
Another special case of a generalized linear model is the log-binomial regression model which, like logistic regression, assumes a binomial distribution for a binary outcome but, unlike logistic regression, uses a log link function as shown in Equation (6.2).
\[\begin{equation} \ln{p} = \beta_0 + \beta_1 X_1 + \beta_2 X_2 + \ldots + \beta_K X_K \tag{6.2} \end{equation}\]
With logistic regression, the left-hand side is the log of the odds, whereas in log-binomial regression it is the log of the probability. Exponentiating a regression coefficient in logistic regression results in an odds ratio. Similarly, exponentiating a regression coefficient in log-binomial regression results in a risk ratio (RR) or prevalence ratio (PR): for a predictor \(X_k\), the RR or PR associated with a one-unit difference in \(X_k\), holding the other predictors fixed, is \(e^{\beta_k}\). The model described by Equation (6.2) can be used to estimate an RR from incidence data or a PR from prevalence data.
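To see why, here is a short derivation from Equation (6.2). It compares two individuals who differ by one unit in \(X_k\) and have identical values of all other predictors, and it assumes the model contains no interaction involving \(X_k\). The notation \(p(x)\) for the probability when \(X_k = x\) is introduced here only for illustration.

\[\begin{aligned} \ln p(x+1) - \ln p(x) &= \beta_k \\ \frac{p(x+1)}{p(x)} &= e^{\beta_k} \end{aligned}\]

The ratio of probabilities on the left-hand side of the second line is the RR (or PR), and under these assumptions it does not depend on \(x\) or on the values of the other predictors.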
A disadvantage of log-binomial regression is that the left-hand side \((\ln{p})\) is constrained to be non-positive (because \(p \le 1\)), while the right-hand side can be anything from \(-\infty\) to \(\infty\). This leads to convergence issues at times (Williamson, Eliasziw, and Fick 2013). One method for fitting a log-binomial model is to use glm() with family = binomial(link = "log"). Alternatively, use the logbin() function in the logbin package (Donoghoe and Marschner 2018), which may converge even in cases where glm() fails.
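As a minimal sketch of the glm() approach, the following fits a log-binomial model to simulated data and exponentiates the coefficient. The data frame sim, its variables y and male, and the true PR of 1.2 are all made up for illustration and are not the NSDUH data. Supplying start values that correspond to valid probabilities is a common way to help glm() with a log link, although whether it is needed depends on the data.

```r
# Simulated data (hypothetical; not the nsduh data used in this chapter)
set.seed(2024)
n    <- 2000
male <- rbinom(n, 1, 0.5)
p    <- exp(log(0.45) + log(1.2) * male)  # true prevalence ratio = 1.2
sim  <- data.frame(y = rbinom(n, 1, p), male = male)

# Log-binomial model via glm(); the start values imply fitted probabilities < 1
fit.glm <- glm(y ~ male,
               family = binomial(link = "log"),
               data   = sim,
               start  = c(log(mean(sim$y)), 0))

# Estimated PR comparing male = 1 to male = 0
PR.glm <- exp(coef(fit.glm))["male"]

# With a single binary predictor, the estimated PR should match
# the ratio of the two sample proportions
PR.raw <- mean(sim$y[sim$male == 1]) / mean(sim$y[sim$male == 0])
c(PR.glm, PR.raw = PR.raw)
```

The comparison with the raw ratio of proportions is only a sanity check for this one-predictor case; once other predictors are added, the model-based PR is an adjusted quantity and the two will generally differ.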
Example 6.2 (continued): Logistic regression estimated an OR comparing lifetime marijuana use between males and females of 1.44. Use log-binomial regression to compute the corresponding prevalence ratio.
library(logbin)
fit.ex6.2.logbin <- logbin(mj_lifetime ~ demog_sex,
                           data = nsduh,
                           method = "em")
# Summary of model
round(summary(fit.ex6.2.logbin)$coef, 4)
##               Estimate Std. Error z value Pr(>|z|)
## (Intercept)    -0.7629     0.0463 -16.479   0.0000
## demog_sexMale   0.1794     0.0620   2.894   0.0038
# PR, and 95% CI for PR
PR.CI <- cbind("PR" = exp(coef(fit.ex6.2.logbin)),
               exp(confint(fit.ex6.2.logbin)))[-1,]
round(PR.CI, 3)
##    PR  2.5 % 97.5 %
## 1.197  1.060  1.351
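As a rough check on the output above, the PR and an approximate 95% interval can be reproduced by hand from the rounded estimate and standard error for demog_sexMale. This sketch assumes a Wald interval (estimate \(\pm\) 1.96 standard errors on the log scale, then exponentiated). The text does not state which method confint() uses for a logbin fit, but the numbers agree to the precision shown.

\[\begin{aligned} \text{PR} &= e^{0.1794} \approx 1.20 \\ 95\%\ \text{CI} &= e^{0.1794 \pm 1.96 \times 0.0620} = \left(e^{0.0579},\ e^{0.3009}\right) \approx (1.06,\ 1.35) \end{aligned}\]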
Although not needed for this example, if the predictor were categorical with more than two levels, you could obtain a Type III multiple df test as usual.
Conclusion: Males are 1.20 times as likely as females to have ever used marijuana (PR = 1.20; 95% CI = 1.06, 1.35; p = .004).
In the interpretation, we used the phrase “times as likely” rather than “times the odds” because log-binomial regression models the log of the probability, not the log-odds. We could also say that the prevalence of marijuana use is 20% greater among males. If this were incidence data, we could say that males have 20% greater risk. To compute an adjusted RR or PR, simply add the confounding variables to the model formula.
NOTES:
- If you use predict() or gmodels::estimable() to estimate a probability from a log-binomial model, use exp() rather than ilogit() when transforming the prediction to the probability scale.
- logbin() does not allow interaction terms using the : notation. If glm() with family = binomial(link = "log") converges, then that is the simplest way to include an interaction since it does allow the : notation. To include an interaction with logbin(), you must create variables corresponding to the interaction terms outside the model and then include those variables in the model (see Section 9.6.4.2 for an example, from a different context, of how to do this).
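
Both notes can be illustrated with a short sketch on simulated data. The variables y, male, age.z, and male.x.age are hypothetical and are not from the NSDUH data. The sketch uses glm() throughout so that it runs without the logbin package. It shows that an interaction variable created outside the model gives the same fit as the : notation, which is the kind of precomputed variable you would pass to logbin(), and that exponentiating the linear predictor gives the predicted probability.

```r
# Simulated data (hypothetical variables; not from the nsduh data)
set.seed(2024)
n     <- 3000
male  <- rbinom(n, 1, 0.5)
age.z <- rnorm(n)
p     <- exp(log(0.30) + log(1.2) * male + 0.10 * age.z + 0.05 * male * age.z)
sim   <- data.frame(y = rbinom(n, 1, p), male, age.z)

# Interaction variable created outside the model formula
sim$male.x.age <- sim$male * sim$age.z

# Start values that imply valid probabilities (intercept-only fit, other terms 0)
start0 <- c(log(mean(sim$y)), 0, 0, 0)

fit.manual <- glm(y ~ male + age.z + male.x.age,
                  family = binomial(link = "log"), data = sim, start = start0)
fit.colon  <- glm(y ~ male + age.z + male:age.z,
                  family = binomial(link = "log"), data = sim, start = start0)

# The two parameterizations give the same coefficients
cbind(manual = coef(fit.manual), colon = coef(fit.colon))

# Predicted probability for male = 1 at age.z = 0:
# with a log link, exponentiate the linear predictor (not an inverse logit)
nd     <- data.frame(male = 1, age.z = 0, male.x.age = 0)
p.link <- exp(predict(fit.manual, newdata = nd, type = "link"))
p.resp <- predict(fit.manual, newdata = nd, type = "response")
c(p.link, p.resp)
```

When the interaction is built by hand, the new data passed to predict() must also contain a consistent value of the interaction variable (here 1 × 0 = 0), since the model no longer computes it from the main effects.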