6.2 Interpretation of the logistic regression coefficients
How do we interpret the logistic regression coefficients? To answer this question, we need to dive into some mathematical details, although, in the end, we will use R to do all the computations for us.
If p is the probability of an event, then p/(1−p) is the “odds” of the event, the ratio of how likely the event is to occur to how likely it is not to occur. The left-hand side of the logistic regression equation, ln(p/(1−p)), is the natural logarithm of the odds, also known as the “log-odds” or “logit”. To convert log-odds to odds, apply the inverse of the natural logarithm, which is the exponential function e^x. To convert log-odds to a probability, apply the inverse logit function e^x/(1+e^x).
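These conversions are easy to verify numerically. The sketch below (in Python, using an arbitrary example probability p = 0.25) checks that exponentiating the log-odds recovers the odds, and that the inverse logit recovers the probability.

```python
import math

def logit(p):
    """Log-odds (logit) of a probability p."""
    return math.log(p / (1 - p))

def inv_logit(x):
    """Inverse logit: convert log-odds x back to a probability."""
    return math.exp(x) / (1 + math.exp(x))

p = 0.25                    # arbitrary example probability
odds = p / (1 - p)          # 0.25 / 0.75 = 1/3
log_odds = logit(p)         # ln(1/3), about -1.099

print(math.exp(log_odds))   # back to the odds (1/3)
print(inv_logit(log_odds))  # back to the probability (0.25)
```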
Intercept
Plugging X=0 into Equation (6.1), we find that the intercept β0 is ln(p/(1−p)), the log-odds when all predictors are 0 or at their reference level. Applying the exponential function shows that e^β0 is the corresponding odds of the outcome, and applying the inverse logit function shows that e^β0/(1+e^β0) is the corresponding probability of the outcome.
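For a concrete illustration, suppose a fitted model had an intercept of β0 = −1.5 (a made-up value, not from any model in this book). The sketch below converts it to an odds and a probability.

```python
import math

b0 = -1.5                                  # hypothetical intercept (log-odds)
odds = math.exp(b0)                        # odds when all predictors are 0 / reference
prob = math.exp(b0) / (1 + math.exp(b0))   # inverse logit: probability

print(round(odds, 3))   # about 0.223
print(round(prob, 3))   # about 0.182
```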
Predictor coefficients
In linear regression, βk was the difference in the outcome associated with a 1-unit difference in Xk (or between a level and the reference level). Similarly, in logistic regression, βk is the difference in the log-odds of the outcome associated with a 1-unit difference in Xk. It turns out that e^βk is the odds ratio (OR) comparing individuals who differ by 1 unit in Xk. To see why, start by exponentiating both sides of the logistic regression equation to get the odds as a function of the predictors.
$$\frac{p}{1-p} = e^{\beta_0 + \beta_1 X_1 + \beta_2 X_2 + \ldots + \beta_K X_K}$$
For a continuous predictor X1, the ratio of the odds at X1=x1+1 to the odds at X1=x1 (a one-unit difference) can be expressed as the following ratio, for which all the terms cancel except e^β1.
$$\frac{e^{\beta_0 + \beta_1 (x_1+1) + \beta_2 X_2 + \ldots + \beta_K X_K}}{e^{\beta_0 + \beta_1 x_1 + \beta_2 X_2 + \ldots + \beta_K X_K}} = e^{\beta_1}$$
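The cancellation can be checked numerically. With made-up coefficients and arbitrary covariate values, the ratio of the odds at x1+1 to the odds at x1 works out to e^β1, no matter what x1 or the other predictors are.

```python
import math

# Hypothetical coefficients: intercept, beta1 (for continuous X1), beta2
b0, b1, b2 = -2.0, 0.4, -0.7

def odds(x1, x2):
    """Odds implied by the model at predictor values (x1, x2)."""
    return math.exp(b0 + b1 * x1 + b2 * x2)

x1, x2 = 2.7, 1.0   # arbitrary starting values
or_1unit = odds(x1 + 1, x2) / odds(x1, x2)

print(or_1unit)           # equals e^b1 = exp(0.4), about 1.49
print(math.exp(b1))
```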
If the first predictor is instead categorical and we want the OR comparing the first non-reference level to the reference level, then we want the ratio of the odds at X1=1 to the odds at X1=0. This is a 1-unit difference, so the derivation above also applies to categorical predictors.
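The same numerical check works when X1 is a 0/1 indicator for a non-reference level (again with made-up coefficients).

```python
import math

# Hypothetical coefficients; b1 belongs to the indicator X1 (1 = level, 0 = reference)
b0, b1, b2 = -2.0, 0.9, -0.7

def odds(x1, x2):
    """Odds implied by the model at predictor values (x1, x2)."""
    return math.exp(b0 + b1 * x1 + b2 * x2)

x2 = 0.5                              # arbitrary value of the other predictor
or_cat = odds(1, x2) / odds(0, x2)    # non-reference level vs. reference level

print(or_cat)   # equals e^b1 = exp(0.9), about 2.46
```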
Summary of interpretation of regression coefficients
- The intercept is the log-odds of the outcome when all predictors are at 0 or their reference level. Use the exponential function (e^β0) to convert the intercept to odds and the inverse logit function (e^β0/(1+e^β0)) to convert the intercept to a probability.
- For a continuous predictor, the regression coefficient is the log of the odds ratio comparing individuals who differ in that predictor by one unit, holding the other predictors fixed.
- For a categorical predictor, each regression coefficient is the log of the odds ratio comparing individuals at a given level of the predictor to those at the reference level, holding the other predictors fixed.
- To compute an OR for Xk, exponentiate the corresponding regression coefficient, e^βk, thus converting the log of the odds ratio to an OR.
- When there are multiple predictors, ORs are called adjusted ORs (AORs).
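The summary can be put together in a few lines of code. In R (used throughout this book) this amounts to exponentiating the fitted coefficients; the Python sketch below does the same arithmetic on a set of hypothetical coefficients (the names `age` and `smoker_yes` are made up for illustration).

```python
import math

# Hypothetical fitted coefficients from a logistic regression (log-odds scale)
coefs = {"(Intercept)": -1.2, "age": 0.05, "smoker_yes": 0.8}

# Intercept: convert to the probability at 0 / reference levels via inverse logit
b0 = coefs["(Intercept)"]
p0 = math.exp(b0) / (1 + math.exp(b0))
print(round(p0, 3))   # about 0.231

# Predictors: exponentiate each coefficient to get an adjusted OR (AOR)
aors = {name: math.exp(b) for name, b in coefs.items() if name != "(Intercept)"}
print({name: round(aor, 3) for name, aor in aors.items()})
```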