## 4.4 Assumptions of the model

Some probabilistic assumptions are required for performing inference on the model parameters $$\boldsymbol{\beta}$$ from the sample $$(\mathbf{X}_1, Y_1),\ldots,(\mathbf{X}_n, Y_n)$$. These assumptions are somehow simpler than the ones for linear regression.

The assumptions of the logistic model are the following:

1. Linearity in the logit31: $$\mathrm{logit}(p(\mathbf{x}))=\log\frac{ p(\mathbf{x})}{1-p(\mathbf{x})}=\beta_0+\beta_1x_1+\ldots+\beta_kx_k$$.
2. Binariness: $$Y_1,\ldots,Y_n$$ are binary variables.
3. Independence: $$Y_1,\ldots,Y_n$$ are independent.
A good one-line summary of the logistic model is the following (independence is assumed) \begin{align} Y|(X_1=x_1,\ldots,X_k=x_k)&\sim\mathrm{Ber}\left(\mathrm{logistic}(\beta_0+\beta_1x_1+\ldots+\beta_kx_k)\right)\nonumber\\ &=\mathrm{Ber}\left(\frac{1}{1+e^{-(\beta_0+\beta_1x_1+\ldots+\beta_kx_k)}}\right).\tag{4.9} \end{align}

There are three important points of the linear model assumptions missing in the ones for the logistic model:

• Why is homoscedasticity not required? As seen in the previous section, Bernoulli variables are determined only by the probability of success, in this case $$p(\mathbf{x})$$. That determines also the variance, which is variable, so there is heteroskedasticity. In the linear model, we have to control $$\sigma^2$$ explicitly due to the higher flexibility of the normal.
• Where are the errors? The errors played a fundamental role in the linear model assumptions, but are not employed in logistic regression. The errors are not fundamental for building the linear model but just a helpful concept related to least squares. The linear model can be constructed without errors as (3.5), which has a logistic analogous in (4.9).
• Why is normality not present? A normal distribution is not adequate to replace the Bernoulli distribution in (4.9) since the response $$Y$$ has to be binary and the Normal or other continuous distribution would put yield illegal values for $$Y$$.

Recall that:

• Nothing is said about the distribution of X1, …, Xk. They could be deterministic or random. They could be discrete or continuous.
• X1, …, Xk are not required to be independent between them.

1. An equivalent way of stating this assumption is $$p(\mathbf{x})=\mathrm{logistic}(\beta_0+\beta_1x_1+\ldots+\beta_kx_k)$$.