Chapter 5 Generalized linear models
As we saw in Chapter 2, linear regression assumes that the response variable Y is such that
Y|(X1=x1,…,Xp=xp)∼N(β0+β1x1+⋯+βpxp,σ2)
and hence
E[Y|X1=x1,…,Xp=xp]=β0+β1x1+⋯+βpxp.
This, in particular, implies that Y is continuous. In this chapter we will see how generalized linear models can deal with other kinds of distributions for Y|(X1=x1,…,Xp=xp), particularly with discrete responses, by modeling the transformed conditional expectation. The simplest generalized linear model is logistic regression, which arises when Y is a binary response, that is, a variable encoding two categories with 0 and 1. This model would be useful, for example, to predict Y given X from the sample {(Xi,Yi)}ni=1 in Figure 5.1.

Figure 5.1: Scatterplot of a sample {(Xi,Yi)}ni=1 sampled from a logistic regression.