# Chapter 1 Review of Generalised Linear Models

Definition. A GLM is specified through the following components:

• A linear predictor: $$\eta = \boldsymbol{\beta}^{T}\boldsymbol{x}$$.

• An injective response function $$h$$, such that $$\mu = {\mathrm E}[Y |\boldsymbol{x}, \boldsymbol{\beta}] = h(\eta) = h(\boldsymbol{\beta}^{T}\boldsymbol{x})$$.
Equivalently, one can write $$g(\mu) = \boldsymbol{\beta}^{T}\boldsymbol{x}$$, where $$g = h^{-1}$$ is the link function.

• The distributional assumption: $$P\left(Y |\boldsymbol{x}, \boldsymbol{\beta}\right)$$ belongs to an exponential dispersion family (EDF), that is:
$$$P\left(y |\boldsymbol{x}, \boldsymbol{\beta}\right) = P\left(y |\theta(\boldsymbol{x}, \boldsymbol{\beta}), \phi(\boldsymbol{x}, \boldsymbol{\beta})\right) = \exp \Big( \frac{y\theta - b(\theta)}{\phi} + c(y, \phi) \Big)$$$
where $$\theta$$ is the natural parameter and $$\phi$$ is the dispersion parameter. The mean and variance of this distribution are therefore:
\begin{align} {\mathrm E}[Y |\theta, \phi] &= \mu = b'(\theta) \\ {\mathrm{Var}}[Y |\theta, \phi] &= \phi \, b''(\theta) = \phi \, b''((b')^{-1}(\mu)) = \phi \, \mathcal{V}(\mu) \end{align}
where $$\mathcal{V}(\mu) = b''((b')^{-1}(\mu))$$ is the variance function.

• Independence: the responses are assumed to be conditionally independent given the covariates, that is:
$$$P\left(\left\{y_{i}\right\} |\left\{\boldsymbol{x}_{i}\right\}, \boldsymbol{\beta}\right) = \prod_{i=1}^n P\left(y_{i} |\boldsymbol{x}_{i}, \boldsymbol{\beta}\right)$$$
where $$\left\{y_{i}, i = 1,\ldots,n\right\}$$ are the responses observed at the covariate vectors $$\left\{\boldsymbol{x}_i, i = 1,\ldots,n\right\}$$.
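
As a numerical sanity check of the EDF mean and variance identities above, here is a minimal sketch using the Poisson distribution. It assumes the standard Poisson parametrisation $$\theta = \log\mu$$, $$b(\theta) = e^{\theta}$$, $$\phi = 1$$ and $$c(y, \phi) = -\log y!$$; the function names and the value of `mu` are illustrative choices, not part of the notes.

```python
import math

def edf_log_density(y, theta, phi, b, c):
    """Log-density of an EDF: (y*theta - b(theta)) / phi + c(y, phi)."""
    return (y * theta - b(theta)) / phi + c(y, phi)

# Poisson as an EDF (assumed parametrisation, see the lead-in):
# b(theta) = exp(theta), c(y, phi) = -log(y!), phi = 1.
def b_poisson(theta):
    return math.exp(theta)

def c_poisson(y, phi):
    return -math.lgamma(y + 1)  # lgamma(y + 1) = log(y!)

mu = 3.5                # illustrative mean
theta = math.log(mu)    # natural parameter
phi = 1.0               # dispersion parameter

# Moments computed by summing over a truncated support; the tail
# beyond y = 199 is negligible for this value of mu.
ys = range(200)
probs = [math.exp(edf_log_density(y, theta, phi, b_poisson, c_poisson)) for y in ys]
total = sum(probs)
mean = sum(y * p for y, p in zip(ys, probs))
var = sum((y - mean) ** 2 * p for y, p in zip(ys, probs))

# For the Poisson, b'(theta) = b''(theta) = exp(theta).
b_prime = math.exp(theta)
b_double_prime = math.exp(theta)

print(total, mean, b_prime, var, phi * b_double_prime)
```

The summed mean should agree with $$b'(\theta)$$ and the summed variance with $$\phi \, b''(\theta)$$ up to floating-point error, which for the Poisson gives $$\mathcal{V}(\mu) = \mu$$.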

The Natural/Canonical Link. Recall that we have both: \begin{alignat}{4} \mu & = {\mathrm E}[Y |\theta, \phi] && = b'(\theta) \tag{1.1} \\ \mu & = {\mathrm E}[Y |\boldsymbol{x}, \boldsymbol{\beta}] && = h(\boldsymbol{\beta}^T\boldsymbol{x}) = h(\eta) \tag{1.2} \end{alignat} with Equation (1.1) holding because $$P\left(y |\theta, \phi\right)$$ belongs to an EDF, and Equation (1.2) holding by definition of a GLM.

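The natural link is the choice $$h = b'$$, or equivalently $$g = (b')^{-1}$$. Equating (1.1) and (1.2) then gives $$b'(\theta) = b'(\eta)$$, and since $$b'$$ is injective, the natural parameter coincides with the linear predictor: $$$\theta = \boldsymbol{\beta}^T\boldsymbol{x} = \eta.$$$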
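
To make the natural link concrete, the sketch below checks $$\theta = \eta$$ numerically for two standard cases: the Poisson ($$b(\theta) = e^{\theta}$$, so $$g = \log$$) and the Bernoulli ($$b(\theta) = \log(1 + e^{\theta})$$, so $$g$$ is the logit). The coefficient and covariate values are made up for illustration, and the function names are hypothetical.

```python
import math

# Response functions h = b' and natural links g = (b')^{-1}.
def h_poisson(eta):
    return math.exp(eta)                 # b'(theta) for b(theta) = exp(theta)

def g_poisson(mu):
    return math.log(mu)                  # log link

def h_bernoulli(eta):
    return 1.0 / (1.0 + math.exp(-eta))  # b'(theta) for b(theta) = log(1 + exp(theta))

def g_bernoulli(mu):
    return math.log(mu / (1.0 - mu))     # logit link

# Illustrative coefficients and covariates (not from the notes).
beta = [0.5, -1.2]
x = [1.0, 2.0]
eta = sum(b_j * x_j for b_j, x_j in zip(beta, x))  # linear predictor beta^T x

# Under the natural link, mu = h(eta) = b'(eta), so theta = (b')^{-1}(mu) = eta.
recovered = {}
for name, h, g in [("poisson", h_poisson, g_poisson),
                   ("bernoulli", h_bernoulli, g_bernoulli)]:
    mu = h(eta)
    recovered[name] = g(mu)
    print(name, "mu =", mu, "theta =", recovered[name], "eta =", eta)
```

In both cases the recovered natural parameter matches the linear predictor up to floating-point error, which is what the choice $$h = b'$$ is designed to achieve.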