A summary of the fitted model

The Simple Linear Regression

Suppose we have one response variable \(y\) and an explanatory variable \(x\) and two models as follows

Data: \((y_i,x_{i}),\quad i=1,\dots,n\)

Model 0: \(E(y_i) = \alpha\)

Model 1: \(E(y_i) = \alpha+\beta x_{i}\)

In the case of simple linear regression with only one explanatory variable, this compares a line that slopes through the data (Model 1) with a line that runs through the data but lies parallel to the horizontal axis (Model 0).

In order to fit Model 0 to the data, that is, to estimate its single parameter \(\alpha\) by least squares, we minimise

\[S(\alpha) = \sum_{i=1}^n(y_i-\alpha)^2,\] which gives \[\hat{\alpha} = \bar{y},\]

as illustrated in the top left hand side plot below. Therefore, the residual sum-of-squares for Model 0 is: \[\begin{aligned} S(\hat{\alpha}) &= \sum_{i=1}^n(y_i-\bar{y})^2 \\ &= S_{yy} \\ \end{aligned}\] corresponding to the top right hand side plot below.
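As a concrete illustration, the following minimal Python/numpy sketch (the data are invented purely for the example and do not come from the text) fits Model 0 by least squares and evaluates \(S(\hat{\alpha}) = S_{yy}\).

```python
import numpy as np

# Hypothetical data, invented for illustration only
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y = np.array([2.1, 3.9, 6.2, 8.1, 9.8])

# Model 0: E(y_i) = alpha; least squares gives alpha_hat = mean(y)
alpha_hat_0 = y.mean()

# Residual sum of squares for Model 0, i.e. S_yy
S_yy = np.sum((y - alpha_hat_0) ** 2)
print(alpha_hat_0, S_yy)
```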

Denote the residual sum-of-squares for Model 1 as

\[\begin{aligned} S(\hat{\alpha}, \hat{\beta}) &= \sum_{i=1}^n(y_i-\{\hat{\alpha}+\hat{\beta} x_{i}\})^2 \\ &= \sum_{i=1}^n(y_i-\hat{y}_i)^2 \end{aligned}\]

Recall that we are calculating the distances between the observed values \(y_1,\ldots,y_n\) and the fitted values \(\hat{y}_1,\ldots,\hat{y}_n\), corresponding to the bottom left hand side plot above.
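The same calculation for Model 1 is sketched below in Python/numpy, again on invented data, using the usual least squares estimates \(\hat{\beta} = S_{xy}/S_{xx}\) and \(\hat{\alpha} = \bar{y} - \hat{\beta}\bar{x}\) to form the fitted values and the residual sum-of-squares.

```python
import numpy as np

# Hypothetical data, invented for illustration only
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y = np.array([2.1, 3.9, 6.2, 8.1, 9.8])

# Least squares estimates for Model 1: E(y_i) = alpha + beta * x_i
S_xx = np.sum((x - x.mean()) ** 2)
S_xy = np.sum((x - x.mean()) * (y - y.mean()))
beta_hat = S_xy / S_xx
alpha_hat = y.mean() - beta_hat * x.mean()

# Fitted values and the residual sum-of-squares S(alpha_hat, beta_hat)
y_hat = alpha_hat + beta_hat * x
RSS = np.sum((y - y_hat) ** 2)
print(RSS)
```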

For completeness, we can also look at the differences between the fitted values obtained from Model 0 and those obtained from Model 1,

\[\sum_{i=1}^{n}(\bar{y} - \hat{y}_i)^2\]

corresponding to the bottom right hand side plot above.

Sums of Squares

The residual sum of squares of Model 0 is referred to as the Total corrected sum of squares, \(TSS\); the residual sum of squares of Model 1 is denoted \(RSS\); and the sum of squares between the fitted values obtained from Model 0 and those obtained from Model 1 is referred to as the Model sum of squares, \(MSS\). The three values \(RSS\), \(MSS\) and \(TSS\) are related such that \[TSS=MSS+RSS.\]
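This identity can be checked numerically; the following Python/numpy sketch uses the same invented data as above.

```python
import numpy as np

# Hypothetical data, invented for illustration only
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y = np.array([2.1, 3.9, 6.2, 8.1, 9.8])

# Least squares fit of Model 1
beta_hat = np.sum((x - x.mean()) * (y - y.mean())) / np.sum((x - x.mean()) ** 2)
alpha_hat = y.mean() - beta_hat * x.mean()
y_hat = alpha_hat + beta_hat * x

TSS = np.sum((y - y.mean()) ** 2)   # residual SS of Model 0
RSS = np.sum((y - y_hat) ** 2)      # residual SS of Model 1
MSS = np.sum((y.mean() - y_hat) ** 2)

# TSS = MSS + RSS holds (up to floating-point error)
print(np.isclose(TSS, MSS + RSS))
```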

Coefficient of Determination \(R^2\)

In our discussion of least squares, the residual sum-of-squares for a particular model was proposed as a numerical measure of how well the model fits the data. This leads to a natural measure of how much variation in the data our model has explained, by comparing \(RSS\) with \(TSS\). A simple but useful measure of model fit is given by

\[R^2 = 1-\frac{RSS}{TSS}\]

where \(RSS\) is the residual sum-of-squares for Model 1, the fitted model of interest, and \(TSS = \sum_{i=1}^n(y_i-\bar{y})^2 = S_{yy}\) is the residual sum of squares of the null model, Model 0. Since Model 0 is more restricted, its residual sum-of-squares is always at least as large as that of Model 1; that is, \(TSS \geq RSS\), and so \(0 \leq R^2 \leq 1\).

\(R^2\) quantifies how much of a drop in the residual sum-of-squares is accounted for by fitting the proposed model, and is often referred to as the coefficient of determination. This is expressed on a helpful scale, as a proportion of the total variation in the data.
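A minimal sketch of the calculation in Python/numpy, using the same invented data as in the earlier snippets:

```python
import numpy as np

# Hypothetical data, invented for illustration only
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y = np.array([2.1, 3.9, 6.2, 8.1, 9.8])

# Least squares fit of Model 1 and its fitted values
beta_hat = np.sum((x - x.mean()) * (y - y.mean())) / np.sum((x - x.mean()) ** 2)
alpha_hat = y.mean() - beta_hat * x.mean()
y_hat = alpha_hat + beta_hat * x

RSS = np.sum((y - y_hat) ** 2)
TSS = np.sum((y - y.mean()) ** 2)
R2 = 1 - RSS / TSS   # coefficient of determination
print(R2)
```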

  • Values of \(R^2\) approaching 1 indicate that the model is a good fit to the data.

  • Smaller values of \(R^2\), say less than 0.5, suggest that the model gives only a moderate fit to the data.

  • When working with real data, we often observe \(R^2\) values well below 0.5.

In the case of simple linear regression

Model 1: \(E(y_i) = \alpha+\beta x_i\)

\[R^2=r^2\]

where \(R^2\) is the coefficient of determination and \(r\) is the sample correlation coefficient. To show this, recall from least squares that

\[RSS = \sum_{i=1}^n (y_i-\{\hat{\alpha}+\hat{\beta} x_i\})^2 = S_{yy}-\frac{(S_{xy})^2}{S_{xx}},\]

so that

\[\begin{aligned} R^2 &= 1-\frac{RSS}{TSS}\\ &=1-\frac{\sum_{i=1}^n(y_i-\hat{y}_i)^2}{\sum_{i=1}^n(y_i-\bar{y})^2}\\ &=\frac{S_{yy}-\left(S_{yy}-\frac{(S_{xy})^2}{S_{xx}}\right)}{S_{yy}}\\ &=\frac{(S_{xy})^2}{S_{xx}S_{yy}} \\ &= r^2. \end{aligned}\]

Hence \(R^2 = r^2\); that is, in simple linear regression the coefficient of determination is the squared sample correlation coefficient. This result does not extend to multiple linear regression.
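The equality can also be checked numerically; the sketch below (Python/numpy, invented data) compares the closed form \((S_{xy})^2/(S_{xx}S_{yy})\) with the squared sample correlation obtained from np.corrcoef.

```python
import numpy as np

# Hypothetical data, invented for illustration only
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y = np.array([2.1, 3.9, 6.2, 8.1, 9.8])

S_xx = np.sum((x - x.mean()) ** 2)
S_yy = np.sum((y - y.mean()) ** 2)
S_xy = np.sum((x - x.mean()) * (y - y.mean()))

R2 = S_xy ** 2 / (S_xx * S_yy)   # R^2 via the closed form above
r = np.corrcoef(x, y)[0, 1]      # sample correlation coefficient
print(np.isclose(R2, r ** 2))
```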

Nested Models

In the case of the simple linear regression

Model 0: \(E(y_i) = \alpha\)

Model 1: \(E(y_i) = \alpha+\beta x_{i}\)

these models are nested. By setting \(\beta=0\) in Model 1 we retrieve Model 0. In other words, the simpler Model 0 is a special case of the more complex Model 1.

In the case of simple linear regression through the origin

Model 0: \(E(y_i) = \alpha\)

Model 1: \(E(y_i) = \beta x_{i}\)

the formula for \(R^2\) with \(TSS = \sum_i (y_i-\bar{y})^2\) cannot be used, because Model 0 and Model 1 are not nested: setting \(\beta = 0\) in Model 1 gives \(E(y_i) = 0\), which does not recover Model 0.
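To illustrate the point, the following Python/numpy sketch (with data invented so that a line through the origin fits badly) shows that naively applying \(1 - RSS/TSS\) to the through-origin fit can produce a value far outside \([0,1]\), so it no longer works as a proportion of variation explained.

```python
import numpy as np

# Hypothetical data chosen so that a line through the origin fits poorly
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y = np.array([10.2, 9.8, 10.1, 9.9, 10.0])

# Least squares for regression through the origin: E(y_i) = beta * x_i
beta_hat = np.sum(x * y) / np.sum(x ** 2)
y_hat = beta_hat * x

RSS = np.sum((y - y_hat) ** 2)
TSS = np.sum((y - y.mean()) ** 2)

# Naively applying 1 - RSS/TSS gives a large negative value here,
# because the through-origin model does not contain Model 0 as a special case
print(1 - RSS / TSS)
```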

The Multiple Linear Regression

Suppose now we have one response variable \(y\) and \(k\) explanatory variables \(x_1, \ldots, x_k\) and two models as follows

Data: \((y_i,x_{1i}, \ldots, x_{ki}),\quad i=1,\dots,n\)

Model 0: \(E(y_i) = \alpha\)

Model 1: \(E(y_i) = \alpha+\beta_1 x_{1i} + \ldots + \beta_{k} x_{ki}\)

Then we can calculate the coefficient of determination \(R^2\) in the same way. However, in the case of multiple linear regression, where there is more than one explanatory variable in the model, we often refer to a quantity called the adjusted \(R^2\), written \(R^2\)(adj), instead of \(R^2\). As the number of explanatory variables increases, \(R^2\) never decreases, but \(R^2\)(adj) adjusts for the fact that there is more than one explanatory variable in the model.
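As a sketch of the calculation, the Python/numpy example below (with simulated data and \(k = 2\) hypothetical explanatory variables) fits the multiple regression by least squares via np.linalg.lstsq and forms \(R^2 = 1 - RSS/TSS\) exactly as before.

```python
import numpy as np

# Simulated data with k = 2 hypothetical explanatory variables
rng = np.random.default_rng(1)
n, k = 20, 2
X = rng.normal(size=(n, k))
y = 1.0 + X @ np.array([2.0, -1.0]) + rng.normal(scale=0.5, size=n)

# Design matrix with a column of ones for the intercept alpha
X1 = np.column_stack([np.ones(n), X])
coef, *_ = np.linalg.lstsq(X1, y, rcond=None)
y_hat = X1 @ coef

RSS = np.sum((y - y_hat) ** 2)
TSS = np.sum((y - y.mean()) ** 2)
R2 = 1 - RSS / TSS
print(R2)
```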

\(R^2\)(adj) as a measure of model fit

For any multiple linear regression \(E(y_i) = \alpha+\beta_1x_{1i}+\dots+\beta_{k}x_{ki}\), the \(R^2\)(adj) is defined as \[R^2 \mbox{(adj)} = 1-\frac{\frac{RSS}{n-k-1}}{\frac{TSS}{n-1}},\] where \(k\) is the number of explanatory variables, i.e. the number of coefficients in the model excluding the constant term \(\alpha\).

\(R^2\)(adj) can also be calculated from the following identity

\[R^{2} \mbox{(adj)} ={1-(1-R^{2}){n-1 \over n-k-1}}.\]
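The two expressions agree, as the following Python/numpy sketch (continuing the simulated example above) checks numerically.

```python
import numpy as np

# Same simulated data as in the earlier multiple regression sketch
rng = np.random.default_rng(1)
n, k = 20, 2
X = rng.normal(size=(n, k))
y = 1.0 + X @ np.array([2.0, -1.0]) + rng.normal(scale=0.5, size=n)

X1 = np.column_stack([np.ones(n), X])
coef, *_ = np.linalg.lstsq(X1, y, rcond=None)
RSS = np.sum((y - X1 @ coef) ** 2)
TSS = np.sum((y - y.mean()) ** 2)

R2 = 1 - RSS / TSS
R2_adj_direct = 1 - (RSS / (n - k - 1)) / (TSS / (n - 1))   # definition
R2_adj_identity = 1 - (1 - R2) * (n - 1) / (n - k - 1)      # identity above
print(np.isclose(R2_adj_direct, R2_adj_identity))
```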