## 4.2 Least squares estimates

An appropriate estimator which gives the “best” parameter estimates should be considered

Commonly used estimator is least squares estimator which originates from the OLS method (

*Ordinary Least Squares*)In bivariate regression the OLS method is concerned with finding the regression line that gives the best fit to data points such that the sum of squared residuals is small as possible, hence the term “least squares”

\[\begin{equation} \text{min}\sum_{i=1}^n \hat{u}_i^2=\sum_{i=1}^n \bigg( y_i - (\hat{\beta}_0+\hat{\beta}_1 x_i)\bigg)^2 \tag{4.8} \end{equation}\]

- Differential calculus is used to obtain \(\hat{\beta}_0\) and \(\hat{\beta}_1\) that minimize the sum of squared residuals (SSR)

\[\begin{align} \hat{\beta}_1&=\frac{\displaystyle\sum_{i=1}^n (x_i-\bar{x})(y_i-\bar{y})}{\displaystyle\sum_{i=1}^n (x_i-\bar{x})^2}=\frac{Cov(x,y)}{S_x^2} \\ \\ \hat{\beta}_0&=\bar{y}-\hat{\beta}_1 \bar{x} \\ \tag{4.9} \end{align}\]

**Example 4.1 **A real estate agent wishes to examine the relationship between the selling price of a house and its size (measured in square meters). A random sample of ten houses is selected (\(n=10\)). Insert data from table 4.3 to Excel and plot `Scattergram`

. Add a regression line to the same plot. Afterwords, compute the same OLS estimates using functions `=INTERCEPT()`

and `=SLOPE()`

. Give interpretation of both regression coefficients.

size in \(m^2\) | price in \(1000\)s USD |
---|---|

130 | 245 |

148 | 213 |

157 | 279 |

174 | 308 |

102 | 199 |

143 | 219 |

218 | 405 |

227 | 324 |

132 | 319 |

157 | 255 |

**Excel instructions**: insert the data in Excel manually and select both columns including the variables names. On the `Insert`

tab click `XY (Scatter)`

and select the firs chart subtype that doesn’t include any lines. In the next step right-click any single point inside the chart and select `Add Trendline`

from drop down options. From the `Trendline`

menu check both boxes the `Display Equation on Chart`

and the `Display R-Squared Value on Chart`

(linear equation is default). Afterwords, use functions `=INTERCEPT(known_ys;known_xs)`

and `=SLOPE(known_ys;known_xs)`

to compute OLS estimates for constant and slope. The first argument of both functions is the range of dependent variable, while the second argument is always the range of independent variable.

\[\widehat{price}_i=98.115+1.1863\cdot size_i\]

The slope \(\hat{\beta}_1=1.1863\) tell us that the average value of a house increases by \(1.1863(1000)=1186.3\) USD, for each additional one square meter of size

The constant term \(\hat{\beta}_0=98.115(1000)=98115\) USD just indicates that \(98115\) is the portion of the house price not explained by it’s size

The \(R^2\) is commonly used as a goodness of fit measure