Linear regression for prediction
Learning outcomes/objective: Learn…
- …recap how the linear regression model works.
- …in which situations we can use it for predictions.
- …how we use it as an ML model within R.
Sources: #TidyTuesday and tidymodels
1 Regression vs. classification
- See introductory session.
2 Linear model
2.1 Linear model (Equation) (1)
- Linear Model = LM = Linear regression model
- Aim (normally)
- Model (and also understand) the relationship between an outcome variable (output) and 1+ explanatory variables (features)
- But it is a very popular machine learning model as well!
- Q: What else do we call b0, b1 and εi?
2.2 Linear model (Equation) (2)
Q: Why is the linear model called “linear” model?
- Important: Variable values (e.g., Unemployedi or Educationi) vary across rows; parameter values (e.g., b0) are constant across rows
- Important: εi varies across units
Name | Lifesatisfactioni | b0 | b1 | Unemployedi | b2 | Educationi | εi | ŷi
---|---|---|---|---|---|---|---|---
Samuel | 8 | ? | ? | 0 | ? | 7 | ? | ?
Ruth | 4 | ? | ? | 0 | ? | 3 | ? | ?
William | 5 | ? | ? | 1 | ? | 2 | ? | ?
.. | .. | .. | .. | .. | .. | .. | .. | ..
2.3 Linear model (Visualization)
- Figure 1 visualizes the distribution of our data and a linear model that we fit to the data
- Lifesatisfactioni = b0 + b1Unemployedi + b2Educationi + εi (Wikipedia)
- The plane in Figure 1 is not an exact model of the data
- An admissible model must be consistent with all the data points
- The plane cannot be the model, unless it exactly fits all the data points
- Hence, an error term, εi, must be included in the model equation, so that it is consistent with all data points
- Predictive accuracy: How well does our model predict observations (in the test dataset)?
- Calculate the average error across all errors εi (in the test dataset)
2.4 Linear model (Estimation)
- Estimation = fitting the model to the data (by finding/adapting the parameters)
- Easy in the case of the mean (analytical solution), but more difficult for linear (or other) models
- Model parameters: b0, b1 and b2
- Ordinary Least Squares (OLS)
- Least squares method (origins in astronomy)
- Choose b0, b1 and b2 (= the plane) so that the sum of the squared errors is minimized (see graph!)
- Q: Why do we square the errors?
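The OLS idea above, choosing the parameters that minimize the sum of squared errors, can be sketched numerically. A minimal Python illustration with made-up data (the course itself works in R/tidymodels; all variable names and numbers here are hypothetical):

```python
import numpy as np

# Hypothetical data mimicking the slide's example:
# life satisfaction explained by unemployment (0/1) and education (years)
unemployed = np.array([0, 0, 1, 1, 0, 1])
education = np.array([7, 3, 2, 5, 6, 1])
lifesat = np.array([8, 4, 5, 6, 7, 3])

# Design matrix with a column of 1s for the intercept b0
X = np.column_stack([np.ones_like(education, dtype=float), unemployed, education])
y = lifesat.astype(float)

# OLS: find b = (b0, b1, b2) minimizing the sum of squared errors;
# np.linalg.lstsq solves exactly this least-squares problem
b, *_ = np.linalg.lstsq(X, y, rcond=None)

residuals = y - X @ b          # estimated errors e_i
sse = np.sum(residuals ** 2)   # the quantity OLS minimizes
print(b, sse)
```

Any other choice of b0, b1, b2 (any other plane) would yield a larger sum of squared errors on this data.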
2.5 Linear model (Prediction)
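Once b0, b1 and b2 are estimated, predicting simply means plugging new feature values into the model equation. A minimal Python sketch, assuming hypothetical (not estimated) coefficient values:

```python
# Prediction with a fitted linear model: y_hat = b0 + b1*Unemployed + b2*Education
# Coefficients below are hypothetical, purely for illustration
b0, b1, b2 = 4.0, -1.5, 0.5

def predict(unemployed, education):
    """Return the model's predicted life satisfaction for one person."""
    return b0 + b1 * unemployed + b2 * education

print(predict(0, 7))  # employed person with 7 years of education
print(predict(1, 2))  # unemployed person with 2 years of education
```

The same mechanics apply to unseen test observations, which is what makes the model usable for prediction.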
2.6 Linear model: Accuracy (MSE, RMSE, R-squared)
- Mean squared error (James et al. 2013, Ch. 2.2)
- MSE = (1/n) Σi (yi − ŷi)² (James et al. 2013, Ch. 2.2.1)
- yi is the ith true outcome value; ŷi is the prediction the fitted model gives for the ith observation
- MSE is small if the predicted responses are close to the true responses, and large if they differ substantially
- Training MSE: MSE computed using the training data
- Test MSE: How accurate are the predictions that we obtain when we apply our method to previously unseen test data?
- Test MSE = the average squared prediction error for the test observations
- Usually, when building a model, we use a third dataset to assess accuracy, i.e., analysis (training) data, assessment (validation) data and test data
- Fundamental property of ML (cf. James et al. 2013, 31, Figure 2.9)
- As model flexibility increases, training MSE will decrease, but the test MSE may not (danger of overfitting)
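This property can be illustrated directly: on the same training data, a more flexible model always achieves a training MSE at least as low as a less flexible one, yet that says nothing about test performance. A small Python sketch with made-up data, using polynomial degree as a stand-in for model flexibility:

```python
import numpy as np

# Tiny fixed dataset (hypothetical): outcome y vs a single feature x
x = np.array([0.0, 1.0, 2.0, 3.0, 4.0, 5.0])
y = np.array([1.0, 2.2, 2.9, 4.3, 4.8, 6.5])

def train_mse(degree):
    """Training MSE of a polynomial model of the given flexibility."""
    coefs = np.polyfit(x, y, degree)
    y_hat = np.polyval(coefs, x)
    return float(np.mean((y - y_hat) ** 2))

# The flexible model fits the *training* data at least as well...
print(train_mse(1), train_mse(4))
# ...but on unseen test data it may do much worse (overfitting)
```

The degree-4 fit chases noise in these six points; its low training MSE is exactly the misleading signal Figure 2.9 in James et al. (2013) warns about.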
- In practice we use the Root Mean Squared Error (RMSE)
- MSE is expressed in squared units and not directly comparable to the target/outcome variable
- RMSE takes the square root of the MSE and brings the units back to the original scale of the outcome variable
- RMSE is more interpretable and comparable across models/datasets
- R-squared measures the proportion of variance in the outcome variable that is explained by the model
- Ranges from 0 to 1, where 0 means the model explains none of the variance and 1 means the model explains all of the variance
- Calculated as the ratio of the explained variance to the total variance of the outcome variable (measure of the model’s goodness of fit)
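All three metrics can be computed directly from true outcomes and predictions. A minimal Python sketch with made-up numbers (for OLS with an intercept, 1 − SSres/SStot equals the explained-to-total variance ratio described above):

```python
import math

# Hypothetical true outcomes and model predictions on a test set
y_true = [8.0, 4.0, 5.0, 6.0]
y_pred = [7.5, 4.5, 5.0, 5.0]

n = len(y_true)
mse = sum((y - yh) ** 2 for y, yh in zip(y_true, y_pred)) / n
rmse = math.sqrt(mse)  # back on the original scale of the outcome

mean_y = sum(y_true) / n
ss_tot = sum((y - mean_y) ** 2 for y in y_true)               # total variation
ss_res = sum((y - yh) ** 2 for y, yh in zip(y_true, y_pred))  # unexplained variation
r_squared = 1 - ss_res / ss_tot  # proportion of variance explained

print(mse, rmse, r_squared)
```

Here RMSE is in the same units as the outcome (e.g., life satisfaction points), which is why it is usually preferred over MSE for interpretation.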