## 7.2 Lasso

Lasso stands for “least absolute shrinkage and selection operator”. Like ridge, lasso adds a penalty for coefficients, but instead of penalizing the sum of squared coefficients (L2 penalty), lasso penalizes the sum of absolute values (L1 penalty). As a result, for high values of $$\lambda$$, coefficients can be zeroed under lasso.

The loss function for lasso is

$L = \sum_{i = 1}^n \left(y_i - x_i^{'} \hat\beta \right)^2 + \lambda \sum_{j=1}^k \left| \hat{\beta}_j \right|.$

#### Example

Continuing with prediction of mpg from the other variables in the mtcars data set, follow the same steps as before, but with ridge regression. This time specify parameter alpha = 1 for ridge regression (it was 0 for ridge, and for elastic net it will be something in between and require optimization).

set.seed(1234)
mdl_lasso <- train(
mpg ~ .,
data = training,
method = "glmnet",
metric = "RMSE",
preProcess = c("center", "scale"),
tuneGrid = expand.grid(
.alpha = 1,  # optimize a lasso regression
.lambda = seq(0, 5, length.out = 101)
),
trControl = train_control
)
## Warning in nominalTrainWorkflow(x = x, y = y, wts = weights, info = trainInfo, :
## There were missing values in resampled performance measures.
mdl_lasso\$bestTune
##    alpha lambda
## 14     1   0.65

The summary output shows the model did not tune alpha because I held it at 1 for lasso regression. The optimal tuning values (at the minimum RMSE) were alpha = 1 and lambda = 0.65. You can see the RMSE minimum on the the plot.

ggplot(mdl_ridge) +
labs(title = "Lasso Regression Parameter Tuning", x = "lambda")