# An Introduction to Statistical Learning with the tidyverse and tidymodels

*2022-04-09*

# Who, what, and why?

I am a data scientist and statistician who is (mostly) self-taught, learning from textbooks and from generous people sharing their work online. Inspired by projects like Solomon Kurz’s recoding of *Statistical Rethinking*, I decided to publicly document my notes and code as I work through *An Introduction to Statistical Learning*, 2nd edition, by Gareth James, Daniela Witten, Trevor Hastie, and Robert Tibshirani.

I prefer to work with the `tidyverse` collection of R packages, and so will be using those to wrangle and visualize the data. Along the way, I’ll be teaching myself the `tidymodels` framework for machine learning. In general, my plan for each chapter/concept is to start with the original modeling package, then move towards the `tidymodels` approach in the labs and exercises. For example, I’ll first perform logistic regression with `glm()`, then use `parsnip::logistic_reg()` by the end of the chapter. I think this will help me better appreciate the unified interface provided with `tidymodels`, and maybe help me better understand what is going on under the hood.
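To make the `glm()` to `parsnip::logistic_reg()` progression concrete, here is a minimal sketch. It assumes the built-in `mtcars` data with `am` as the outcome and `wt` as the predictor; that choice is purely for illustration and is not taken from the book's labs. The object names `cars`, `fit_glm`, and `fit_parsnip` are likewise my own placeholders.

```r
library(parsnip)

# Illustration only: mtcars ships with R; `am` (0 = automatic, 1 = manual)
# stands in for a binary outcome. parsnip requires a factor outcome
# for classification models.
cars <- mtcars
cars$am <- factor(cars$am)

# Original modeling function: base R's glm() with a binomial family
fit_glm <- glm(am ~ wt, data = cars, family = binomial)

# tidymodels approach: specify the model, choose the engine, then fit
fit_parsnip <- logistic_reg() |>
  set_engine("glm") |>
  fit(am ~ wt, data = cars)

# The parsnip object stores the underlying glm fit in its `fit` element,
# so the two sets of coefficients can be compared directly
coef(fit_glm)
coef(fit_parsnip$fit)
```

Because the `"glm"` engine calls `stats::glm()` underneath, both fits should give the same coefficients; the difference lies in the interface, where swapping to another engine only means changing the `set_engine()` call.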

I won’t be doing every exercise or section. My main goal for this project is to improve my statistical programming, so I will focus on the applied exercises rather than the conceptual ones.

As of 2022-04-09, I’ve completed Chapters 1 through 6.