Chapter 2 Introduction
This text avoids math and formulas where possible, but they are needed in some places. Where they appear, in keeping with the goal of being a bridge to more advanced texts, they are written to be consistent with what you would see in those texts.
Treating regression like a black box (throwing all the variables in and seeing what happens) can lead to many mistakes. How is each variable coded? Are there interactions? Understanding the meaning of the terms in the model is essential to properly fitting and interpreting a regression model.
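As a rough illustration of checking how a variable is coded and which terms an interaction adds, here is a minimal sketch. It uses R's built-in mtcars data, not the book's datasets, and the names cars, am_f, and fit are made up for this example.

```r
# Built-in example data; 'am' (transmission) is stored as numeric 0/1
data(mtcars)
cars <- mtcars

# Recode 'am' as a factor with explicit labels so its meaning is visible
cars$am_f <- factor(cars$am, levels = c(0, 1),
                    labels = c("automatic", "manual"))

# Show the coding R will use for the factor (treatment coding by default;
# the first level, "automatic", is the reference level)
contrasts(cars$am_f)

# Fit a model with an interaction between weight and transmission type
fit <- lm(mpg ~ wt * am_f, data = cars)

# The design matrix column names list every term actually in the model
colnames(model.matrix(fit))

# Each coefficient corresponds to one of those columns
coef(fit)
```

Looking at the design matrix columns before reading the coefficient table is one way to confirm what each term means, for example that the wt:am_fmanual coefficient is the difference in the weight slope between manual and automatic cars.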
2.1 R and RStudio
- Refer to the appendix
- There are many resources for learning R programming
- Using ? to get help
- Sometimes I use base R, sometimes the tidyverse, and sometimes both
- Loading libraries with library(); using :: to call a function from a specific package
- Brief intro to using the pipe operator %>%
- Using an R project
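A short sketch of the items above: getting help with ?, calling a function with ::, attaching a library, and piping with %>%. It assumes the magrittr package is installed (it supplies %>% and is also made available by dplyr and the tidyverse). The variable names base_result and pipe_result are made up for this example.

```r
# Open the help page for a function (same as help("mean"))
?mean

# Use :: to call a function from a specific package without attaching it
stats::median(c(1, 3, 10))

# Attach a package so its functions can be called by name alone
library(magrittr)  # supplies the %>% pipe

# Base R style: nested calls are read from the inside out
base_result <- round(mean(mtcars$mpg), 1)

# Pipe style: the value on the left becomes the first argument on the right
pipe_result <- mtcars$mpg %>% mean() %>% round(1)

# Both approaches give the same answer
identical(base_result, pipe_result)
```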
In all the code used in the book, data are loaded from a folder called “Data” located in the same folder as my R project. If you download the data from [INSERT SITE], create an R project, and place the data in a folder called “Data” inside the project folder, you should be able to run all the code as-is.
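A minimal sketch of what loading from the “Data” folder might look like, assuming the project folder is the working directory (which is the case when the project is opened in RStudio). The file name mydata.csv is hypothetical and stands in for whichever file is being loaded.

```r
# Hypothetical file name; replace with an actual file from the Data folder
data_file <- file.path("Data", "mydata.csv")

# Relative paths are resolved against the working directory, which is
# the project folder when the R project is open
getwd()

# Load the file if it is where the code expects it; otherwise say where R looked
if (file.exists(data_file)) {
  mydata <- read.csv(data_file)
  str(mydata)
} else {
  message("File not found: ", data_file,
          " (working directory: ", getwd(), ")")
}
```

Building the path with file.path() and keeping it relative avoids hard-coding a location that only exists on one computer.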
List the datasets here, but put full descriptions in the Appendix.
- 20% subset used in some cases
- Explain how smoker and income were derived (put in Appendix)
Include code for downloading the data and loading them into R (some of the files are SAS XPT files).
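One possible shape for that code, as a sketch only. The helper name load_xpt is made up, the URL in the usage comment is a placeholder rather than a real address, and reading with the haven package is an assumption (foreign::read.xport() is an alternative).

```r
# Hypothetical helper: download a SAS transport (XPT) file into the Data
# folder if it is not already there, then read it into a data frame
load_xpt <- function(url, dest_dir = "Data") {
  # Create the destination folder on first use
  if (!dir.exists(dest_dir)) dir.create(dest_dir, recursive = TRUE)

  # Save the file under its original name
  dest <- file.path(dest_dir, basename(url))

  # Skip the download when the file already exists locally
  if (!file.exists(dest)) {
    # mode = "wb" writes in binary mode so the file is not corrupted on Windows
    download.file(url, destfile = dest, mode = "wb")
  }

  # Assumes the haven package is installed
  haven::read_xpt(dest)
}

# Usage (placeholder URL, not a real address):
# dat <- load_xpt("https://example.org/path/to/file.XPT")
```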