1.2 Datasets

To run the code in this text without modification, create an R project and, in the same location as the project, a folder called “Data”. For more information on R projects, see “Workflow: scripts and projects” in R for Data Science (Wickham, Çetinkaya-Rundel, and Grolemund 2023). In the Data folder, place the teaching datasets listed below. Descriptions, including instructions for downloading and processing these datasets, can be found in Appendix A. In some cases, the datasets are provided as-is; in other cases, you will need to download and run some R code to create them.

  • NHANES (2017-2018)
  • United Nations Human Development Data (2020)
  • U.S. Natality (2018)
  • COVID-19 county-level data
  • NSDUH (2019)
  • Framingham Heart Study (BioLINCC teaching dataset)
  • CAMP (BioLINCC teaching dataset)
  • Digitalis (BioLINCC teaching dataset)
  • Opioid
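
As a minimal sketch of the setup described above, the following R code creates the Data folder if it does not already exist and lists whatever files it currently contains. It assumes the working directory is the root of your R project, which is normally the case when the project is opened in RStudio. The object names `data_dir` and `files_present` are illustrative and do not come from the text.

```r
# Assumption: the working directory is the R project root
# (check with getwd() if unsure).
data_dir <- "Data"

# Create the Data folder only if it is not already there
if (!dir.exists(data_dir)) {
  dir.create(data_dir)
}

# List the dataset files placed in the folder so far;
# this is an empty character vector until files are added
files_present <- list.files(data_dir)
print(files_present)
```

Once the datasets from Appendix A are in place, paths built with `file.path(data_dir, ...)` resolve relative to the project, so the same code works wherever the project folder is stored.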

NOTE: The datasets are meant for teaching, not research. The analyses herein are meant solely as teaching examples that illustrate how to apply regression methods in R. Results found in this text, and datasets provided with this text, should not be used to draw conclusions about any health conditions or relationships between variables.

References

Wickham, H., M. Çetinkaya-Rundel, and G. Grolemund. 2023. R for Data Science: Import, Tidy, Transform, Visualize, and Model Data. 2nd ed. Sebastopol, CA: O’Reilly Media.