Resources

We conclude each chapter with links to additional resources. In this introduction, these are pointers to the materials and software requirements of this book, as well as related resources on R (R Core Team, 2024) and the tidyverse (Wickham et al., 2019).

This book and course

Resources related to this book and course at the University of Konstanz, 2023:

Textbook

The textbook for this course is Data Science for Psychologists (Neth, 2023):

For some topics not covered here, we provide pointers to R for Data Science (Wickham & Grolemund, 2017) and its 2nd edition (Wickham, Çetinkaya-Rundel, & Grolemund, 2023).

Software and packages

Working through this book assumes an installation of three types of software programs:

  1. An R engine: The R project for statistical computing is the origin of all things R. A current distribution of R for your machine (e.g., R version 4.4.1, 2024-06-14) can be downloaded from one of its mirrors.

  2. An R interface: The RStudio IDE (by Posit) provides an integrated development environment (IDE) for R.[1]

  3. Additional tools: The R packages of the tidyverse (Wickham, 2023) and ds4psy (Neth, 2023).
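As a minimal sketch of the third step, the following R code checks which of the two packages named above are still missing on your machine. The variable names (pkgs, is_installed, missing_pkgs) are only illustrative, and the installation call is left as a comment because it needs an internet connection:

```r
# Packages assumed by this book (names taken from the list above):
pkgs <- c("tidyverse", "ds4psy")

# Which of them are not yet installed on this machine?
is_installed <- vapply(pkgs, requireNamespace, logical(1), quietly = TRUE)
missing_pkgs <- pkgs[!is_installed]
missing_pkgs

# Install only the missing ones (needs an internet connection), e.g.:
# install.packages(missing_pkgs)

# Afterwards, load the packages at the start of each R session:
# library(tidyverse)
# library(ds4psy)
```

Installing a package is needed only once per machine, whereas library() has to be called in every new session that uses the package.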

Language references

Because the R language, every R package, and every R function are extensively documented, the best strategy for answering a question is to consult an official reference source (rather than doing an internet search). While the official references of the R language can initially be intimidating, they are the most authoritative and often the fastest way of finding answers:

  • Most questions concerning details of R can be settled by reading the R Language Definition that is available from the Help page of any R system.

  • The details of particular functions are best resolved by studying the function’s documentation. For a function named foo, its documentation can be shown by evaluating ?foo. Even when some of the documentation may be hard to understand, working through the Examples is usually helpful.

  • For a general collection of materials and scripts, see The R Manuals and other documentation. Corresponding links are provided on the main Help page of any R system.
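The ways of consulting the documentation mentioned above can be sketched as follows. The base R function mean and the stats package are used only as stand-ins for the function foo or any package you are curious about:

```r
# Open the documentation of a function (here: the base R function mean):
?mean
help("mean")        # the same request, written as a function call

# Run the code in the Examples section of a help page:
example("mean")

# Search the installed documentation when the function name is unknown:
help.search("standard deviation")   # short form: ??"standard deviation"

# List the documented contents of a package:
help(package = "stats")

# Open the main Help page (with links to the R manuals):
# help.start()
```

In RStudio, the results of these commands are shown in the Help window.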

The RStudio IDE

The distinctions between R, R packages, and RStudio are somewhat confusing at first and will be explained in more detail in Chapter 1: Basic R concepts and commands (see Section 1.1.3). At this point, it is good to know that we can interact with R and manage our R packages within the RStudio IDE. Given the large variety of functions and levels, this interface is divided into many sub-windows that can be arranged and expanded in various ways. To get started, we only need to distinguish between the main Editor window (typically located on the top left), the Console (for entering R commands), and a few auxiliary windows that may display outputs (e.g., a Viewer for showing visualizations) and provide information on our current Environment or the Packages available on our computer. A particularly useful window is Help: Although its main page mostly provides links to online materials, any R package contains detailed documentation and examples of its functions that can be browsed in this window.

Figure 0.4 shows the Posit cheatsheets on the RStudio IDE and illustrates that there are dozens of other functions available. As you get more experienced, you will discover lots of nifty features and shortcuts. Especially foldable sections and keyboard shortcuts (see Alt + Shift + K for an overview) can make your life in R a lot easier. But don’t let the abundance of options overwhelm you — I have yet to meet a person who needs or uses all of them.

Figure 0.4: The RStudio IDE (from Posit cheatsheets).

A useful feature of the RStudio IDE is that collections of files can be combined into projects. For instance, it makes sense to store everything related to this course in a dedicated directory on your hard drive (e.g., in a folder “ds4psy”) and create an RStudio project (also named ds4psy) that uses this directory as its root. An immediate benefit of using projects is that your entire workflow gets more organized.[2]
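One practical consequence of a project root is that file paths can be written relative to it. The following sketch imitates this with a temporary folder. The sub-folder data, the file scores.csv, and the small data frame are hypothetical examples; in a real RStudio project, opening the project sets the working directory to the project root, so that a relative path like "data/scores.csv" would suffice:

```r
# Stand-in for a project folder (a real project would live on your hard drive):
project_root <- file.path(tempdir(), "ds4psy")
dir.create(file.path(project_root, "data"), recursive = TRUE, showWarnings = FALSE)

# Save a small data frame using a path built from the project root:
scores <- data.frame(id = 1:3, score = c(10, 12, 9))
csv_path <- file.path(project_root, "data", "scores.csv")
write.csv(scores, csv_path, row.names = FALSE)

# Read it back in and check its dimensions:
scores_2 <- read.csv(csv_path)
dim(scores_2)
```

Keeping data, scripts, and outputs below one root makes it easy to move or share the entire folder without breaking any paths.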

R Markdown

R Markdown allows weaving text and code into reproducible research documents. For quick instructions on combining text and code, see Appendix F, or read the more detailed introduction of Chapter 27: R Markdown of the r4ds textbook. Alternatively, just start with one of the following templates:

  • minimal template: rmd_template_s [in .Rmd | .html format]

  • medium template: rmd_template_m [in .Rmd | .html format]

  • explicit explanations: Rmarkdown_basics [in .Rmd | .html format]

A typical R Markdown document consists of three distinct parts:

  1. A header for setting global document options;
  2. Text that may contain headings, paragraphs, and itemized lists; and
  3. Code chunks that contain and evaluate R code.

When using R Markdown (typically saved with the file extension .Rmd), you can generate various output formats to show and transfer your work. I recommend generating output documents in HTML format (i.e., .html files), as they can easily be exchanged and shown on most devices and platforms.
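To make the three parts concrete, the following sketch writes a minimal R Markdown file from R and renders it to HTML. The file name my_report.Rmd, its title, and its contents are made up for illustration. Rendering is attempted only if the rmarkdown package and pandoc are available:

```r
# Three backticks, built here to avoid nesting code fences in this example:
fence <- strrep("`", 3)

rmd_lines <- c(
  "---",                          # 1. Header with global document options
  "title: \"My first report\"",
  "output: html_document",
  "---",
  "",
  "## A heading",                 # 2. Text: headings, paragraphs, lists
  "",
  "Some text, followed by a list:",
  "",
  "- first item",
  "- second item",
  "",
  paste0(fence, "{r}"),           # 3. Code chunk that is evaluated on rendering
  "x <- c(1, 2, 3)",
  "mean(x)",
  fence
)

# Write the lines to a (hypothetical) file in a temporary directory:
rmd_file <- file.path(tempdir(), "my_report.Rmd")
writeLines(rmd_lines, rmd_file)

# Render to HTML (requires the rmarkdown package and pandoc):
if (requireNamespace("rmarkdown", quietly = TRUE) && rmarkdown::pandoc_available()) {
  rmarkdown::render(rmd_file, quiet = TRUE)  # creates my_report.html next to the .Rmd
}
```

In everyday use, you would simply edit the .Rmd file in the RStudio Editor and click the Knit button, which calls the same rendering function.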

Fortunately, the range of commands required to benefit from R Markdown is very limited. For instance, the commands in the help file Help > Markdown Quick Reference of RStudio provide a good start for creating beautiful and functional documents. Beyond these basics, the R Markdown Cheatsheet — also available in RStudio by selecting Help > Cheatsheets > R Markdown Cheat Sheet — provides a more comprehensive overview of R Markdown functionality and commands:

Figure 0.5: R Markdown cheatsheet (from Posit cheatsheets).

Other books

This book and course were originally based on R for Data Science (Wickham & Grolemund, 2017). That book is more general and more tidyverse-centric than the present one, and is quickly becoming a classic:

  • Wickham, H., & Grolemund, G. (2017). R for data science: Import, tidy, transform, visualize, and model data. Sebastopol, CA: O’Reilly Media, Inc. [Available at http://r4ds.had.co.nz.]

The updated and expanded 2nd edition is:

  • Wickham, H., Çetinkaya-Rundel, M., & Grolemund, G. (2023). R for data science: Import, tidy, transform, visualize, and model data (2nd edition). Sebastopol, CA: O’Reilly Media, Inc. [Available at https://r4ds.hadley.nz/.]

The ebook R for data science: Exercise solutions (by Jeffrey B. Arnold) provides solutions to the exercises in r4ds.

There are many other excellent books (and even more fragmentary and bad books) on data science in R for various audiences. Here are some recommendations for finding additional texts and courses on learning data science or statistics with R:

  • Bookdown.org is a major catalyst for data science in R, as it provides many great books on various topics at no charge. The archive page contains books on an even wider selection of topics. Due to the grass-roots nature of the site, many books are unfinished and of low quality. However, there are also many excellent ones. Some easy recommendations include:

Statistics with R

R does many things beyond statistics. But as R was designed as a programming language for statistics, many textbooks approach R from this angle. Available examples include:

Web sites and blogs

Online information on R is abundant, but can be hard to navigate. Useful starting points include:

  • Intro2R provides a gentle 3-day introduction to R.

  • Quick-R (by Robert Kabacoff) is a popular website on R programming that also provides many pointers for using R in statistics.

  • R-bloggers collects blog posts on R.

  • The Simply statistics blog (by Rafa Irizarry, Roger Peng, and Jeff Leek) provides insightful and inspiring articles on many data science topics.

  • The Win vector blog (by John Mount and Nina Zumel) provides noteworthy observations on particular problems and data science in general.

  • The Learning Machines blog (by Holger K. von Jouanne-Diedrich) contains many articles worth reading on using R for modeling and machine learning.

  • Towards data science provides background articles on current data science issues.

Educational resources

Other R courses and exercises include:

Miscellaneous

Other helpful links that do not fit into the above categories include:


[index.Rmd updated on 2024-12-20 by hn.]

References

Allaire, J., Xie, Y., Dervieux, C., McPherson, J., Luraschi, J., Ushey, K., … Iannone, R. (2024). rmarkdown: Dynamic documents for R. Retrieved from https://github.com/rstudio/rmarkdown
Neth, H. (2023). ds4psy: Data science for psychologists. https://doi.org/10.5281/zenodo.7229812
R Core Team. (2024). R base: A language and environment for statistical computing. Retrieved from https://www.R-project.org
Wickham, H. (2014a). Advanced R (1st ed.). Retrieved from http://adv-r.had.co.nz/
Wickham, H. (2015). R packages: Organize, test, document, and share your code. Retrieved from http://r-pkgs.had.co.nz/
Wickham, H. (2019). Advanced R (2nd ed.). Retrieved from https://adv-r.hadley.nz/
Wickham, H. (2023). tidyverse: Easily install and load the ’tidyverse’. Retrieved from https://tidyverse.tidyverse.org
Wickham, H., Averick, M., Bryan, J., Chang, W., McGowan, L. D., François, R., … Yutani, H. (2019). Welcome to the tidyverse. Journal of Open Source Software, 4(43), 1686. https://doi.org/10.21105/joss.01686
Wickham, H., Çetinkaya-Rundel, M., & Grolemund, G. (2023). R for data science (2nd ed.). Retrieved from https://r4ds.hadley.nz
Wickham, H., & Grolemund, G. (2017). R for data science: Import, tidy, transform, visualize, and model data. Retrieved from http://r4ds.had.co.nz
Xie, Y. (2024b). knitr: A general-purpose package for dynamic report generation in R. Retrieved from https://yihui.org/knitr/

  1. Installing RStudio typically provides many additional R packages. Two packages we will use extensively are knitr (Xie, 2024b) and rmarkdown (Allaire et al., 2024).

  2. See the introductory chapters of R for Data Science (Wickham & Grolemund, 2017) for short but helpful instructions on organizing your workflow with RStudio, especially the even-numbered chapters basics (Chapter 4), scripts (Chapter 6), and projects (Chapter 8).

  3. Disclaimer: When first starting to teach this course, I inherited its materials from Nathaniel. See Rpository.com/learnR/ for a course with corresponding exercises and solutions.