This chapter provides essential background knowledge. It first introduces the terminology and technology of the R programming language, as well as basic distinctions between data types and shapes that exist in most programming environments. Most of the chapter covers fundamental concepts and commands of base R. In short, the chapter is a brief crash course on R that lays the foundations for the rest of the book.
R is both a programming language and an environment for statistical computing (R Core Team, 2024). To study R, we first need to install some software on a computer and introduce some terminology for talking about data and code. As this is not an R textbook, we introduce these concepts and commands only briefly and in a playful fashion: by providing examples, interpreting code outputs, and completing a few exercises. Strictly speaking, knowing some R is not a precondition for reading and learning data science with this book (or the r4ds textbook by Wickham & Grolemund, 2017). However, having encountered certain terms and various base R commands before is helpful, partly because it lets you appreciate later how various tidyverse commands solve some problems in simpler or more robust ways.
As students come with various levels of experience, this chapter also serves to level out background differences. Anyone who has used R before can either skip this chapter or use it to re-familiarize themselves with basic concepts and commands of R. Novices should carefully work through the examples, aim to understand them, and try to solve the corresponding exercises (in Section 1.8). But do not despair or panic if anything seems cryptic or obscure at this point. Instead, make a mental note of the command or task that remained unclear and trust that things will clear up in subsequent chapters. It is quite likely that we will later re-visit these commands or learn to solve similar tasks in different ways.