Part 2: Programming basics

This part introduces basic concepts of computer programming. In our introduction, we quoted Frank Harrell (Section 1.2.3.1):

Can one be a good data analyst without being a half-good programmer?
The short answer to that is, ‘No.’
The long answer to that is, ‘No.’

Frank Harrell (1999), S-PLUS User Conference, New Orleans

and remarked that the notion of a “half-good programmer” remains somewhat vague. This part makes the notion more concrete by providing a basic programming curriculum for new data analysts. This curriculum contains three chapters:

  • Chapter 4 discusses conditionals for verifying data and distinguishing between cases. After introducing basic and advanced conditionals, we will see that we were already using conditional data structures in Chapters 2 and 3.

  • Chapter 5 enables us to create our own functions. By providing a powerful tool for abstraction and modularization, writing functions will advance our programming skills by a crucial step.

  • Chapter 6 introduces iteration for executing parts of code repeatedly. In R, iteration can be explicit or implicit. Explicit iteration relies on loops built from R's for, while, or repeat structures. Implicit iteration relies on vectorized functions or on the base R apply() or purrr map() families of functions, which apply a function directly to the elements of a data structure.
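
To give a first impression of how the three topics fit together, here is a minimal sketch in base R. The vector scores, the function label_score(), and its cutoff argument are hypothetical names made up for this illustration. They do not appear in the following chapters.

```r
# Hypothetical data: four measurements, one of them missing
scores <- c(12, NA, 7, 30)

# A conditional (Chapter 4) wrapped in a function (Chapter 5):
# classify a single value as "missing", "low", or "high"
label_score <- function(v, cutoff = 10) {
  if (is.na(v)) {
    "missing"
  } else if (v < cutoff) {
    "low"
  } else {
    "high"
  }
}

# Explicit iteration (Chapter 6): a for loop fills a pre-allocated vector
labels_loop <- character(length(scores))
for (i in seq_along(scores)) {
  labels_loop[i] <- label_score(scores[i])
}

# Implicit iteration: sapply() applies the function to each element
labels_apply <- sapply(scores, label_score)

# Implicit iteration: the vectorized ifelse() handles all elements at once
labels_vec <- ifelse(is.na(scores), "missing",
                     ifelse(scores < 10, "low", "high"))
```

All three approaches should yield the same character vector of labels. Which one to prefer depends on the task: loops make every step visible, whereas vectorized and apply-style solutions tend to be shorter.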

Although a basic familiarity with these concepts will not make us expert programmers, it lays a foundation on which we can build more sophisticated programming skills later.