11.3 Essentials of conditionals

Whereas most of our scripts so far were executed linearly (in a top-down, left-to-right, line-by-line fashion), using functions implies jumping around in large amounts of code. Strictly speaking, we have also been using statements that were evaluated from right to left (e.g., assignments like x <- 1) or bottom-to-top (e.g., when assigning a multi-line pipe of dplyr statements to an object). Also, given that we have been using functions all along, we really have been jumping around in base R code since our very first session.

This section addresses a special type of flow control: When thinking about the flow of information through a program (which can be a single R function, or an entire system of R packages), we often come to junctions at which we want to say: Based on some criterion being met, we either want to do this, or, if the criterion is not met, do something else. Such junctions are an important feature of any programming language and are typically handled by special functions, special control structures, or conditionals (i.e., “if-then” or “if-then-else” statements).

11.3.1 Flow control

Creating functions often requires controlling the flow of information within the body of a function. We can distinguish between several ways in which this can be achieved:

  • Special functions (e.g., return, print, or stop) can cause side effects or skip code (e.g., by exiting the function).

  • Functions often incorporate iteration and loops, which are covered in the next chapter (i.e., Chapter 12 on Iteration).

  • Testing input arguments or distinguishing between several cases requires the conditional execution of code, discussed in this section. In the definition of describe() above, we have seen that functions frequently require checking some properties of their inputs, distinguishing between cases, and controlling the flow of data processing based on test results. This is the job of conditional statements, which exist in many different forms. In this section, we only cover the most essential types.

11.3.2 If-then

A conditional statement conducts a test (which evaluates to either TRUE or FALSE) and executes additional code based on the value of the test. The simplest conditional in R is the if function, which implements the logic of if-then in the following if (test) {...} structure:
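A minimal sketch of this structure, assuming a variable test that holds a single logical value (the value TRUE is chosen here only for illustration):

```r
test <- TRUE  # assumed example value

if (test) {
  print("ok")  # executed, because test is TRUE
}
```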

Here, test must evaluate to a single Boolean value (i.e., either TRUE or FALSE). If test is TRUE, the code in the subsequent {...} is executed (here: "ok" is printed to the Console); otherwise, the code in the subsequent {...} is skipped, as if it were not there or had been commented out:
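The skipped case can be sketched in the same way, again assuming a variable test that we set ourselves. The object result is a hypothetical name, used only to show what a skipped if evaluates to:

```r
test <- FALSE  # assumed example value

if (test) {
  print("ok")  # skipped, because test is FALSE
}

# Without an else part, a skipped if() quietly evaluates to NULL:
result <- if (test) "ok"
is.null(result)  # TRUE
```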

Note that if test is a Boolean value, we do not need to ask for the condition test == TRUE.
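As a brief illustration (with an assumed value for test), both of the following conditionals behave identically, but the second is shorter. The base R function isTRUE() is shown as one possible safeguard for values that may be NA:

```r
test <- TRUE  # assumed example value

if (test == TRUE) { print("ok") }  # works, but the comparison is redundant
if (test) { print("ok") }          # equivalent and shorter

# isTRUE() returns FALSE (rather than NA) when its input is NA:
na_check <- isTRUE(NA)
na_check  # FALSE
```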

11.3.3 If-then-else

If a test fails, we often want something else to happen. To accommodate this desire, a slightly more complicated form of if statement includes an additional {...} after an else statement:
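A minimal sketch of this form, assuming a variable test that we set to FALSE for illustration:

```r
test <- FALSE  # assumed example value

if (test) {
  print("case 1")  # executed if test is TRUE
} else {
  print("case 2")  # executed if test is FALSE
}

# As if-else returns a value, its result can also be assigned:
result <- if (test) "case 1" else "case 2"
result  # "case 2"
```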

Here, the truth value of test determines whether the 1st or the 2nd {...} is executed. As test must be either TRUE or FALSE, we either see “case 1” printed (if test is TRUE) or “case 2” printed (if test is FALSE).

The following sequence illustrates how tests work (and can fail to work):
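One way to sketch such a sequence uses a deliberately naive helper function. Both the name guess_sex and the relationship terms are hypothetical choices for illustration, not taken from any package:

```r
# Hypothetical helper: a deliberately naive test that only checks for "mother"
guess_sex <- function(person) {
  if (person == "mother") {
    "female"
  } else {
    "male"
  }
}

guess_sex("mother")       # "female"
guess_sex("father")       # "male"
guess_sex("grandmother")  # "male" (wrong: the test is too narrow)

# A test that yields NA is neither TRUE nor FALSE, so if() stops with an error:
result <- try(guess_sex(NA), silent = TRUE)
inherits(result, "try-error")  # TRUE
```

The third call shows a test that runs without error but yields a wrong answer, whereas the last call shows a test that fails outright.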

11.3.4 Vectorized ifelse

A crucial limitation of R’s basic if statement is that its test accepts only a single TRUE or FALSE value. However, when writing functions, we often want to make them work with vectors of input values, rather than a single input. Testing multiple values at once is possible with the ifelse(test, yes, no) function, which evaluates a vectorized test and recycles its yes and no arguments to the length of test:
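A short sketch with an assumed numeric vector x shows how each element is tested separately and how the single values of yes and no are recycled:

```r
x <- c(-2, 0, 3, 5)  # assumed example values

signs <- ifelse(x > 0, "positive", "not positive")
signs
# "not positive" "not positive" "positive" "positive"
```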

Note that the yes and no values used with ifelse should typically be of the same type, and that NA values in test yield NA in the result:
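Both points can be illustrated with an assumed vector x that contains a missing value. The object names sizes and mixed are hypothetical:

```r
x <- c(1, NA, 3)  # assumed example values

# An NA in the test yields an NA in the result:
sizes <- ifelse(x > 2, "big", "small")
sizes  # "small" NA "big"

# Mixing types in yes and no leads to coercion (here: 0 becomes "0"):
mixed <- ifelse(x > 2, "big", 0)
mixed            # "0" NA "big"
class(mixed)     # "character"
```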

11.3.5 More complex tests

The condition test of a conditional statement can combine multiple tests. If so, each individual test must evaluate to a single TRUE or FALSE, and the different tests are linked with && or ||. These work like the logical connectors & and |, but operate on single values and are evaluated sequentially (from left to right), stopping as soon as the overall result is determined:
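The following sketch (with assumed values for x and y) contrasts the two kinds of connectors. The last example relies on the sequential evaluation: as its left side is already FALSE, its right side is never evaluated.

```r
x <- 5     # assumed example value
y <- NULL  # assumed example value

both   <- x > 0 && x < 10  # TRUE: both tests are TRUE
either <- x < 0 || x > 3   # TRUE: the second test is TRUE

# The right side (y > 0) would not yield a single TRUE or FALSE for NULL,
# but it is never evaluated, as the left side is already FALSE:
safe <- !is.null(y) && y > 0  # FALSE

# In contrast, & and | work element-wise on vectors:
elementwise <- c(TRUE, FALSE) & c(TRUE, TRUE)  # TRUE FALSE
```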

Example

Here’s a way to fix our problem from above (i.e., evaluating “grandmother” as “male”) by implementing a more comprehensive test:
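A possible sketch of such a test links several comparisons with ||. The function name guess_sex_2 and the selection of relationship terms are hypothetical and serve only as an illustration:

```r
# Hypothetical helper: a more comprehensive (but still incomplete) test
guess_sex_2 <- function(person) {
  if (person == "mother" || person == "grandmother" || person == "daughter") {
    "female"
  } else {
    "male"
  }
}

guess_sex_2("grandmother")  # "female"
guess_sex_2("father")       # "male"
```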

A vectorized version of this if-then-else statement can be written with ifelse(), but will still mis-classify anything not considered when designing the test (e.g., stepmothers, broomsticks, etc.):
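A sketch of such a vectorized version could use %in% to compare each element against a set of terms. The vectors people and female_terms are assumptions made for this example:

```r
people       <- c("mother", "grandmother", "father", "stepmother", "broomstick")
female_terms <- c("mother", "grandmother", "daughter")  # assumed, incomplete set

guesses <- ifelse(people %in% female_terms, "female", "male")
guesses
# "female" "female" "male" "male" "male"
```

The last two results are wrong, as neither "stepmother" nor "broomstick" was considered when defining female_terms.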

More cases

As we can replace any {...} in a conditional statement if (test) {...} else {...} by another conditional statement, we can distinguish more than 2 cases:
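A minimal sketch with 3 cases, assuming two logical variables test_1 and test_2 (both set to FALSE here, so that the final case is reached):

```r
test_1 <- FALSE  # assumed example values
test_2 <- FALSE

result <- if (test_1) {
  "case 1"
} else if (test_2) {
  "case 2"
} else {
  "else"
}

print(result)  # "else"
```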

Here, 2 cases are contingent on their corresponding condition being TRUE; otherwise, the final {...} is reached and "else" is printed. Thus, an “else case” often serves as a generic case that occurs when none of the earlier tests is TRUE.

Note that the following variant of this conditional is different:
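A sketch of this variant, assuming three logical variables test_1 to test_3 that are all set to FALSE for illustration:

```r
test_1 <- FALSE  # assumed example values
test_2 <- FALSE
test_3 <- FALSE

result <- if (test_1) {
  "case 1"
} else if (test_2) {
  "case 2"
} else if (test_3) {
  "else"
}

is.null(result)  # TRUE: no case was reached
```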

Here, the final {...} is contingent on another test, test_3, being TRUE. Thus, the final "else" is printed only if test_1 and test_2 are both FALSE and test_3 is TRUE. If all 3 tests fail, none of the cases is reached and nothing is printed.

Note

  • When a test evaluates to TRUE, the corresponding {...} is evaluated and any later instances of test and {...} are skipped. Thus, only a single case of {...} is evaluated, even if multiple tests would evaluate to TRUE.
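This can be verified with a small sketch in which two tests would be TRUE for an assumed value of x:

```r
x <- 15  # assumed example value: both x > 5 and x > 10 are TRUE

result <- if (x > 5) {
  "greater than 5"
} else if (x > 10) {
  "greater than 10"  # never reached for x = 15
} else {
  "5 or less"
}

result  # "greater than 5"
```

As the order of tests matters, more specific tests should usually appear before more general ones.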

11.3.6 Switch

A useful alternative to overly complicated if statements is switch, which selects one of a list of alternatives:
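A minimal sketch, assuming two numbers x and y and a character variable op that names an arithmetic operation (the names of the alternatives are chosen freely for this example):

```r
x  <- 6       # assumed example values
y  <- 3
op <- "plus"

result <- switch(op,
                 "plus"   = x + y,
                 "minus"  = x - y,
                 "times"  = x * y,
                 "divide" = x / y,
                 stop("Unknown op!"))  # unnamed final argument: default case

result  # 9
```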

Here, op was specified as a character variable. If switch is used with a numeric expression i, it selects the i-th case (with i being coerced into an integer). For example:
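A short sketch with unnamed alternatives, assuming a numeric variable i:

```r
i <- 2  # assumed example value

second <- switch(i, "one", "two", "three")
second  # "two"

# A non-integer number is coerced into an integer first (here: 2.7 becomes 2):
coerced <- switch(2.7, "one", "two", "three")
coerced  # "two"

# If i exceeds the number of alternatives, switch() quietly returns NULL:
none <- switch(4, "one", "two", "three")
is.null(none)  # TRUE
```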

Practice

Let’s practice what we have learned about conditionals in R.

A conditional nursery rhyme

Consider the following check_flow function:

The function appears to implement some nursery rhyme, but is unfortunately really messy [51]. Hence, we need to clean up this code before we can even begin trying to understand the function.

  1. Format the function so that it becomes easier to read and parse.

A possible solution would indent commands, place any } on a new line, and generally introduce lots of white space, as follows:

  2. Describe and try to understand this function. What does it do and how does it do it?

  3. Answer the following questions and predict the corresponding results:

    • Which cases does the 1st conditional statement distinguish?
    • When is the 1st switch statement reached? When is the 2nd switch statement reached?
    • What is the difference between the print and the return statements?
    • Under which conditions does the function return "raus bist du"?
    • What happens when you call check_flow() or check_flow(NA)?

  4. Test your predictions by evaluating the following calls of the check_flow() function:

  [51] Actually, this example illustrates pretty well how the functions of students tend to look when they first start writing functions. Imagine searching for a typo in code formatted like this…