12.5 Exercises

Here are some basic exercises on loops and applying functions to data structures:

12.5.1 Exercise 1

Fibonacci loop and functions

  1. Look up the term Fibonacci numbers (e.g., on Wikipedia) and use a for loop to create a numeric vector of the first 25 Fibonacci numbers (for a series of numbers starting with 0, 1).

Hint: The series of Fibonacci numbers was previously introduced in our discussion of recursion (see Section 11.4.1). We are now looking for an iterative definition, but the underlying processes are quite similar. Essentially, the recursive definition resulted in an implicit loop, whereas we now explicitly define the iteration.

  2. Incorporate your for loop into a fibonacci() function that returns a numeric vector of the first n Fibonacci numbers. Test your function for fibonacci(n = 25).

  3. Generalize your fibonacci() function to also accept the first two elements (e1 and e2) as inputs to the series and then create the first n Fibonacci numbers given these initial elements. Test your function for fibonacci(e1 = 1, e2 = 3, n = 25).
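
The iterative pattern behind this exercise (pre-allocate a vector, set the starting values, then fill each remaining position from earlier ones) can be sketched on a different series, so as not to give away the solution. The following is a minimal sketch that assumes a series of triangular numbers, in which each element is its predecessor plus the current index; the names n and tri are illustrative only:

```r
# Sketch: iterative definition of the first n triangular numbers
# (1, 3, 6, 10, ...), where tri[i] = tri[i - 1] + i.
n   <- 10
tri <- rep(NA_real_, n)  # pre-allocate a numeric vector of length n
tri[1] <- 1              # starting value

for (i in 2:n) {
  tri[i] <- tri[i - 1] + i  # each element depends on its predecessor
}

tri
```

A Fibonacci series differs in that it needs two starting values and each new element depends on its two predecessors.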

12.5.2 Exercise 2

Looping for divisors

  1. Write a for loop that prints out all positive divisors of the number 1000.

Hint: Use N %% x == 0 to test whether x is a divisor of N.

  2. How many iterations did your loop require? Could you achieve the same results with fewer iterations?

  3. Write a divisors() function that uses a for loop to return a numeric vector containing all positive divisors of a natural number N.

Hint: Note that we do not know the length of the resulting vector in advance.

  4. Use your divisors() function to answer the question: Does the number 1001 have fewer or more divisors than the number 1000?

  5. Use your divisors() function and another for loop to answer the question: Which prime numbers exist between the number 111 and the number 1111?

Hint: A prime number (e.g., 13) has only two divisors: The number 1 and the number itself.
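
Two of the hints above (the modulo test and a result vector of unknown length) can be illustrated on an unrelated task. The following is only a sketch: collecting the multiples of 7 in 1:50 is a made-up example, and the name out is illustrative:

```r
# Sketch: the modulo test and a result vector whose length is not known in advance.
12 %% 4 == 0  # TRUE:  4 divides 12 without a remainder
12 %% 5 == 0  # FALSE: 12 divided by 5 leaves a remainder of 2

# Collect all multiples of 7 in 1:50 (hypothetical task):
out <- c()  # start with an empty vector (NULL)

for (x in 1:50) {
  if (x %% 7 == 0) {
    out <- c(out, x)  # grow the vector by one element
  }
}

out
length(out)
```

Growing a vector with c() copies it on every step. For small ranges this hardly matters, but it is one reason why pre-allocating is preferred whenever the final length is known.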

12.5.3 Exercise 3

Let’s revisit our favorite randomizing devices one more time:

  • In Chapter 1, we first explored the ds4psy functions coin() and dice() (see Section 1.6.4 and Exercise 3 in Section 1.8.3).

  • In Exercise 4 of Chapter 11 (see Section 11.6.4), we wrote my_coin() and my_dice() functions by calling either these ds4psy functions or the base R sample() function.

  • In this exercise, we will use for and while loops to repeatedly call an existing function.

Throwing dice in loops

  1. Implement a function my_dice() that uses the base R function sample() to simulate a throw of a dice (i.e., yielding an integer from 1 to 6 with equal probability).

  2. Add an argument N (for the number of throws) to your function and modify it by using a for loop to throw the dice N times, and returning a vector of length N that shows the results of the N throws.

Hint: This task corresponds to Exercise 4 of Chapter 11 (see Section 11.6.4).

  3. Use a while loop to throw my_dice(N = 1) until you throw the number 6 twice in a row and show the sequence of all throws up to this point.

Hint: Given a sequence throws, the i-th element is throws[i]. Hence, the last element of throws is throws[length(throws)].

  4. Use your solution to 3. to conduct a simulation that addresses the following question:
  • How many times on average do we need to throw my_dice(1) to obtain the number 6 twice in a row?

Hint: Use a for loop to run your solution to 3. T = 10000 times and store the length of each resulting sequence of throws in a numeric vector.
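
The general shape of a while loop that keeps sampling until a stopping condition holds can be sketched with a simpler condition than the one asked for here. In this sketch, drawing digits from 0 to 9 until a 0 appears is a made-up task, and the name draws is illustrative:

```r
# Sketch: repeat a random draw until a stopping condition is met.
set.seed(123)  # only for reproducibility

draws <- sample(0:9, size = 1)  # first draw

while (draws[length(draws)] != 0) {         # inspect the last element
  draws <- c(draws, sample(0:9, size = 1))  # append a new draw
}

draws          # the full sequence of draws
length(draws)  # number of draws needed
```

A condition that involves two successive outcomes must also inspect the second-to-last element, which exists only once the sequence contains at least two elements.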

Disclaimer

This exercise shows how loops can be used to generate and collect multiple outputs. This can sometimes replace vector arguments to functions. However, as R is optimized for vectors, using loops rather than vectors is not generally recommended.
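
The contrast between a loop and a vector argument can be sketched with a coin rather than a dice. This is a minimal illustration that assumes only the base R function sample(); the names sides, flips_loop, and flips_vec are illustrative:

```r
# Sketch: N coin flips, once with a loop and once with a single vectorized call.
set.seed(42)  # only for reproducibility
N     <- 8
sides <- c("H", "T")

# (a) Loop: one call to sample() per flip
flips_loop <- rep(NA_character_, N)
for (i in 1:N) {
  flips_loop[i] <- sample(sides, size = 1)
}

# (b) Vector: one call to sample() that draws all N flips at once
flips_vec <- sample(sides, size = N, replace = TRUE)

flips_loop
flips_vec
```

Both versions describe the same random process, but they yield different sequences here because they consume the random number stream at different points.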

12.5.4 Exercise 4

Mapping functions to data

Write code that uses a function of the base R apply or purrr map family of functions to:

  1. Compute the mean of every column in mtcars.
  2. Determine the type of each column in ggplot2::diamonds.
  3. Compute the number of unique values in each column of iris.
  4. Generate 10 random normal numbers for each of μ = −100, 0, and 100.

Note: This exercise is based on Exercise 1 of Section 21.5.3 in r4ds.
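
As a reminder of how such calls look, here is a minimal sketch on a made-up data frame (df and its columns are hypothetical, and max() stands in for whatever function is to be mapped). It uses base R only and mentions the purrr counterpart in a comment:

```r
# Sketch: applying a function to every column of a data frame.
df <- data.frame(a = c(1, 5, 3),
                 b = c(2, 4, 6),
                 c = c(9, 8, 7))

res_l <- lapply(df, max)  # returns a list with one element per column
res_s <- sapply(df, max)  # simplifies the result to a named numeric vector
res_v <- vapply(df, max, FUN.VALUE = numeric(1))  # as sapply(), but type-checked

# With purrr loaded, purrr::map_dbl(df, max) would play the role of vapply().

res_s
```

Since a data frame is a list of columns, these functions iterate over columns without an explicit loop.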

12.5.5 Exercise 5

Z-transforming tables

In this exercise, we will standardize an entire table of data (using a for loop, an apply(), and a map() function). We will first write a utility function that achieves the desired transformation for a vector and then compare and contrast different ways of applying this function to a table of data.

In case you are not familiar with the notion of a z score or standard score, look up these terms (e.g., on Wikipedia).
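
As a brief reminder, a z score expresses each value as its distance from the mean in units of the standard deviation, \(z_i = (v_i - \bar{v}) / s_v\). The following numeric sketch uses made-up vectors (v and v_na are illustrative names) and also shows why missing values need attention:

```r
# Sketch: z scores of a small vector, computed step by step.
v <- c(2, 4, 6)
mean(v)  # 4
sd(v)    # 2

z <- (v - mean(v)) / sd(v)
z        # -1 0 1

# A single NA turns the mean (and hence every z score) into NA:
v_na <- c(2, NA, 6)
mean(v_na)                # NA
mean(v_na, na.rm = TRUE)  # 4
```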

  1. Write a function called z_trans that takes a vector v as input and returns the z-transformed (or standardized) values as output if v is numeric and returns v unchanged if it is non-numeric.
    Hint: Remember that z <- (v - mean(v)) / sd(v), but beware that v could contain NA values.

  2. Load the false positive psychology dataset (see Section B.2 of Appendix B) into falsePosPsy and remove any non-numeric variables from it.

# Load data: 
falsePosPsy <- ds4psy::falsePosPsy_all  # from ds4psy package
# falsePosPsy <- readr::read_csv("http://rpository.com/ds4psy/data/falsePosPsy_all.csv")  # online

falsePosPsy
#> # A tibble: 78 × 19
#>    study    ID  aged aged365 female   dad   mom potato when64 kalimba cond   
#>    <dbl> <dbl> <dbl>   <dbl>  <dbl> <dbl> <dbl>  <dbl>  <dbl>   <dbl> <chr>  
#>  1     1     1  6765    18.5      0    49    45      0      0       1 control
#>  2     1     2  7715    21.1      1    63    62      0      1       0 64     
#>  3     1     3  7630    20.9      0    61    59      0      1       0 64     
#>  4     1     4  7543    20.7      0    54    51      0      0       1 control
#>  5     1     5  7849    21.5      0    47    43      0      1       0 64     
#>  6     1     6  7581    20.8      1    49    50      0      1       0 64     
#>  7     1     7  7534    20.6      1    56    55      0      0       1 control
#>  8     1     8  6678    18.3      1    45    45      0      1       0 64     
#>  9     1     9  6970    19.1      0    53    51      1      0       0 potato 
#> 10     1    10  7681    21.0      0    53    51      0      1       0 64     
#> # … with 68 more rows, and 8 more variables: root <dbl>, bird <dbl>,
#> #   political <dbl>, quarterback <dbl>, olddays <dbl>, feelold <dbl>,
#> #   computer <dbl>, diner <dbl>
  • Use an appropriate map function to create a single vector that indicates, for each column in falsePosPsy, whether or not it is a numeric variable.

Hint: The function is.numeric() tests whether a vector is numeric.

  • Use this vector to select only the numeric columns of falsePosPsy into a new tibble fpp_numeric.

  • Use a for loop to apply your z_trans() function to fpp_numeric to standardize all of its columns.

  • Turn your resulting data structure into a tibble out_1 and print it.

  3. Repeat the task of 2. (i.e., applying z_trans() to all numeric columns of falsePosPsy) by using the base R apply function, rather than a for loop. Save and print your resulting data structure as a tibble out_2.

Hint: Remember to set the MARGIN argument to apply z_trans() over all columns, rather than rows.

  4. Repeat the task of 2. and 3. (i.e., applying z_trans() to all numeric columns of falsePosPsy) by using an appropriate version of a map function from the purrr package. Save and print your resulting data structure as a tibble out_3.

Hint: Note that the desired output structure is a rectangular data table, which is also a list.

  5. Use all.equal to verify that your results of 2., 3. and 4. (i.e., out_1, out_2, and out_3) are all equal.

Hint: If a tibble t1 lacks variable names, you can add those of another tibble t2 by assigning names(t1) <- names(t2).
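
The hints on MARGIN and on missing variable names can be illustrated on a tiny made-up table. This sketch uses base R data frames rather than tibbles to stay self-contained; df, t1, and t2 are illustrative names, and doubling each column merely stands in for a real transformation:

```r
# Sketch: column-wise apply() and repairing names before all.equal().
df <- data.frame(x = c(1, 2, 3), y = c(10, 20, 30))

# MARGIN = 2 applies the function to each column (MARGIN = 1: to each row):
m <- apply(df, MARGIN = 2, FUN = function(col) col * 2)
is.matrix(m)  # TRUE: apply() returns a matrix here, not a data frame

# Suppose the variable names were lost along the way:
t1 <- as.data.frame(unname(m))                          # names are now V1, V2
t2 <- data.frame(x = c(2, 4, 6), y = c(20, 40, 60))     # reference result

before <- isTRUE(all.equal(t1, t2))  # FALSE: the names differ
names(t1) <- names(t2)               # copy the names from t2
after  <- isTRUE(all.equal(t1, t2))  # TRUE: same names and values

c(before = before, after = after)
```

Wrapping all.equal() in isTRUE() is useful because all.equal() returns a description of the differences, rather than FALSE, when two objects are not equal.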

12.5.6 Exercise 6

Cumulative savings revisited

In Exercise 2 of Chapter 1: Basic R concepts and commands, we computed the cumulative value of an initial investment amount a = 1000, given an annual interest rate int of 0.1% and an annual rate of inflation inf of 2%, after a number of n full years (e.g., n = 10):

# Task parameters: 
a <- 1000      # initial amount: $1000
int <- .1/100  # annual interest rate of 0.1%
inf <- 2/100   # annual inflation rate 2%
n   <- 10      # number of years

Our solution in Chapter 1 consisted of an arithmetic formula that computes a new total based on the current task parameters:

# Previous solution (see Exercise 2 of Chapter 1): 
total <- a * (1 + int - inf)^n
total
#> [1] 825.4487

Given our new skills in writing loops and functions (from Chapter 11), we can solve this task in a variety of ways. This exercise illustrates some differences between loops, a function that implements the formula, and a vector-based solution. Although all these approaches solve the same problem, they differ in important ways.

  1. Write a for loop that iteratively computes the current value of your investment after each of 1:n years (with \(n \geq 1\)).

Hint: Express the new value of your investment a as a function of its current value a and its change based on inf and int in each year.

  2. Write a function compute_value() that takes a, int, inf, and n as its arguments, and directly computes and returns the cumulative total after n years.

Hint: Translate the arithmetic solution (shown above) into a function that directly computes the new total. Use sensible default values for your function.

  3. Write a for loop that iteratively calls your function compute_value() for every year n.

  4. Check whether your compute_value() function also works for a vector of year values n. Then discuss the differences between the solutions to Exercise 6.1, 6.3, and 6.4.
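
To prepare this discussion, the three styles can be compared on a generic example that is unrelated to the savings task. In this sketch, repeatedly halving a quantity is a made-up process, and all names (rate, steps, out_iter, out_loop, out_vec) are illustrative:

```r
# Sketch: three ways of computing rate^1, rate^2, ..., rate^4.
rate  <- 0.5
steps <- 1:4

# (a) Iterative update: each value is derived from the previous one
out_iter <- rep(NA_real_, length(steps))
current  <- 1
for (i in seq_along(steps)) {
  current     <- current * rate
  out_iter[i] <- current
}

# (b) Loop over a closed formula: each value is computed from scratch
out_loop <- rep(NA_real_, length(steps))
for (i in seq_along(steps)) {
  out_loop[i] <- rate^steps[i]
}

# (c) Vector-based: the ^ operator accepts a vector of exponents
out_vec <- rate^steps

all.equal(out_iter, out_loop)
all.equal(out_loop, out_vec)
```

All three yield the same numbers, but (a) needs to keep track of a changing state, (b) repeats the full computation in each iteration, and (c) needs no explicit loop at all.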

This concludes our basic exercises on loops and applying functions to data structures.