12.4 Exercises


Here are the exercises on loops and applying functions to data structures:

12.4.1 Exercise 1

Fibonacci loop and functions

  1. Look up the term Fibonacci numbers and use a for loop to create a numeric vector of the first 25 Fibonacci numbers (for a series of numbers starting with 0, 1).

  2. Incorporate your for loop into a fibonacci function that returns a numeric vector of the first n Fibonacci numbers. Test your function for fibonacci(n = 25).

  3. Generalize your fibonacci function to also accept the first 2 elements (e1 and e2) as inputs to the series and then create the first n Fibonacci numbers given these initial elements. Test your function for fibonacci(e1 = 1, e2 = 3, n = 25).
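One possible approach to this exercise can be sketched as follows. It is a minimal sketch rather than a reference solution: the default values e1 = 0 and e2 = 1 and the handling of n < 3 are assumptions that the exercise does not prescribe.

```r
# Sketch: return the first n numbers of a Fibonacci-type series
# that starts with the elements e1 and e2 (defaults are an assumption).
fibonacci <- function(e1 = 0, e2 = 1, n) {
  stopifnot(n >= 1)
  out <- numeric(n)            # pre-allocate the result vector
  out[1] <- e1
  if (n >= 2) out[2] <- e2
  if (n >= 3) {
    for (i in 3:n) {           # each element is the sum of its two predecessors
      out[i] <- out[i - 1] + out[i - 2]
    }
  }
  out
}

fibonacci(n = 25)
fibonacci(e1 = 1, e2 = 3, n = 25)
```

Pre-allocating the vector with numeric(n) avoids growing it inside the loop, which is possible here because the length of the result is known in advance.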

12.4.2 Exercise 2

Looping for divisors

  1. Write a for loop that prints out all positive divisors of the number 1000.

Hint: Use N %% x == 0 to test whether x is a divisor of N.

  2. How many iterations did your loop require? Could you achieve the same results with fewer iterations?

  3. Write a divisors function that uses a for loop to return a numeric vector containing all positive divisors of a natural number N.

Hint: Note that we do not know the length of the resulting vector.

  4. Use your divisors function to answer the question: Does the number 1001 have fewer or more divisors than the number 1000?

  5. Use your divisors function and another for loop to answer the question: Which prime numbers exist between the number 111 and the number 1111?

Hint: A prime number (e.g., 13) has only 2 divisors: 1 and the number itself.
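The divisor test and the prime search described above could be sketched like this. It is only one way to approach the task: it deliberately checks every candidate from 1 to N and does not use the shortcut that the second question hints at, and the name primes is an illustrative choice.

```r
# Sketch: collect all positive divisors of a natural number N.
divisors <- function(N) {
  stopifnot(N >= 1)
  out <- c()                   # length of the result is unknown in advance
  for (x in 1:N) {
    if (N %% x == 0) {         # x divides N without remainder
      out <- c(out, x)
    }
  }
  out
}

divisors(1000)
length(divisors(1000))         # number of divisors of 1000
length(divisors(1001))         # number of divisors of 1001

# A prime number has exactly 2 divisors (1 and itself):
primes <- c()                  # illustrative name
for (i in 111:1111) {
  if (length(divisors(i)) == 2) {
    primes <- c(primes, i)
  }
}
primes
```

Growing a vector with c() inside a loop is simple but slow for large N. Since divisors come in pairs (x and N/x), a loop up to sqrt(N) would need far fewer iterations.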

12.4.3 Exercise 3

Throwing dice

  1. Implement a function dice that uses the base R function sample() to simulate a throw of a die (i.e., yielding an integer from 1 to 6 with equal probability).

  2. Add an argument n (for the number of throws) to your function and modify it by using a for loop to throw the die n times and return a vector of length n that shows the results of the n throws.

  3. Use a while loop to throw dice(n = 1) until you throw the number 6 twice in a row and show the sequence of all throws up to this point.

Hint: Given a sequence throws, the n-th element is throws[n]. Hence, the last element of throws is throws[length(throws)].

  4. Use your solution to 3. to conduct a simulation that addresses the following question:
  • How many times on average do we need to throw dice(1) to obtain the number 6 twice in a row?

Hint: Use a for loop to run your solution to 3. for N = 10000 times and store the length of the individual throws in a numeric vector.
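The steps of this exercise can be combined into a small simulation. The following is a hedged sketch, not the only possible solution: the helper name throw_until_66, the vector name n_throws, and the seed are assumptions made for illustration.

```r
# Sketch: throw a fair six-sided die n times (using a for loop).
dice <- function(n = 1) {
  out <- integer(n)
  for (i in seq_len(n)) {
    out[i] <- sample(1:6, size = 1)
  }
  out
}

# Hypothetical helper: throw until the last two throws are both 6.
throw_until_66 <- function() {
  throws <- dice(2)            # at least 2 throws are needed
  while (!(throws[length(throws)] == 6 && throws[length(throws) - 1] == 6)) {
    throws <- c(throws, dice(1))
  }
  throws
}

set.seed(123)                  # arbitrary seed, only for reproducibility
throw_until_66()               # one sequence of throws

# Simulation: repeat N times and store the length of each sequence.
N <- 10000
n_throws <- numeric(N)
for (i in seq_len(N)) {
  n_throws[i] <- length(throw_until_66())
}
mean(n_throws)
```

The simulated mean can be compared with the theoretical expectation for this problem, which is 6 + 36 = 42 throws.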

12.4.4 Exercise 4

Mapping functions to data

Write code that uses a function of the base R apply or purrr map family of functions to:

  1. Compute the mean of every column in mtcars.
  2. Determine the type of each column in ggplot2::diamonds.
  3. Compute the number of unique values in each column of iris.
  4. Generate 10 random normal numbers for each of μ = −100, 0, and 100.

Note: This exercise is based on Exercise 1 of Chapter 21.5.3 in r4ds.
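As an illustration of the general pattern, the following sketch uses only base R functions (sapply and lapply), so that it runs without additional packages. The purrr functions map_dbl, map_chr, map_int, and map could be used in the same way. The result names (col_means, col_types, n_unique, draws) are illustrative, and iris stands in for ggplot2::diamonds in the type check so that ggplot2 is not required.

```r
# Mean of every column in mtcars:
col_means <- sapply(mtcars, mean)
col_means

# Type of each column (iris used here as a stand-in):
col_types <- sapply(iris, typeof)
col_types
# sapply(ggplot2::diamonds, typeof)  # same idea, but needs the ggplot2 package

# Number of unique values in each column of iris:
n_unique <- sapply(iris, function(x) length(unique(x)))
n_unique

# 10 random normal numbers for each mean value:
set.seed(1)                    # arbitrary seed, only for reproducibility
draws <- lapply(c(-100, 0, 100), function(mu) rnorm(n = 10, mean = mu))
draws
```

Note that sapply simplifies its result to a vector when every call returns a single value, whereas lapply always returns a list. The typed purrr variants make the expected output type explicit.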

12.4.5 Exercise 5

Z-transforming tables

In this exercise, we will standardize an entire table of data (using a for loop, an apply, and a map function). We will first write a utility function that achieves the desired transformation for a vector and then compare and contrast different ways of applying this function to a table of data.

In case you are not familiar with the notion of a z score or standard score, look up these terms (e.g., on Wikipedia).

  1. Write a function called z_trans that takes a vector v as input and returns the z-transformed (or standardized) values as output if v is numeric and returns v unchanged if it is non-numeric.
    Hint: Remember that z <- (v - mean(v)) / sd(v), but beware that v could contain NA values.

  2. Load the false positive psychology dataset (see Section B.2 of Appendix B) into falsePosPsy and remove any non-numeric variables from it.

# Load data: 
falsePosPsy <- ds4psy::falsePosPsy_all  # from ds4psy package
# falsePosPsy <- readr::read_csv("http://rpository.com/ds4psy/data/falsePosPsy_all.csv")  # online

#> # A tibble: 78 x 19
#>    study    ID  aged aged365 female   dad   mom potato when64 kalimba cond 
#>    <dbl> <dbl> <dbl>   <dbl>  <dbl> <dbl> <dbl>  <dbl>  <dbl>   <dbl> <chr>
#>  1     1     1  6765    18.5      0    49    45      0      0       1 cont…
#>  2     1     2  7715    21.1      1    63    62      0      1       0 64   
#>  3     1     3  7630    20.9      0    61    59      0      1       0 64   
#>  4     1     4  7543    20.7      0    54    51      0      0       1 cont…
#>  5     1     5  7849    21.5      0    47    43      0      1       0 64   
#>  6     1     6  7581    20.8      1    49    50      0      1       0 64   
#>  7     1     7  7534    20.6      1    56    55      0      0       1 cont…
#>  8     1     8  6678    18.3      1    45    45      0      1       0 64   
#>  9     1     9  6970    19.1      0    53    51      1      0       0 pota…
#> 10     1    10  7681    21.0      0    53    51      0      1       0 64   
#> # … with 68 more rows, and 8 more variables: root <dbl>, bird <dbl>,
#> #   political <dbl>, quarterback <dbl>, olddays <dbl>, feelold <dbl>,
#> #   computer <dbl>, diner <dbl>
  • Use an appropriate map function to create a single vector that indicates, for each column in falsePosPsy, whether or not it is a numeric variable.

Hint: The function is.numeric tests whether a vector is numeric.

  • Use this vector to select only the numeric columns of falsePosPsy into a new tibble fpp_numeric.

  • Use a for loop to apply your z_trans function to fpp_numeric to standardize all of its columns.

  • Turn your resulting data structure into a tibble out_1 and print it.

  3. Repeat the task of 2. (i.e., applying z_trans to all numeric columns of falsePosPsy) by using the base R apply function, rather than a for loop. Save and print your resulting data structure as a tibble out_2.

Hint: Remember to set the MARGIN argument to apply z_trans over all columns, rather than rows.

  4. Repeat the task of 2. and 3. (i.e., applying z_trans to all numeric columns of falsePosPsy) by using an appropriate version of a map function from the purrr package. Save and print your resulting data structure as a tibble out_3.

Hint: Note that the desired output structure is a rectangular data table, which is also a list.

  5. Use all.equal to verify that your results of 2., 3. and 4. (i.e., out_1, out_2, and out_3) are all equal.

Hint: If a tibble t1 lacks variable names, you can add those of another tibble t2 by assigning names(t1) <- names(t2).
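The overall workflow of this exercise can be sketched on a small stand-in table. Several points are assumptions made to keep the sketch self-contained: it uses the first four columns of mtcars instead of falsePosPsy, base R data frames instead of tibbles, and lapply as a base R analogue of a purrr map function. Ignoring NA values via na.rm = TRUE is also only one possible way of dealing with missing values.

```r
# Sketch: z-transform numeric vectors, return other vectors unchanged.
z_trans <- function(v) {
  if (!is.numeric(v)) {
    return(v)
  }
  (v - mean(v, na.rm = TRUE)) / sd(v, na.rm = TRUE)  # NA values are ignored
}

z_trans(c(1, 2, 3, NA))
z_trans(c("a", "b"))

df <- mtcars[, 1:4]            # stand-in data (assumption), all numeric

# (a) for loop over all columns:
out_1 <- df
for (i in seq_along(df)) {
  out_1[[i]] <- z_trans(df[[i]])
}

# (b) apply over columns (MARGIN = 2) returns a matrix:
out_2 <- as.data.frame(apply(df, MARGIN = 2, FUN = z_trans))

# (c) lapply returns a list, which can be turned back into a table:
out_3 <- as.data.frame(lapply(df, z_trans))

# Compare the three results:
all.equal(out_1, out_2)
all.equal(out_1, out_3)
```

The three variants differ mainly in the data structure they return (a modified copy, a matrix, and a list), which is why the results need to be converted before they can be compared.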

12.4.6 Exercise 6

Cumulative savings revisited

In Exercise 2 of Chapter 1: Basic R concepts and commands, we computed the cumulative sum of an initial investment amount a = 1000, given an annual interest rate int of 0.1% and an annual rate of inflation inf of 2%, after a number of n full years (e.g., n = 10):

# Task parameters: 
a <- 1000      # initial amount: $1000
int <- .1/100  # annual interest rate of 0.1%
inf <- 2/100   # annual inflation rate 2%
n   <- 10      # number of years

Our solution in Chapter 1 consisted of an arithmetic formula that computes a new total based on the current task parameters:

# Previous solution (see Exercise 2 of Chapter 1): 
total <- a * (1 + int - inf)^n
total  # print the result
#> [1] 825.4487

Given our new skills in writing loops and functions (from Chapter 11), we can solve this task in a variety of ways. This exercise illustrates some differences between loops, a function that implements the formula, and a vector-based solution. Although all these approaches solve the same problem, they differ in important ways.

  1. Write a for loop that iteratively computes the current value of your investment after each of 1:n years (with n ≥ 1).

Hint: Express the new value of your investment a as a function of its current value a and its change based on inf and int in each year.

  2. Write a function compute_value() that takes a, int, inf, and n as its arguments, and directly computes and returns the cumulative total after n years.

Hint: Translate the arithmetic solution (shown above) into a function that directly computes the new total. Use sensible default values for your function.

  3. Write a for loop that iteratively calls your function compute_value() for every year n.

  4. Check whether your compute_value() function also works for a vector of year values n. Then discuss the differences between the solutions to Exercise 6.1, 6.3, and 6.4.
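The three approaches contrasted in this exercise could be sketched as follows. This is a minimal sketch: using the task parameters from above as default values is an assumption, and the names value, by_loop, and by_vector are illustrative.

```r
# Task parameters (as above):
a   <- 1000     # initial amount
int <- .1/100   # annual interest rate of 0.1%
inf <- 2/100    # annual inflation rate of 2%
n   <- 10       # number of years

# (1) Loop that updates the current value in each year:
value <- a
for (i in 1:n) {
  value <- value * (1 + int - inf)
  print(paste0("Year ", i, ": ", round(value, 2)))
}

# (2) Function that implements the formula (defaults are an assumption):
compute_value <- function(a = 1000, int = .1/100, inf = 2/100, n = 10) {
  a * (1 + int - inf)^n
}

# (3) Loop that calls the function for every year:
by_loop <- numeric(n)
for (i in 1:n) {
  by_loop[i] <- compute_value(a, int, inf, n = i)
}

# (4) Vector-based solution: n can be a vector of years.
by_vector <- compute_value(a, int, inf, n = 1:n)

all.equal(by_loop, by_vector)
```

The loop in (1) depends on the value from the previous step, whereas (3) and (4) compute each year independently from the formula. The call in (4) works without a loop because the arithmetic operators in R are vectorized.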

This concludes our exercises on loops and applying functions to data structures.