## A.12 Solutions (12)

Here are the solutions to the exercises on loops and applying functions to data structures of Chapter 12 (Section 12.4).

### A.12.1 Exercise 1

#### Fibonacci loop and functions

1. Look up the term Fibonacci numbers and use a for loop to create a numeric vector of the first 25 Fibonacci numbers (for a series of numbers starting with 0, 1).

2. Incorporate your for loop into a fibonacci function that returns a numeric vector of the first n Fibonacci numbers. Test your function for fibonacci(n = 25).

3. Generalize your fibonacci function to also accept the first 2 elements (e1 and e2) as inputs to the series and then create the first n Fibonacci numbers given these initial elements. Test your function for fibonacci(e1 = 1, e2 = 3, n = 25).

#### Solution

1. According to Wikipedia, a Fibonacci sequence is the integer sequence of 0, 1, 1, 2, 3, 5, 8, ....

Thus, the first 2 elements $$e_{1}$$ and $$e_{2}$$ of the series need to be provided. For $$i > 2$$, each element $$e_{i}$$ is the sum of the 2 preceding elements:

$$e_{i} = e_{i-2} + e_{i-1}$$.

We turn this into a for loop as follows:

# 1: Loop to print 25 Fibonacci numbers
N <- 25             # length of series
fib <- rep(NA, 25)  # prepare output

for (i in 1:N){

# Distinguish between 3 cases:
if (i==1) { fib[i] <- 0 }  # initialize 1st element
if (i==2) { fib[i] <- 1 }  # initialise 2nd element

if (i > 2) {
fib[i] <- fib[i-2] + fib[i-1]
}

}

# Result:
fib
#>  [1]     0     1     1     2     3     5     8    13    21    34    55    89
#> [13]   144   233   377   610   987  1597  2584  4181  6765 10946 17711 28657
#> [25] 46368
1. Incorporating the for loop into a function fibonacci(n):
fibonacci <- function(n){

if (is.na(n) || (n < 1) || (n != round(n)))  {
stop("n must be a positive integer.")
}

fib <- rep(NA, n)  # initialize output vector

for (i in 1:n){

# Distinguish between 3 cases:
if (i==1) { fib[i] <- 0 }  # initialize 1st element
if (i==2) { fib[i] <- 1 }  # initialise 2nd element

if (i > 2) {
fib[i] <- fib[i-2] + fib[i-1]
}

}

return(fib)

}

Checking the function:

# Check:
fibonacci(1)
#> [1] 0
fibonacci(2)
#> [1] 0 1
fibonacci(3)
#> [1] 0 1 1
fibonacci(4)
#> [1] 0 1 1 2

# First 25 Fibonacci numbers:
fibonacci(25)
#>  [1]     0     1     1     2     3     5     8    13    21    34    55    89
#> [13]   144   233   377   610   987  1597  2584  4181  6765 10946 17711 28657
#> [25] 46368

## Errors:
# fibonacci(0)
# fibonacci(-1)
# fibonacci(3/2)
# fibonacci(NA)

Realizing that we only need the for loop when n > 2, we could re-write the same function as follows:

fibonacci <- function(n){

if (is.na(n) || (n < 1) || (n != round(n)))  {
stop("n must be a positive integer.")
}

# initialize the 1st and 2nd elements:
fib <- c(0, 1)

if (n <= 2){

fib <- fib[1:n]

} else {

# initialize output vector:
fib <- c(fib, rep(NA, (n-2)))

# loop:
for (i in 3:n){

fib[i] <- fib[i-2] + fib[i-1]

} # end for loop.
} # end if (n > 2).

return(fib)

}

Checking the function:

# Check:
fibonacci(1)
#> [1] 0
fibonacci(2)
#> [1] 0 1
fibonacci(3)
#> [1] 0 1 1
fibonacci(4)
#> [1] 0 1 1 2

# First 25 Fibonacci numbers:
fibonacci(25)
#>  [1]     0     1     1     2     3     5     8    13    21    34    55    89
#> [13]   144   233   377   610   987  1597  2584  4181  6765 10946 17711 28657
#> [25] 46368

## Errors:
# fibonacci(0)
# fibonacci(-1)
# fibonacci(3/2)
# fibonacci(NA)
1. Generalizing fibonacci(n) to a function fibonacci(e1, e2, n) is simple: The following version makes the arguments e1 and e2 optional to return the standard sequence by default.
fibonacci <- function(e1 = 0, e2 = 1, n){

if (is.na(n) || (n < 1) || (n != round(n)))  {
stop("n must be a positive integer.")
}

fib <- rep(NA, n)  # initialize output vector

for (i in 1:n){

# Distinguish between 3 cases:
if (i==1) { fib[i] <- e1 }  # initialize 1st element
if (i==2) { fib[i] <- e2 }  # initialise 2nd element

if (i > 2) {
fib[i] <- fib[i-2] + fib[i-1]
}

}

return(fib)

}

This generalized fibonacci function still allows all previous calls, like:

fibonacci(n = 25)
#>  [1]     0     1     1     2     3     5     8    13    21    34    55    89
#> [13]   144   233   377   610   987  1597  2584  4181  6765 10946 17711 28657
#> [25] 46368

but now also allows specifying different initial elements:

fibonacci(e1 = 1, e2 = 3, n = 25)
#>  [1]      1      3      4      7     11     18     29     47     76    123
#> [11]    199    322    521    843   1364   2207   3571   5778   9349  15127
#> [21]  24476  39603  64079 103682 167761

### A.12.2 Exercise 2

#### A.12.2.1 Looping for divisors

1. Write a for loop that prints out all positive divisors of the number 1000.

Hint: Use N %% x == 0 to test whether x is a divisor of N.

#### Solution

N <- 1000

for (i in 1:N){

if (N %% i == 0)
print(i)

}
#> [1] 1
#> [1] 2
#> [1] 4
#> [1] 5
#> [1] 8
#> [1] 10
#> [1] 20
#> [1] 25
#> [1] 40
#> [1] 50
#> [1] 100
#> [1] 125
#> [1] 200
#> [1] 250
#> [1] 500
#> [1] 1000
1. How many iterations did your loop require? Could you achieve the same results with fewer iterations?

#### Solution

Our first for loop required N = 1000 iterations. However, realizing that the largest divisor cannot exceed N/2 (for even numbers), and N\3 (for odd numbers), we could achieve the same results with far fewer iterations:

# Using a sequence of N/2 + 1 loops:
N <- 1000

if (N %% 2 == 0){
seq <- c(1:(N/2), N)  # seq of length N/2 + 1
} else if (N %% 2 == 1){
seq <- c(1:floor(N/3), N)  # seq of length N/3 + 1
} else {
stop("N is neither even nor odd.")
}

# seq
length(seq)
#> [1] 501

for (i in seq){

if (N %% i == 0)
print(i)

}
#> [1] 1
#> [1] 2
#> [1] 4
#> [1] 5
#> [1] 8
#> [1] 10
#> [1] 20
#> [1] 25
#> [1] 40
#> [1] 50
#> [1] 100
#> [1] 125
#> [1] 200
#> [1] 250
#> [1] 500
#> [1] 1000
1. Write a divisors function that uses a for loop to return a numeric vector containing all positive divisors of a natural number N.

Hint: Note that we do not know the length of the resulting vector.

#### Solution

divisors <- function(N){

if ( is.na(N) || (N < 1) || (N %% 1 != 0)) { stop("N should be a natural number.") }

# initialize:
out <- c()  # prepare output vector

if (N %% 2 == 0){
seq <- c(1:(N/2), N)  # seq of length N/2 + 1
} else if (N %% 2 == 1){
seq <- c(1:floor(N/3), N)  # seq of length N/3 + 1
}

if (N == 1) {

return(1) # done

} else {

# loop:
for (i in seq){

if (N %% i == 0) {  # i is a divisor of N:
out <- c(out, i)  # add i to out
}

} # end loop.

return(out)

}  # end if.

}

# Check:
divisors(1)
#> [1] 1
divisors(8)
#> [1] 1 2 4 8
divisors(9)
#> [1] 1 3 9
divisors(12)
#> [1]  1  2  3  4  6 12
divisors(13)
#> [1]  1 13

## Errors for:
# divisors(NA)
# divisors(-10)
# divisors(1/2)
1. Use your divisors function to answer the question: Does the number 1001 have fewer or more divisors than the number 1000?
d_1000 <- divisors(1000)
d_1000
#>  [1]    1    2    4    5    8   10   20   25   40   50  100  125  200  250  500
#> [16] 1000

d_1001 <- divisors(1001)
d_1001
#> [1]    1    7   11   13   77   91  143 1001

# 1000 has more divisors than 1001:
length(d_1000) > length(d_1001)
#> [1] TRUE
1. Use your divisors function and another for loop to answer the question: Which prime numbers exist between the number 111 and the number 1111?

Hint: A prime number (e.g., 13) has only 2 divisors: 1 and the number itself.

# Parameters:
range_min <- 111
range_max <- 1111
primes_found <- c()  # initialize

for (i in range_min:range_max) {

if (length(divisors(i)) == 2){

primes_found <- c(primes_found, i)

}

}

# Solution:
primes_found
#>  [1] 113 127 131 137 139 149 151 157 163 167 173 179 181 191 193 197 199 211 223
#> [20] 227 229 233 239 241 251 257 263 269 271 277 281 283 293 307 311 313 317 331
#> [39] 337 347 349 353 359 367 373 379 383 389 397 401 409 419 421 431 433 439 443
#> [58] 449 457 461 463 467 479 487 491 499 503 509 521 523 541 547 557 563 569
#>  [ reached getOption("max.print") -- omitted 82 entries ]
length(primes_found)
#> [1] 157

Note some details:

• The loop above uses the divisors function within a for loop (in the range range_min:range_max). As the divisors function also uses a for loop to find all divisors of N, we are using a loop inside a loop. As such structures can quickly become very inefficient, it is a good idea to try reducing the number of iterations when possible.

• The condition length(divisors(i)) == 2 would fail to detect a prime number i=1. A more general solution would first define an is_prime function and then use is_prime(i) in the if statement of the for loop:

is_prime <- function(N){

if ( is.na(N) || (N < 1) || (N %% 1 != 0)) { stop("N should be a natural number.") }

out <- NA  # initialize

if (N == 1) {

out <- TRUE  # 1 is a prime number

} else {

if (length(divisors(N)) == 2) {
out <- TRUE
} else {
out <- FALSE
}

}

return(out)

}

# Check:
is_prime(1)
#> [1] TRUE
is_prime(101)
#> [1] TRUE

is_prime(111)
#> [1] FALSE
divisors(111)
#> [1]   1   3  37 111

# Errors for:
# is_prime(NA)
# is_prime(0)
# is_prime(3/2)

The is_prime function can be written in many different ways, of course. Check out this stackoverflow thread for solutions.

### A.12.3 Exercise 3

• In Chapter 1, we first explored the ds4psy functions coin() and dice() (see Section 1.6.4 and Exercise 3 in Section 1.8.3).

• In Exercise 4 of Chapter 11 (see Section 11.5.4), we wrote my_coin() and my_dice() functions by calling either these ds4psy functions or the base R sample() function.

• In this exercise, we will use for and while loops to repeatedly call an existing function.

#### Throwing dice in loops

1. Implement a function my_dice that uses the base R function sample() to simulate a throw of a dice (i.e., yielding an integer from 1 to 6 with equal probability).

#### Solution

my_dice <- function() {
sample(x = 1:6, size = 1)
}

# Check:
my_dice()
#> [1] 3
# Using for loop to throw dice 10 times:
for (i in 1:10){
print(my_dice())
}
#> [1] 4
#> [1] 3
#> [1] 4
#> [1] 2
#> [1] 1
#> [1] 4
#> [1] 2
#> [1] 6
#> [1] 6
#> [1] 1
1. Add an argument N (for the number of throws) to your function and modify it by using a for loop to throw the dice N times, and returning a vector of length N that shows the results of the N throws.

Hint: This task corresponds to Exercise 4 of Chapter 11 (see Section 11.5.4).

#### Solution

my_dice <- function(N = 1) {

out <- rep(NA, N)

for (i in 1:N){

out[i] <- sample(x = 1:6, size = 1)

} # end for loop.

return(out)

}

# Check:
my_dice()
#> [1] 2
my_dice(10)  # throw dice 10 times
#>  [1] 4 1 6 4 1 5 1 1 3 2

Note: As the sample function contains a size argument, a simpler version of the same function could have been:

my_dice <- function(N = 1) {
sample(x = 1:6, size = N, replace = TRUE)
}

# Check:
my_dice()
#> [1] 2
my_dice(10)  # throw dice 10 times
#>  [1] 2 6 2 2 5 5 4 6 1 2
1. Use a while loop to throw my_dice(N = 1) until you throw the number 6 twice in a row and show the sequence of all throws up to this point.

Hint: Given a sequence throws, the i-th element is throws[i]. Hence, the last element of throws is throws[length(throws)].

#### Solution

throws <- c(my_dice(1), my_dice(1))  # first 2 throws
throws
#> [1] 2 4

while (!( (throws[length(throws) - 1] == 6) &&
(throws[length(throws)] == 6) )) {

throws <- c(throws, my_dice(1))  # throw dice(1) and add it to throws

}

throws
#>  [1] 2 4 3 4 3 6 5 6 2 3 2 5 2 6 1 5 3 4 4 5 1 4 6 2 2 4 4 5 1 3 2 2 1 5 3 5 6 1
#> [39] 5 5 5 4 1 2 5 6 6
1. Use your solution of 3. to conduct a simulation that addresses the following question:
• How many times on average do we need to throw my_dice(1) to obtain the number 6 twice in a row?

Hint: Use a for loop to run your solution to 3. for T = 10000 times and store the length of the individual throws in a numeric vector.

#### Solution

T <- 10000

out <- rep(NA, T)  # initialize output vector

for (n in 1:T){

throws <- c(my_dice(1), my_dice(1))  # first 2 throws

while (!( (throws[length(throws) - 1] == 6) &&
(throws[length(throws)] == 6) )) {

throws <- c(throws, my_dice(1))  # throw dice(1) and add it to throws
}

out[n] <- length(throws)

}

# Results:
summary(out)
#>    Min. 1st Qu.  Median    Mean 3rd Qu.    Max.
#>    2.00   13.00   30.00   41.88   57.00  382.00

A histogram shows the distribution of the number of throws needed:

library(tidyverse)
library(unikn)

# Turn out into a tibble:
tb <- tibble(nr = out)
# tb

# Define labels:
cp_lbl <- paste0("(Simulation of T = ", T, " trials)")
mn_lbl <- paste0("mean = ", round(mean(out), 2))
md_lbl <- paste0("median = ", round(median(out), 2))

# Histogram of individual number of throws:
ggplot(tb) +
geom_histogram(aes(x = nr), binwidth = 5, fill = Seeblau, color = "white") +
coord_cartesian(xlim = c(0, 200)) +  # do not show values beyond x = 200
# Show mean and median:
geom_vline(xintercept = mean(out), linetype = 1, color = Bordeaux) +  # mean line
annotate("text", label = mn_lbl, x = mean(out) + 20, y = 950, color = Bordeaux) +
geom_vline(xintercept = median(out), linetype = 2, color = Karpfenblau) +  # median line
annotate("text", label = md_lbl, x = median(out) + 18, y = 750, color = Karpfenblau) +
# Text and formatting:
labs(title = "Number of throws needed to get 6 twice in a row",
x = "N", y = "Frequency", caption = cp_lbl) +
ds4psy::theme_ds4psy(col_title = "black")

#### Disclaimer

This exercise shows how loops can be used to generate and collect multiple outputs. This can sometimes replace vector arguments to functions. However, as R is optimized for vectors, using loops rather than vectors is not generally recommended.

### A.12.4 Exercise 4

#### Mapping functions to data

Write code that uses a function of the base R apply or purrr map family of functions to:

1. Compute the mean of every column in mtcars.
2. Determine the type of each column in ggplot2::diamonds.
3. Compute the number of unique values in each column of iris.
4. Generate 10 random normal numbers for each of μ = −100, 0, and 100.

Note: This exercise is based on Exercise 1 of Chapter 21.5.3 in r4ds.

#### Solution

# 1. Compute the mean of every column in mtcars:

# (a) Solve for 1st column:
mean(mtcars$mpg) #> [1] 20.09062 # (b) Generalize to all columns: as_tibble(mtcars) %>% map_dbl(mean) #> mpg cyl disp hp drat wt qsec #> 20.090625 6.187500 230.721875 146.687500 3.596563 3.217250 17.848750 #> vs am gear carb #> 0.437500 0.406250 3.687500 2.812500 apply(X = mtcars, MARGIN = 2, FUN = mean) #> mpg cyl disp hp drat wt qsec #> 20.090625 6.187500 230.721875 146.687500 3.596563 3.217250 17.848750 #> vs am gear carb #> 0.437500 0.406250 3.687500 2.812500 # 2. Determine the type of each column in ggplot2::diamonds: # (a) Solve for 1st column: typeof(ggplot2::diamonds$carat)  # solution for 1st column
#> [1] "double"

# (b) Generalize to all columns:
ggplot2::diamonds %>% map_chr(typeof)
#>     carat       cut     color   clarity     depth     table     price         x
#>  "double" "integer" "integer" "integer"  "double"  "double" "integer"  "double"
#>         y         z
#>  "double"  "double"
apply(X = ggplot2::diamonds, MARGIN = 2, FUN = typeof)
#>       carat         cut       color     clarity       depth       table
#> "character" "character" "character" "character" "character" "character"
#>       price           x           y           z
#> "character" "character" "character" "character"
# Note: All variables viewed as characters!

# 3. Compute the number of unique values in each column of iris:

# (a) Solve for 1st column:
n_distinct(iris$Sepal.Length) # solution for 1st column #> [1] 35 # (b) Generalize to all columns: as_tibble(iris) %>% map_int(n_distinct) #> Sepal.Length Sepal.Width Petal.Length Petal.Width Species #> 35 23 43 22 3 apply(X = iris, MARGIN = 2, FUN = n_distinct) #> Sepal.Length Sepal.Width Petal.Length Petal.Width Species #> 35 23 43 22 3 # 4. Generate 10 random normal numbers for each of μ = −100, 0, and 100: # (a) Solve for 1st mean: mu <- c(-100, 0, 100) rnorm(n = 10, mean = mu[1]) #> [1] -99.50863 -99.71473 -100.44417 -98.88835 -99.96750 -100.61690 #> [7] -100.08146 -99.68594 -98.64650 -100.27637 # (b) Generalize to all means: mu %>% map(rnorm, n = 10) %>% str() #> List of 3 #>$ : num [1:10] -99.3 -100 -99.6 -100.6 -101.6 ...
#>  $: num [1:10] -1.503 -0.704 -1.304 -0.779 0.449 ... #>$ : num [1:10] 99.7 99.7 101 100.5 100.6 ...
lapply(X = mu, FUN = rnorm, n = 10) %>% str()
#> List of 3
#>  $: num [1:10] -99 -100.8 -100.9 -99.5 -100 ... #>$ : num [1:10] 0.767 -0.449 -0.938 1.927 -0.862 ...
#> [1] TRUE
is.numeric(falsePosPsy$cond) #> [1] FALSE # (b) Generalize to entire table: numeric_cols <- falsePosPsy %>% map_lgl(is.numeric) numeric_cols #> study ID aged aged365 female dad #> TRUE TRUE TRUE TRUE TRUE TRUE #> mom potato when64 kalimba cond root #> TRUE TRUE TRUE TRUE FALSE TRUE #> bird political quarterback olddays feelold computer #> TRUE TRUE TRUE TRUE TRUE TRUE #> diner #> TRUE • Use this vector to select only the numeric columns of falsePosPsy into a new tibble fpp_numeric: #### Solution # Indexing columns of falsePosPsy by numeric_cols: fpp_numeric <- falsePosPsy[ , numeric_cols] # fpp_numeric # Alternative solution: fpp_numeric_1 <- falsePosPsy[, map_lgl(falsePosPsy, is.numeric)] all.equal(fpp_numeric, fpp_numeric_1) #> [1] TRUE # Using dplyr::select: fpp_numeric_2 <- falsePosPsy %>% dplyr::select(-cond) # notice that cond is non-numeric all.equal(fpp_numeric, fpp_numeric_2) #> [1] TRUE • Use a for loop to apply your z_trans function to fpp_numeric to standardize all of its columns: #### Solution out <- c() # prepare output vector # for loop: for (i in seq_along(fpp_numeric)){ out <- cbind(out, z_trans(fpp_numeric[[i]])) } # Result: # out • Turn your resulting data structure into a tibble out_1 and print it. #### Solution # Print result (as tibble): out_1 <- as_tibble(out) out_1 #> # A tibble: 78 x 18 #> V1 V2 V3 V4 V5 V6 V7 V8 V9 V10 #> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> #> 1 -0.873 -1.70 -0.902 -0.902 -0.873 -0.756 -1.13 -0.807 -0.682 1.59 #> 2 -0.873 -1.65 0.128 0.128 1.13 2.02 2.11 -0.807 1.45 -0.623 #> 3 -0.873 -1.61 0.0362 0.0362 -0.873 1.63 1.54 -0.807 1.45 -0.623 #> 4 -0.873 -1.57 -0.0582 -0.0582 -0.873 0.237 0.0171 -0.807 -0.682 1.59 #> 5 -0.873 -1.52 0.274 0.274 -0.873 -1.15 -1.51 -0.807 1.45 -0.623 #> 6 -0.873 -1.48 -0.0170 -0.0170 1.13 -0.756 -0.173 -0.807 1.45 -0.623 #> 7 -0.873 -1.43 -0.0680 -0.0680 1.13 0.634 0.779 -0.807 -0.682 1.59 #> 8 -0.873 -1.39 -0.996 -0.996 1.13 -1.55 -1.13 -0.807 1.45 -0.623 #> 9 -0.873 -1.35 -0.680 -0.680 -0.873 0.0382 0.0171 1.22 -0.682 -0.623 #> 10 -0.873 -1.30 0.0915 0.0915 -0.873 0.0382 0.0171 -0.807 1.45 -0.623 #> # … with 68 more rows, and 8 more variables: V11 <dbl>, V12 <dbl>, V13 <dbl>, #> # V14 <dbl>, V15 <dbl>, V16 <dbl>, V17 <dbl>, V18 <dbl> Note that we use cbind rather than c within the for loop to add the results of z_trans to out. This is because z_trans returns a vector for every column of only_numeric. Alternatively, we also could have constructed a very long vector (with a length of nrow(fpp_numeric) x ncol(fpp_numeric) = 78 x 18 = 1404) and turned it into a rectangular table later. 1. Repeat the task of 2. (i.e., applying z_trans to all numeric columns of falsePosPsy) by using the base R apply function, rather than a for loop. Save and print your resulting data structure as a tibble out_2. Hint: Remember to set the MARGIN argument to apply z_trans over all columns, rather than rows. #### Solution # Data: # fpp_numeric out_2 <- apply(X = fpp_numeric, MARGIN = 2, FUN = z_trans) # Print result (as tibble): out_2 <- as_tibble(out_2) out_2 #> # A tibble: 78 x 18 #> study ID aged aged365 female dad mom potato when64 kalimba #> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> #> 1 -0.873 -1.70 -0.902 -0.902 -0.873 -0.756 -1.13 -0.807 -0.682 1.59 #> 2 -0.873 -1.65 0.128 0.128 1.13 2.02 2.11 -0.807 1.45 -0.623 #> 3 -0.873 -1.61 0.0362 0.0362 -0.873 1.63 1.54 -0.807 1.45 -0.623 #> 4 -0.873 -1.57 -0.0582 -0.0582 -0.873 0.237 0.0171 -0.807 -0.682 1.59 #> 5 -0.873 -1.52 0.274 0.274 -0.873 -1.15 -1.51 -0.807 1.45 -0.623 #> 6 -0.873 -1.48 -0.0170 -0.0170 1.13 -0.756 -0.173 -0.807 1.45 -0.623 #> 7 -0.873 -1.43 -0.0680 -0.0680 1.13 0.634 0.779 -0.807 -0.682 1.59 #> 8 -0.873 -1.39 -0.996 -0.996 1.13 -1.55 -1.13 -0.807 1.45 -0.623 #> 9 -0.873 -1.35 -0.680 -0.680 -0.873 0.0382 0.0171 1.22 -0.682 -0.623 #> 10 -0.873 -1.30 0.0915 0.0915 -0.873 0.0382 0.0171 -0.807 1.45 -0.623 #> # … with 68 more rows, and 8 more variables: root <dbl>, bird <dbl>, #> # political <dbl>, quarterback <dbl>, olddays <dbl>, feelold <dbl>, #> # computer <dbl>, diner <dbl> 1. Repeat the task of 2. and 3. (i.e., applying z_trans to all numeric columns of falsePosPsy) by using an appropriate version of a map function from the purrr package. Save and print your resulting data structure as a tibble out_3. Hint: Note that the desired output structure is a rectangular data table, which is also a list. #### Solution # Data: # fpp_numeric # Using map to return a list: out_3 <- purrr::map(.x = fpp_numeric, .f = z_trans) # Print result (as tibble): out_3 <- as_tibble(out_3) out_3 #> # A tibble: 78 x 18 #> study ID aged aged365 female dad mom potato when64 kalimba #> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> #> 1 -0.873 -1.70 -0.902 -0.902 -0.873 -0.756 -1.13 -0.807 -0.682 1.59 #> 2 -0.873 -1.65 0.128 0.128 1.13 2.02 2.11 -0.807 1.45 -0.623 #> 3 -0.873 -1.61 0.0362 0.0362 -0.873 1.63 1.54 -0.807 1.45 -0.623 #> 4 -0.873 -1.57 -0.0582 -0.0582 -0.873 0.237 0.0171 -0.807 -0.682 1.59 #> 5 -0.873 -1.52 0.274 0.274 -0.873 -1.15 -1.51 -0.807 1.45 -0.623 #> 6 -0.873 -1.48 -0.0170 -0.0170 1.13 -0.756 -0.173 -0.807 1.45 -0.623 #> 7 -0.873 -1.43 -0.0680 -0.0680 1.13 0.634 0.779 -0.807 -0.682 1.59 #> 8 -0.873 -1.39 -0.996 -0.996 1.13 -1.55 -1.13 -0.807 1.45 -0.623 #> 9 -0.873 -1.35 -0.680 -0.680 -0.873 0.0382 0.0171 1.22 -0.682 -0.623 #> 10 -0.873 -1.30 0.0915 0.0915 -0.873 0.0382 0.0171 -0.807 1.45 -0.623 #> # … with 68 more rows, and 8 more variables: root <dbl>, bird <dbl>, #> # political <dbl>, quarterback <dbl>, olddays <dbl>, feelold <dbl>, #> # computer <dbl>, diner <dbl> 1. Use all.equal to verify that your results of 2., 3. and 4. (i.e., out_1, out_2, and out_3) are all equal. Hint: If a tibble t1 lacks variable names, you can add those of another tibble t2 by assigning names(t1) <- names(t2). #### Solution # Note that out_1 did not retain the variable names of fpp_numeric: names(out_1) <- names(fpp_numeric) # re-adding names to out_1 # Verify equality: all.equal(out_1, out_2) #> [1] TRUE all.equal(out_2, out_3) #> [1] TRUE ### A.12.6 Exercise 6 #### Cumulative savings revisited In Exercise 2 of Chapter 1: Basic R concepts and commands, we computed the cumulative sum of an initial investment amount a = 1000, given an annual interest rate of int of .1%, and an annual rate of inflation inf of 2%, after a number of n full years (e.g., n = 10): # Task parameters: a <- 1000 # initial amount:$1000
int <- .1/100  # annual interest rate of 0.1%
inf <- 2/100   # annual inflation rate 2%
n   <- 10      # number of years

Our solution in Chapter 1 consisted in an arithmetic formula which computes a new total based on the current task parameters:

# Previous solution (see Exercise 2 of Chapter 1):
total <- a * (1 + int - inf)^n
total
#> [1] 825.4487

Given our new skills about writing loops and functions (from Chapter 11), we can solve this task in a variety of ways. This exercise illustrates some differences between loops, a function that implements the formula, and a vector-based solution. Although all these approaches solve the same problem, they differ in important ways.

1. Write a for loop that iteratively computes the current value of your investment after each of 1:n years (with $$n \geq 1$$).

Hint: Express the new value of your investment a as a function of its current value a and its change based on inf and int in each year.

#### Solution

out <- rep(NA, n)  # prepare output vector

for (i in 1:n){

# incrementally compute the new value for a (based on previous a):
a <- a * (1 + int - inf)

print(paste0("i = ", i, ": a = ", a))   # user feedback

out[i] <- a  # store current result

}
#> [1] "i = 1: a = 981"
#> [1] "i = 2: a = 962.361"
#> [1] "i = 3: a = 944.076141"
#> [1] "i = 4: a = 926.138694321"
#> [1] "i = 5: a = 908.5420591289"
#> [1] "i = 6: a = 891.279760005451"
#> [1] "i = 7: a = 874.345444565348"
#> [1] "i = 8: a = 857.732881118606"
#> [1] "i = 9: a = 841.435956377352"
#> [1] "i = 10: a = 825.448673206182"

out  # print result
#>  [1] 981.0000 962.3610 944.0761 926.1387 908.5421 891.2798 874.3454 857.7329
#>  [9] 841.4360 825.4487
1. Write a function compute_value() that takes a, int, inf, and n as its arguments, and directly computes and returns the cumulative total after n years.

Hint: Translate the solution (shown above) into a function that directly computes the new total. Use sensible default values for your function.

#### Solution

# Define function (with sensible defaults):
compute_value <- function(a = 0, int = 0, inf = 0, n = 0){

a * (1 + int - inf)^n

}

# Check:
compute_value()  # sensible default?
#> [1] 0
compute_value(a = 1000, int = 99, inf = 99, n = 123)
#> [1] 1000
compute_value(NA)
#> [1] NA

# Compute solution with task parameters:
compute_value(a = 1000, int = .1/100, inf = 2/100, n = 10)
#> [1] 825.4487
1. Write a for loop that iteratively calls your function compute_value() for every year n.

#### Solution

# Task parameters:
a <- 1000      # initial amount: $1000 int <- .1/100 # annual interest rate of 0.1% inf <- 2/100 # annual inflation rate 2% n <- 10 # number of years out <- rep(NA, n) # prepare output vector for (i in 1:n){ x <- compute_value(a = a, int = int, inf = inf, n = i) # directly compute current value print(paste0("i = ", i, ": x = ", x)) # user feedback out[i] <- x # store current result } #> [1] "i = 1: x = 981" #> [1] "i = 2: x = 962.361" #> [1] "i = 3: x = 944.076141" #> [1] "i = 4: x = 926.138694320999" #> [1] "i = 5: x = 908.5420591289" #> [1] "i = 6: x = 891.279760005451" #> [1] "i = 7: x = 874.345444565347" #> [1] "i = 8: x = 857.732881118606" #> [1] "i = 9: x = 841.435956377352" #> [1] "i = 10: x = 825.448673206182" out # print result #> [1] 981.0000 962.3610 944.0761 926.1387 908.5421 891.2798 874.3454 857.7329 #> [9] 841.4360 825.4487 1. Check whether your compute_value() function also works for a vector of year values n. Then discuss the differences between the solutions to Exercise 6.1, 6.3, and 6.4. #### Solution # Task parameters: a <- 1000 # initial amount:$1000
int <- .1/100  # annual interest rate of 0.1%
inf <- 2/100   # annual inflation rate 2%
n   <- 1:10    # RANGE of years

compute_value(a = a, int = int, inf = inf, n = n)  # directly compute a RANGE of values
#>  [1] 981.0000 962.3610 944.0761 926.1387 908.5421 891.2798 874.3454 857.7329
#>  [9] 841.4360 825.4487

Note the difference between both for loops in this exercise and the vector-based solution:

• In 6.1, the current value of a was used to iteratively compute each new value of a.

• In 6.3, we use the function from 6.2 to directly compute a specific value x for given parameter values (e.g., of a and n). The loop used in 6.1 incrementally computes the new value of a for every increment of i. Thus, the corresponding loop must begin at i = 1 and increment its index in steps of consecutive integer values (2, 3, …). By contrast, the solution of 6.3 is more general and would also work for different loop ranges (e.g., i in c(5, 10, 15)).

• The solution in 6.4 is similar to the loop used in 6.3, but replaces the increments of n by a vector of values for n.

This concludes our exercises on loops and applying functions to data structures.

[50_solutions.Rmd updated on 2020-10-22 16:50:46 by hn.]