11.4 Exercises

ds4psy: Exercises on Functions (11)

The following exercises practice how functions can be defined and evaluated and how the flow of information can be controlled.

11.4.1 Exercise 1

Imagine someone proudly presents the following 3 functions to you. Each of them takes a vector v as an input and tries to perform a simple task. For each function:

  • describe the task that the function is designed to perform,
  • test whether it successfully performs this task,
  • name any problem that you detect with the current function,
  • fix the function so that it successfully performs its task.
# (1)  ------ 
first_element <- function(v) {
  
  output <- NA   # initialize
  output <- v[1] # set to 1st
  
}

# (2) ------ 
avg_mn_med <- function(v, na_rm = TRUE) {
  
  mn  <- mean(v)
  med <- median(v)
  avg <- (mn + med)/2
  
  return(avg)
}

# (3)  ------ 
mean_sd <- function(v, na_rm = TRUE) {
  
  mn  <- mean(v)
  sd  <- sd(v)
  
  both <- c(mn, sd)
  names(both) <- c("mean", "sd")
  
  return(mn)
  
}
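
One way to approach the testing step is to collect a few inputs that cover typical and edge cases (e.g., a vector containing a missing value) and to compare each result with what you expect. The following sketch does this for a hypothetical helper function range_width, which is not part of the exercise and is only assumed here for illustration. The same pattern can be applied to the 3 functions above.

```r
# Hypothetical example function (not part of the exercise):
range_width <- function(v, na_rm = TRUE) {
  max(v, na.rm = na_rm) - min(v, na.rm = na_rm)
}

# Test inputs that cover a typical case and an edge case:
v_1 <- c(2, 4, 9)
v_2 <- c(2, NA, 9)

range_width(v_1)                 # typical case
range_width(v_2)                 # missing value, removed by default
range_width(v_2, na_rm = FALSE)  # missing value, kept
```

When testing, it helps to check both that a result is returned at all and that optional arguments actually change the function's behavior.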

11.4.2 Exercise 2

Conditional feeding

Let’s write a first function and then add some conditions to it.

  1. Write a function feed_me that takes a character string food as a required argument, and returns the sentence "I love to eat ___!". Test your function by running feed_me("apples"), etc.

Here’s a template with some blanks, to get you started:

feed_me <- function(___) {
  
  output <- paste0("I love to eat ", ___, "!")
  
  print(___)
}
  2. Modify feed_me so that it returns "Nothing to eat." when food = NA.

  3. Extend your function to a feed_vegan function that uses 2 additional arguments:

    • type should be an optional character string, set to a default argument of "food". If type is not "food", the function should return "___ is not edible.".

    • vegan should be an optional Boolean value, which is set to FALSE by default. If vegan is TRUE, the function should return "I love to eat ___!". Otherwise, the function should return "I do not eat ___.".

Test each of your functions by evaluating appropriate function calls.
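
When checking for missing values inside a condition, note that a comparison like x == NA evaluates to NA (rather than TRUE or FALSE), so that if (x == NA) raises an error. The function is.na() provides a safe test. Here is a minimal sketch that uses a hypothetical function greet (not part of the exercise) to illustrate the pattern:

```r
# Hypothetical example function (not part of the exercise):
greet <- function(name) {

  # is.na() yields TRUE or FALSE, whereas name == NA would yield NA:
  if (is.na(name)) {
    return("Nobody to greet.")
  }

  paste0("Hello ", name, "!")
}

greet("Ada")
greet(NA)
```

This sketch assumes that name is a single value (i.e., a vector of length 1).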

11.4.3 Exercise 3

Buggy number recognition

This exercise asks you to analyze and correct someone else’s function.

  1. Explain what the following function describe (not to be confused with describe above) is intended to do and why it fails to do so.
describe <- function(x) {
  
  if (x %% 2 == 0)  {print("x is an even number.")} 
  else if (x %% 2 == 1) {print("x is an odd number.")}
  else if (x < 1)   {print("x is too small.")} 
  else if (x > 20)  {print("x is too big.")} 
  else if (x == 13) {print("x is a lucky number.")} 
  else if (x == pi) {print("Let's make a pie!")}
  else {print("x is beyond description.")}
  
}
  2. Repair the describe function to yield the following results:
# Desired results:
describe(0)
#> [1] "x is an even number."
describe(1)
#> [1] "x is an odd number."
describe(13)
#> [1] "x is an odd number."
describe(20)
#> [1] "x is an even number."
describe(21)
#> [1] "x is an odd number."
describe(pi)
#> [1] "Let's make a pie!"
  3. What are the results of describe(NA) and describe("one")? Correct the function to yield appropriate results in both cases.

  4. For what kind of x will describe print "x is beyond description."?
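
When working with chains of if and else if statements, keep in mind that only the branch of the first condition that is TRUE gets evaluated. All later conditions are skipped, even if they would also be TRUE. The following sketch illustrates this with a hypothetical function size_label (not part of the exercise), in which the more specific condition is tested before the more general one:

```r
# Hypothetical example function (not part of the exercise):
size_label <- function(x) {

  if (x > 100) {         # most specific condition first
    "huge"
  } else if (x > 10) {   # only reached if x <= 100
    "large"
  } else {               # only reached if x <= 10
    "small"
  }
}

size_label(500)
size_label(50)
size_label(5)
```

If the first 2 conditions were swapped, size_label(500) would be labeled as "large", as the condition x > 10 would already be TRUE.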

11.4.4 Exercise 4

Tibble charts

This exercise asks you to write a function that extracts rows from tabular inputs based on the top values of some variable.

  1. Write a top_3 function that takes a tibble data and the column number col_nr of a numeric variable as its 2 inputs and returns the top-3 rows of the tibble after it has been sorted (in descending order) by the specified column number.

Use the data of sw <- dplyr::starwars to illustrate your function.

Hint: To write this function, first solve its task for a specific case (e.g., for col_nr = 2). When using the dplyr commands of the tidyverse, a problem you will encounter is that a tibble’s variables are typically referenced by their unquoted names, rather than by their number (or column index). Here are 2 ways to solve this problem:

  • To obtain the unquoted name some_name of a given character string "some_name", you can call !!sym("some_name").

  • Rather than aiming for a tidyverse solution, you could solve the problem with base R commands. In this case, look up and use the command order to re-arrange the rows of a tibble or data frame.
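
Both approaches can be sketched on a small toy table. The data frame df and the object names below are hypothetical and only serve as an illustration. The sketch assumes that the dplyr and rlang packages are installed and sorts in ascending order, so that it needs to be adapted for the exercise:

```r
# Toy data (hypothetical, for illustration only):
df <- data.frame(name  = c("a", "b", "c", "d"),
                 score = c(3, 9, 1, 7))

col_nr   <- 2                  # column index of interest
col_name <- names(df)[col_nr]  # corresponding column name (as a string)

# (a) tidyverse: turn the string into a symbol and unquote it with !!
by_dplyr <- dplyr::arrange(df, !!rlang::sym(col_name))

# (b) base R: order() returns the row indices that sort the column
by_base <- df[order(df[[col_nr]]), ]

by_dplyr$score
by_base$score
```

Both results contain the same rows in the same order. Note that the base R result keeps the original row names, whereas the dplyr result is renumbered.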

  2. What happens in your top_3 function when col_nr refers to a character variable (e.g., dplyr::starwars[ , 1])? Adjust the function so that its result varies by the type of the variable designated by the col_nr argument:

    • if the corresponding variable is a character variable, sort the data in ascending order (alphabetically);
    • if the corresponding variable is a numeric variable, sort the data in descending order (from high to low).
  3. Generalize your top_3 function to a top_n function that returns the top n rows when sorted by col_nr. What would be a good default value for n? What should happen when n = NA and when n > nrow(data)?

Check all your functions with appropriate inputs.

Note: Functions for different tasks and data types

The following three exercises illustrate how functions can use, mix, and merge various data types to solve different tasks. Specifically, they ask you to write functions for

  • visualizing data as plots (Exercise 5),
  • printing numbers as text (Exercise 6),
  • computing with dates (Exercise 7).

11.4.5 Exercise 5

A plotting function

This exercise asks you to write a function that uses some input data for creating a specific type of plot.

  1. Write a function plot_scatter that takes a table (tibble or data frame) with 2 numeric variables x and y as my_data and plots a scatterplot of the values of y by the values of x.

Hint: First write a ggplot command that creates a scatterplot of my_data. Then wrap a function plot_scatter around this command that takes my_data as its argument.

Test your function by using the following 2 tibbles tb_1 and tb_2 as my_data:

set.seed(101)
n_num <- 100
x_val <- runif(n = n_num, min = 30, max = 90)
y_val <- runif(n = n_num, min = 30, max = 90)

tb_1 <- tibble::tibble(x = x_val, y = y_val)
tb_2 <- tibble::tibble(x = x_val, y = x_val + rnorm(n = n_num, mean = 0, sd = 10))

names(tb_1)
#> [1] "x" "y"
  2. For any table my_data that contains 2 numeric variables x and y, we can fit a linear model as follows:
my_data <- tb_1

my_lm <- lm(y ~ x, data = my_data)
my_lm
#> 
#> Call:
#> lm(formula = y ~ x, data = my_data)
#> 
#> Coefficients:
#> (Intercept)            x  
#>     53.2340       0.1318

# Get the model's intercept and slope values:
my_lm$coefficients[1]  # intercept
#> (Intercept) 
#>    53.23402
my_lm$coefficients[2]  # slope
#>         x 
#> 0.1318431

Incorporate the fit of a linear model into your plot_scatter function. Use a linear model to add a line to your plot that shows the prediction of the linear model (in a color that can be set by an optional col argument).
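
The prediction of a linear model for a given value of x is the sum of its intercept and the product of its slope and x. As a minimal sketch, the following code computes these predictions by hand and compares them with the values returned by predict(). The toy data d and the names fit, b_0, b_1, and y_hat are assumptions made for this illustration and not part of the exercise:

```r
# Toy data (hypothetical, for illustration only):
set.seed(1)
d   <- data.frame(x = 1:10)
d$y <- 2 + 3 * d$x + rnorm(n = 10, mean = 0, sd = 1)

fit <- lm(y ~ x, data = d)

# Extract intercept and slope (dropping their names):
b_0 <- unname(coef(fit)[1])
b_1 <- unname(coef(fit)[2])

# Predictions computed by hand vs. by predict():
y_hat <- b_0 + b_1 * d$x
all.equal(y_hat, unname(predict(fit)))
```

Here, coef(fit) is an alternative to fit$coefficients. In a plot, the 2 values b_0 and b_1 fully describe the line to be drawn.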

11.4.6 Exercise 6

Printing numbers (as characters)

A common problem when printing numbers in text is that the number of digits to be printed (i.e., characters or symbols) depends on the number’s value. As a result, the numbers in a series often differ in length, which makes it hard to align them (e.g., in tables). A potential solution is to pad numbers with leading or trailing zeros (or empty spaces).

The function num_as_char() of the ds4psy package provides a (sub-optimal) solution to this problem and has 3 main arguments:

  • x for the number(s) to be formatted (required);
  • n_pre_dec for the number of digits prior to the decimal separator (default n_pre_dec = 2);
  • n_dec to specify the number of digits after the decimal separator (default n_dec = 2).

Additional arguments specify the symbol sym to use for filling up digit positions and the symbol used as decimal separator sep.

  1. Experiment with num_as_char() to check its functionality and limits.
  2. Write your own function num_to_char() that achieves the same (or a similar) functionality.

Hint: The num_as_char() function of the ds4psy package also works for vectors, but uses 2 for loops to achieve this. Try writing a simpler solution that works for individual numbers as x (i.e., scalars, or vectors of length 1). If you get stuck, try adapting parts of the solution used by num_as_char().
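
For comparison, base R already provides functions for padding numbers to a fixed width. The following sketch uses formatC() and sprintf() and is not how num_as_char() is implemented. It merely shows what a padded result can look like, assuming 3 digits before and 2 digits after the decimal separator (i.e., a total width of 6 characters):

```r
x <- 3.14159

# formatC(): fixed notation ("f"), 2 decimals, zero-padded to width 6
a <- formatC(x, width = 6, format = "f", digits = 2, flag = "0")

# sprintf(): the same format, expressed as a C-style format string
b <- sprintf("%06.2f", x)

a
b
nchar(a)  # number of characters in the result

# sprintf() is vectorized:
sprintf("%06.2f", c(1, 22.5))
```

Note that both functions round the number to the specified number of decimals and do not truncate numbers that require more digits than the specified width.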

11.4.7 Exercise 7

Computing with dates

  1. Use what you have learned in Chapter 10: Time data to write a function that takes a date or time (e.g., the date of someone’s birthday) as its input and returns the corresponding age (as a number, rounded to completed years) as output.

  2. Check your function with appropriate examples.

  3. Does your solution also work when multiple dates are entered (as a vector)?
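
As a reminder of some basics, the following sketch shows how base R parses dates, computes differences between them, and extracts their year components. The dates and object names are hypothetical, and the sketch is not a solution. Rather, it illustrates why merely subtracting years is not sufficient:

```r
d_1 <- as.Date("2000-02-29")  # a (hypothetical) birthday
d_2 <- as.Date("2020-02-28")  # a (hypothetical) reference date

# Difference between dates (a difftime object, in days):
d_2 - d_1

# Year components as numbers:
y_1 <- as.numeric(format(d_1, "%Y"))
y_2 <- as.numeric(format(d_2, "%Y"))

# Difference in calendar years (here: one more than the completed years,
# as the birthday has not yet been reached in the year of d_2):
y_2 - y_1

# as.Date() is vectorized:
as.Date(c("2000-02-29", "1990-12-31"))
```

A complete solution also needs to take into account whether the birthday has already occurred in the year of the reference date.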

This concludes our exercises on creating new functions.