16.2 The structure of a custom function

A function is simply an object that (usually) takes some arguments, performs some action (executes some R code), and then (usually) returns some output. This might sound complicated, but you’ve been using functions pre-defined in R throughout this book. For example, the function mean() takes a numeric vector as an argument, and then returns the arithmetic mean of that vector as a single scalar value.

Your custom functions will have the following 4 attributes:

  1. Name: What is the name of your function? You can give it any valid object name. However, avoid reusing the name of an existing function: your version will mask the original, and any call to that name will run your code instead.

  2. Arguments: What are the inputs to the function? Does it need a vector of numeric data? Or some text? You can specify as many inputs as you want.

  3. Actions: What do you want the function to do with the inputs? Create a plot? Calculate a statistic? Run a regression analysis? This is where you’ll write all the real R code behind the function.

  4. Output: What do you want the code to return when it’s finished with the actions? Should it return a scalar statistic? A vector of data? A dataframe?

Here’s how your function will look in R. When creating functions, you’ll use two new functions (Yes, you use functions to create functions! Very Inception-y), called function() and return(). You’ll put the function inputs as arguments to the function() function, and the output(s) as argument(s) to the return() function.

# The basic structure of a function
NAME <- function(ARGUMENTS) {

  ACTIONS

  return(OUTPUT)

}
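To see why reusing an existing function name is risky, here is a small sketch. The replacement function is made up purely for illustration. Defining your own mean() doesn't break R, but it does mask the built-in version until you remove yours:

```r
# A hypothetical function that reuses the name of a built-in function
mean <- function(x) {

  return("Arrr, this is not the real mean!")

}

mean(c(1, 2, 3))        # Runs our version, which masks R's built-in mean()
base::mean(c(1, 2, 3))  # The built-in version is still reachable with base::

rm(mean)                # Remove our version from the workspace
mean(c(1, 2, 3))        # Back to the built-in mean(), which returns 2
```

Once rm() has removed the custom object, R finds the built-in function again.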

16.2.1 Creating my.mean()

Let’s create a custom function called my.mean() that does the same basic job as the mean() function in R. This function will take a vector x as an argument, create a new object called output that is the mean of all the elements of x (by summing all the values in x and dividing by the length of x), and then return the output object to the user.

# Create the function my.mean()
my.mean <- function(x) {   # Single input called x

  output <- sum(x) / length(x) # Calculate output

  return(output)  # Return output to the user after running the function

}

Try running the code above. When you do, nothing obvious happens. However, R has now stored the new function my.mean() as an object in your current workspace for later use. To use it, we can call our function like any other function in R. Let’s call our new function on some data and make sure that it gives us the same result as mean():

data <- c(3, 1, 6, 4, 2, 8, 4, 2)
my.mean(data)
## [1] 3.75
mean(data)
## [1] 3.75

As you can see, our new function my.mean() gave the same result as R’s built-in mean() function! Obviously, this was a bit of a waste of time as we simply recreated a built-in R function. But you get the idea…
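If you want to convince yourself that R really did store my.mean() as an object, a quick check along these lines should do it. The vector name booty is a made-up example, and the function is defined again here so the block runs on its own:

```r
# Define my.mean() again so this example is self-contained
my.mean <- function(x) {

  output <- sum(x) / length(x)

  return(output)

}

exists("my.mean")   # TRUE: the function is stored as an object
class(my.mean)      # "function"

# Hypothetical data: gold found on four raids
booty <- c(10, 20, 30, 40)

my.mean(booty)                           # 25
all.equal(my.mean(booty), mean(booty))   # TRUE: matches the built-in mean()
```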

16.2.2 Specifying multiple inputs

You can create functions with as many inputs as you’d like (even 0!). Let’s do an example. We’ll create a function called oh.god.how.much.did.i.spend that helps hungover pirates figure out how much gold they spent after a long night of pirate debauchery. The function will have three inputs: grogg (the number of mugs of grogg the pirate drank), port (the number of glasses of port the pirate drank), and crabjuice (the number of shots of fermented crab juice the pirate drank). Based on these inputs, the function will calculate how much gold the pirate spent. We’ll assume that a mug of grogg costs 1 gold, a glass of port costs 3, and a shot of fermented crab juice costs 10.

oh.god.how.much.did.i.spend <- function(grogg,
                                        port,
                                        crabjuice) {

  output <- grogg * 1 + port * 3 + crabjuice * 10

  return(output)
}

Now let’s test our new function with a few different values for the inputs grogg, port, and crabjuice. How much gold did Tamara, who had 10 mugs of grogg, 3 glasses of port, and 0 shots of crab juice, spend?

oh.god.how.much.did.i.spend(grogg = 10,
                            port = 3,
                            crabjuice = 0)
## [1] 19

Looks like Tamara spent 19 gold last night. Ok, now how about Cosima, who didn’t drink any grogg or port, but went a bit nuts on the crab juice:

oh.god.how.much.did.i.spend(grogg = 0,
                            port = 0,
                            crabjuice = 7)
## [1] 70

Cosima’s taste for crab juice set her back 70 gold pieces.
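The calls above name every argument, but that is not the only option. Here is a rough sketch of how R matches inputs to arguments, with the function repeated so the block runs on its own. Unnamed values are matched by position, named values can come in any order, and because R’s arithmetic is vectorized, you can pass vectors to tally up several pirates at once. The object names tamara, tamara.named, and both are just illustrative.

```r
oh.god.how.much.did.i.spend <- function(grogg,
                                        port,
                                        crabjuice) {

  output <- grogg * 1 + port * 3 + crabjuice * 10

  return(output)
}

# Unnamed inputs are matched by position: grogg, then port, then crabjuice
tamara <- oh.god.how.much.did.i.spend(10, 3, 0)

# Named inputs can be given in any order
tamara.named <- oh.god.how.much.did.i.spend(crabjuice = 0,
                                            grogg = 10,
                                            port = 3)

tamara == tamara.named   # TRUE: both calls give 19

# Vectors work too: the first element is Tamara, the second is Cosima
both <- oh.god.how.much.did.i.spend(grogg = c(10, 0),
                                    port = c(3, 0),
                                    crabjuice = c(0, 7))
both   # 19 70
```

Naming your arguments takes more typing, but it makes the call easier to read and protects you from mixing up the order.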

16.2.3 Including default values for arguments

When you create functions with many inputs, you’ll probably want to start adding default values. Default values are input values which the function will use if the user does not specify their own. Most functions that you’ve used so far have default values. For example, the hist() function will use default values for inputs like main and xlab if you don’t specify them. Including defaults can save the user a lot of time because it keeps them from having to specify every possible input to a function.

To add a default value to a function input, just include = DEFAULT after the input. For example, let’s add a default value of 0 to each argument in the oh.god.how.much.did.i.spend function. By doing this, R will set any inputs that the user does not specify to 0 – in other words, it will assume that if you don’t tell it how many drinks of a certain type you had, then you must have had 0.

# Including default values for function arguments
oh.god.how.much.did.i.spend <- function(grogg = 0,
                                        port = 0,
                                        crabjuice = 0) {

  output <- grogg * 1 + port * 3 + crabjuice * 10

  return(output)
}

Let’s test the new version of our function with data from Hyejeong, who had 5 glasses of port but no grogg or crab juice. Because 0 is the default, we can just ignore these arguments:

oh.god.how.much.did.i.spend(port = 5)
## [1] 15

Looks like Hyejeong only spent 15 gold by sticking with port.
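To see what the defaults are buying us, here is a small side-by-side sketch. The names spend.no.defaults and spend.with.defaults are made up for this comparison, and tryCatch() is used only so the error from the first call is captured as text instead of stopping the script:

```r
# Hypothetical version WITHOUT default values
spend.no.defaults <- function(grogg, port, crabjuice) {

  output <- grogg * 1 + port * 3 + crabjuice * 10

  return(output)
}

# Hypothetical version WITH default values of 0
spend.with.defaults <- function(grogg = 0, port = 0, crabjuice = 0) {

  output <- grogg * 1 + port * 3 + crabjuice * 10

  return(output)
}

# Without defaults, leaving out an input causes an error.
# tryCatch() stores the error message rather than halting.
msg <- tryCatch(spend.no.defaults(port = 5),
                error = function(e) conditionMessage(e))
msg   # The message complains about a missing argument

# With defaults, any missing inputs are treated as 0
spend.with.defaults(port = 5)                  # 15
spend.with.defaults(grogg = 2, crabjuice = 1)  # 12
spend.with.defaults()                          # 0
```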