Section 5 Functions

So far, we’ve used a few built-in tools like mean(). If you’re coming from a language like SPSS, you might think of these tools as “commands,” but in R we call them functions. Functions are reusable chunks of code that take a certain set of inputs (also called arguments) and either produce an output (a return value) or perform a task, such as showing a plot. When you plug a specific set of inputs into a function and “run” it, we say that you’re calling the function.

The mean() function in R takes a vector of numbers as input and returns a single number as output. The sd() function, which computes the standard deviation, works the same way:

mean(data$extraversion)
## [1] 12.37298
sd(data$extraversion)
## [1] 3.893686
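
To try the same calls without the dataset, here is a minimal sketch using a small made-up vector. The name scores and its values are hypothetical and are not part of data:

```r
# A made-up vector of five scores (hypothetical values for illustration)
scores <- c(10, 12, 15, 9, 14)

# Input: a numeric vector. Output: a single number.
avg <- mean(scores)
spread <- sd(scores)

avg
## [1] 12
length(avg)  # the return value has length 1
## [1] 1
round(spread, 2)
## [1] 2.55
```

Storing the return value in a name like avg lets you reuse it later instead of calling the function again.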

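You can also look at means by group using the describeBy() function from the psych package, which has to be loaded before the function is available. In this example, we look at means (and other descriptives) by treatment group.

# describeBy() comes from the psych package
library(psych)
describeBy(data$extraversion, data$treatment)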
## 
##  Descriptive statistics by group 
## group: 1
##    vars   n  mean   sd median trimmed  mad min max range  skew kurtosis   se
## X1    1 329 12.46 3.87     13    12.5 4.45   4  21    17 -0.12    -0.62 0.21
## ------------------------------------------------------------------------------------------ 
## group: 2
##    vars   n  mean  sd median trimmed  mad min max range skew kurtosis   se
## X1    1 710 12.29 3.8     12   12.33 4.45   2  22    20 -0.1    -0.24 0.14
## ------------------------------------------------------------------------------------------ 
## group: 3
##    vars   n  mean  sd median trimmed  mad min max range  skew kurtosis   se
## X1    1 382 12.46 4.1     13   12.58 4.45   2  23    21 -0.25    -0.34 0.21
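
If you only need the group means, base R's tapply() is one alternative. This is a swapped-in technique, not what describeBy() does internally, and the toy data frame below is made up for illustration:

```r
# Hypothetical toy data: six participants in three treatment groups
toy <- data.frame(
  extraversion = c(10, 12, 14, 8, 16, 11),
  treatment    = c(1, 1, 2, 2, 3, 3)
)

# Apply mean() to extraversion separately within each treatment group
group_means <- tapply(toy$extraversion, toy$treatment, mean)

# group 1: 11, group 2: 11, group 3: 13.5
group_means
```

Here mean itself is passed as an argument, so a function can be an input to another function.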

5.1 Arguments

The arguments of a function are the set of inputs it accepts. Some inputs supply the data used to calculate the output, while others are options that affect how the calculation is done.

If we look at the arguments for the default mean() function in R, which you can see by entering ?mean in the console, we find:

mean(x, trim = 0, na.rm = FALSE, ...)

Since the first argument, x, appears on its own with no default value, it’s a mandatory argument. You have to provide a value for x, otherwise you get an error:

mean()
## Error in mean.default() : argument "x" is missing, with no default
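
Once a value for x is supplied, the call works. As a small sketch with a made-up vector (scores is a hypothetical name), the value can be passed either by position or by name, and both calls give the same result:

```r
# Hypothetical values for illustration
scores <- c(10, 12, 15, 9, 14)

by_position <- mean(scores)      # x is matched by position (first argument)
by_name     <- mean(x = scores)  # x is matched by name

identical(by_position, by_name)
## [1] TRUE
```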

Arguments like trim = 0 are optional when you’re calling the function: the value after the = is the default value that will be used if you don’t supply one. The default value usually hints at what type of input the argument accepts (numeric, logical, character, etc.), but it’s still worth reading the function’s help page for the details:

trim = the fraction (0 to 0.5) of observations to be trimmed from each end of x before the mean is computed. Values of trim outside that range are taken as the nearest endpoint.

mean(data$personality_total)
## [1] 23.78536
# This is the same as above, since this is already the default
mean(data$personality_total, trim = 0)
## [1] 23.78536
# A different setting from the default
mean(data$personality_total, trim = 0.1)
## [1] 23.87775
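
To make the effect of these optional arguments easier to check by hand, here is a minimal sketch with made-up vectors (values and with_missing are hypothetical names). With ten observations, trim = 0.1 drops one value from each end before averaging. The second half shows na.rm, which appears in the usage above: when set to TRUE, missing values are removed before the mean is computed.

```r
# Ten made-up values, including one extreme score
values <- c(2, 4, 4, 5, 5, 6, 6, 7, 8, 53)

mean(values)              # all ten values: 100 / 10
## [1] 10
mean(values, trim = 0.1)  # drops 2 and 53, then averages the rest: 45 / 8
## [1] 5.625

# A made-up vector with one missing value
with_missing <- c(1, 2, NA, 6)

mean(with_missing)                # default na.rm = FALSE
## [1] NA
mean(with_missing, na.rm = TRUE)  # NA removed first: (1 + 2 + 6) / 3
## [1] 3
```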