10.5 Additional aggregation functions
There are many, many other aggregation functions that I haven’t covered in this chapter, mainly because I rarely use them. In fact, that’s a good reminder of a peculiarity of R: there are many methods to achieve the same result, and your choice of which method to use will often come down to which one you like the most.
To easily calculate means (or sums) across all rows or columns in a matrix or dataframe, use rowMeans(), colMeans(), rowSums(), or colSums().
For example, imagine we have the following data frame representing scores from a quiz with 5 questions, where each row represents a student and each column represents a question. Each value can be either 1 (correct) or 0 (incorrect).
# Some exam scores
exam <- data.frame("q1" = c(1, 0, 0, 0, 0),
                   "q2" = c(1, 0, 1, 1, 0),
                   "q3" = c(1, 0, 1, 0, 0),
                   "q4" = c(1, 1, 1, 1, 1),
                   "q5" = c(1, 0, 0, 1, 1))
Let’s use rowMeans() to get the average scores for each student:
# What percent did each student get correct?
rowMeans(exam)
## [1] 1.0 0.2 0.6 0.6 0.4
Now let’s use colMeans() to get the average scores for each question:
# What percent of students got each question correct?
colMeans(exam)
##  q1  q2  q3  q4  q5
## 0.2 0.6 0.4 1.0 0.6
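The sum versions work the same way. Here is a minimal sketch using base R’s rowSums() and colSums() on the same exam data. The object names total_correct and n_correct are just illustrative choices of mine.

```r
# Same quiz data as above
exam <- data.frame("q1" = c(1, 0, 0, 0, 0),
                   "q2" = c(1, 0, 1, 1, 0),
                   "q3" = c(1, 0, 1, 0, 0),
                   "q4" = c(1, 1, 1, 1, 1),
                   "q5" = c(1, 0, 0, 1, 1))

# How many questions did each student get correct?
total_correct <- rowSums(exam)
total_correct

# How many students got each question correct?
n_correct <- colSums(exam)
n_correct
```

Because the data are coded 0 and 1, the sums are simply counts of correct answers, and dividing them by the number of questions (or students) gives back the means shown earlier.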
rowMeans() and colMeans() only work on numeric columns. If you try to apply them to non-numeric data, you’ll receive an error.
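To see that error and one way around it, here is a small sketch. The student column and its values are made up for illustration, and picking out the numeric columns with sapply() and is.numeric() is just one possible approach, not the only one.

```r
# Same quiz data as above
exam <- data.frame("q1" = c(1, 0, 0, 0, 0),
                   "q2" = c(1, 0, 1, 1, 0),
                   "q3" = c(1, 0, 1, 0, 0),
                   "q4" = c(1, 1, 1, 1, 1),
                   "q5" = c(1, 0, 0, 1, 1))

# Hypothetical: add a non-numeric column of student names
exam_named <- data.frame(student = c("Ann", "Bo", "Cy", "Di", "Ed"), exam)

# rowMeans() stops with an error on this data frame because of the
# non-numeric column. try() captures the error without halting the script.
result <- try(rowMeans(exam_named), silent = TRUE)
class(result)

# One workaround: keep only the numeric columns before averaging
is_num <- sapply(exam_named, is.numeric)
row_means <- rowMeans(exam_named[, is_num])
row_means
```

Once the non-numeric column is dropped, the row means are the same as those for the original exam data frame.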
There is an entire class of apply functions in R that apply functions to groups of data. Functions such as tapply() and lapply() each work very similarly to aggregate(). For example, you can calculate the average length of movies by genre with tapply() as follows.
with(movies, tapply(X = time,        # DV is time
                    INDEX = genre,   # IV is genre
                    FUN = mean,      # function is mean
                    na.rm = TRUE))   # Ignore missing
##              Action           Adventure        Black Comedy
##                 113                 106                 113
##              Comedy Concert/Performance         Documentary
##                  99                  78                  69
##               Drama              Horror     Multiple Genres
##                 116                  99                 114
##             Musical             Reality     Romantic Comedy
##                 113                  44                 107
##   Thriller/Suspense             Western
##                 112                 121
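To make the comparison with aggregate() concrete, here is a minimal sketch on a tiny made-up data frame. The films object and its values are hypothetical and are not the real movies data; I use it only so the example runs on its own.

```r
# Hypothetical toy data (NOT the real movies dataset)
films <- data.frame(genre = c("Action", "Action", "Comedy", "Comedy", "Drama"),
                    time  = c(120, 110, 95, NA, 130))

# tapply(): one mean per genre, returned as a named vector
by_tapply <- with(films, tapply(X = time,
                                INDEX = genre,
                                FUN = mean,
                                na.rm = TRUE))
by_tapply

# aggregate(): the same means, returned as a data frame.
# The formula method drops rows with missing values by default.
by_aggregate <- aggregate(time ~ genre, data = films, FUN = mean)
by_aggregate
```

Both calls give the same group means. The difference is the container: tapply() hands back a named vector, while aggregate() hands back a data frame with one row per group.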
The apply functions all work very similarly; their main difference is in the structure of their output. For example, lapply() returns a list (we’ll cover lists in a future chapter).
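As a quick illustration of that difference in output structure, the sketch below applies mean() to each column of the exam data with lapply(), and then again with sapply(), a base R relative that simplifies the result to a vector when it can. The object names as_list and as_vector are my own.

```r
# Same quiz data as above
exam <- data.frame("q1" = c(1, 0, 0, 0, 0),
                   "q2" = c(1, 0, 1, 1, 0),
                   "q3" = c(1, 0, 1, 0, 0),
                   "q4" = c(1, 1, 1, 1, 1),
                   "q5" = c(1, 0, 0, 1, 1))

# lapply(): applies mean() to each column and returns a list
as_list <- lapply(exam, mean)
is.list(as_list)     # TRUE
as_list$q2           # pull one element out of the list by name

# sapply(): same calculation, simplified to a named numeric vector
as_vector <- sapply(exam, mean)
is.list(as_vector)   # FALSE
as_vector
```

For this data frame, the values in as_vector match what colMeans(exam) returns; only the route to get there differs.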