Chapter 5 Understanding Tidyverse Functions

So far we have used mutate from dplyr, unnest from tidyr, read_csv from readr, map from purrr, as_tibble from tibble, and str_to_title from stringr. Since we are using so many tidyverse packages, we may as well load them all at once by running library(tidyverse). After doing so, the functions from all of these packages, along with the other core tidyverse packages, are available to us.
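
As a quick check that a single library call is enough, here is a minimal sketch. It assumes the tidyverse is already installed, and the data frame `demo` is made up purely for illustration.

```r
# Attach the core tidyverse packages in one call
# (assumes the tidyverse has been installed beforehand).
library(tidyverse)

# `demo` is a hypothetical data frame used only for this illustration.
demo <- as_tibble(data.frame(name = c("ada lovelace", "alan turing"),
                             stringsAsFactors = FALSE))

# mutate (dplyr) and str_to_title (stringr) both work without further library() calls.
demo <- mutate(demo, name = str_to_title(name))
demo$name
# >> [1] "Ada Lovelace" "Alan Turing"
```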

5.1 mutate

We already went over mutate. This function is used to both create new columns and overwrite existing columns.

We can overwrite an existing column like so:

df <- mutate(df, year = as.numeric(year))
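
Creating a new column uses the same syntax; the only difference is that the name on the left of the = does not exist yet. The sketch below shows both cases side by side. The data frame `df_demo` and the `decade` column are hypothetical and are not part of the data used in this book.

```r
library(tidyverse)

# Made-up data for illustration; `year` starts out as text.
df_demo <- tibble(title = c("A", "B", "C"),
                  year  = c("1958", "1965", "1979"))

# Overwrite: `year` already exists, so it is replaced by its numeric version.
df_demo <- mutate(df_demo, year = as.numeric(year))

# Create: `decade` does not exist yet, so it is added as a new column.
df_demo <- mutate(df_demo, decade = floor(year / 10) * 10)

df_demo$decade
# >> [1] 1950 1960 1970
```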

Having a numeric year is useful if, for example, we want to keep only the rows with years between 1960 and 1980. This is done with filter. Note that the strict inequalities used here exclude the years 1960 and 1980 themselves.

df <- filter(df, 1960 < year & year < 1980)
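
Whether the boundary years are kept depends on how the condition is written. The sketch below compares the strict inequalities with dplyr's between helper, which includes both end points. The tibble `years` is made up for illustration.

```r
library(tidyverse)

# Made-up values for illustration.
years <- tibble(year = c(1959, 1960, 1970, 1980, 1981))

# Strict inequalities: 1960 and 1980 are dropped.
strict <- filter(years, 1960 < year & year < 1980)
strict$year
# >> [1] 1970

# between() is inclusive on both ends: 1960 and 1980 are kept.
inclusive <- filter(years, between(year, 1960, 1980))
inclusive$year
# >> [1] 1960 1970 1980
```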

5.2 Piping with %>%

Notice that the examples in the function documentation for mutate (https://dplyr.tidyverse.org/reference/mutate.html) and filter use the symbol %>%.

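The %>% operator, known as the pipe, is very simple. It sends the left side to the right side, where it becomes the first argument of the function. The other property to remember is its mathematical priority: %>% is evaluated after brackets and exponents, but before division, multiplication, addition, and subtraction. So if you know BEDMAS, then you also know BE%>%DMAS.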

sqrt(5) 
>> [1] 2.236068

is the same as

5 %>% sqrt
>> [1] 2.236068

But

5 * 5 %>% sqrt
>> [1] 11.18034

is not the same as

(5 * 5) %>% sqrt
>> [1] 5
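
The same precedence rule applies to the other arithmetic operators. As a small sketch (the numbers are arbitrary), an exponent is evaluated before the pipe, while addition is evaluated after it:

```r
library(tidyverse)

# ^ binds tighter than %>%, so the power is computed first.
a <- 2^2 %>% sqrt      # same as sqrt(2^2)
a
# >> [1] 2

# %>% binds tighter than +, so only the 8 is piped into sqrt.
b <- 1 + 8 %>% sqrt    # same as 1 + sqrt(8)
b
# >> [1] 3.828427
```

When in doubt, wrapping the left side in brackets makes the intent explicit.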

For a more relevant example,

df <- mutate(df, year = as.numeric(year))

is the same as

df <- df %>% mutate(year = as.numeric(year))

which is the same as

df <- df %>% mutate(year = year %>% as.numeric)
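
The pipe becomes more useful once several steps are chained together, because each step reads in the order it is carried out. The following is a minimal sketch that combines the mutate and filter calls from section 5.1. The tibble `films` is hypothetical.

```r
library(tidyverse)

# Made-up data for illustration; `year` is stored as text.
films <- tibble(title = c("A", "B", "C"),
                year  = c("1958", "1965", "1979"))

# Each %>% passes the result of the previous step on as the
# first argument of the next function.
films_6080 <- films %>%
  mutate(year = as.numeric(year)) %>%
  filter(1960 < year & year < 1980)

films_6080$title
# >> [1] "B" "C"
```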

5.3 Grouping with group_by

Notice that the examples in the function documentation for filter (https://dplyr.tidyverse.org/reference/filter.html) use the function group_by.

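group_by splits the data into groups defined by the values of one or more columns. The function(s) that follow group_by, such as filter, mutate, or summarise, then operate within each group rather than on the data as a whole.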
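
As a rough sketch of what this means in practice, the same filter call can give different results with and without group_by. The tibble `sales` and its columns are made up for illustration.

```r
library(tidyverse)

# Made-up data for illustration.
sales <- tibble(
  region = c("north", "north", "south", "south", "south"),
  amount = c(10, 20, 30, 50, 70)
)

# Without group_by, each amount is compared with the overall mean (36).
ungrouped <- sales %>%
  filter(amount > mean(amount))
ungrouped$amount
# >> [1] 50 70

# With group_by, each amount is compared with the mean of its own region
# (15 for north, 50 for south).
grouped <- sales %>%
  group_by(region) %>%
  filter(amount > mean(amount))
grouped$amount
# >> [1] 20 70
```

The result of the grouped version is still grouped; calling ungroup afterwards returns an ordinary tibble.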

Let us create a simple example to demonstrate… (To be continued)