Chapter 5 Understanding Tidyverse Functions
So far we have used mutate from dplyr, unnest from tidyr, read_csv from readr, map from purrr, as_tibble from tibble, and str_to_title from stringr. Since we are using so many tidyverse packages, we may as well load them all at once by running library(tidyverse). After doing so, the functions from every core tidyverse package, including all of those above, are available to us.
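As a quick check, here is a minimal sketch, assuming the tidyverse package is installed. The input string and the small data frame are made up for illustration.

```r
# Attach the core tidyverse packages in one call
library(tidyverse)

# str_to_title() comes from stringr, which is now attached
title <- str_to_title("understanding tidyverse functions")
title

# as_tibble() comes from tibble, which is also attached
tbl <- as_tibble(data.frame(x = 1:3))
tbl
```

Neither stringr nor tibble was loaded by name; both became available through library(tidyverse).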
5.1 mutate
We already went over mutate. This function is used both to create new columns and to overwrite existing columns.
We can overwrite an existing column like so:
df <- mutate(df, year = as.numeric(year))
Having a numeric year is useful if, for example, we want to filter the data by years between 1960 and 1980. This is done with filter.
df <- filter(df, 1960 < year & year < 1980)
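To see both uses of mutate together with filter, here is a minimal sketch. The contents of df and the decade column are hypothetical, chosen only so the steps can be run end to end.

```r
library(tidyverse)

# Hypothetical data: year is stored as text, as it often is after reading a file
df <- tibble(
  name = c("a", "b", "c", "d"),
  year = c("1955", "1965", "1975", "1985")
)

# Overwrite an existing column: convert year from character to numeric
df <- mutate(df, year = as.numeric(year))

# Create a new column: the decade each year falls in
df <- mutate(df, decade = floor(year / 10) * 10)

# Keep only the rows with years strictly between 1960 and 1980
df <- filter(df, 1960 < year & year < 1980)
df
```

Because both comparisons use <, rows with a year of exactly 1960 or 1980 would be dropped. Use <= instead to keep them.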
5.2 Piping with %>%
Notice that the examples in the function documentation for mutate (https://dplyr.tidyverse.org/reference/mutate.html) and filter use the symbol %>%.
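The %>% operator, called the pipe, is very simple. It sends the value on its left side to the function on its right side, as that function's first argument. The other property to remember is that %>% takes mathematical priority after brackets and exponents but before division, multiplication, addition, and subtraction. So if you know BEDMAS, then you also know BE%>%DMAS.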
sqrt(5)
>> [1] 2.236068
is the same as
5 %>% sqrt
>> [1] 2.236068
But
5 * 5 %>% sqrt
>> [1] 11.18034
is not the same as
(5 * 5) %>% sqrt
>> [1] 5
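The remaining letters of BE%>%DMAS can be checked the same way. The following is a minimal sketch with made-up numbers, assuming the magrittr package (which provides %>%) is installed. The variable names a, b, and d are only for illustration.

```r
library(magrittr)  # provides %>%; dplyr and other tidyverse packages re-export it

# ^ binds tighter than %>%, so this is the same as sqrt(4 ^ 2)
a <- 4 ^ 2 %>% sqrt

# %>% binds tighter than *, so this is the same as 2 * sqrt(16)
b <- 2 * 16 %>% sqrt

# %>% binds tighter than binary -, so this is the same as 20 - sqrt(16)
d <- 20 - 16 %>% sqrt

c(a, b, d)
```

When in doubt, wrap the left side in brackets, as in (5 * 5) %>% sqrt, so the intent does not depend on remembering the priority rules.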
For a more relevant example,
df <- mutate(df, year = as.numeric(year))
is the same as
df <- df %>% mutate(year = as.numeric(year))
which is the same as
df <- df %>% mutate(year = year %>% as.numeric)
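The pipe becomes more useful once several steps are chained. Here is a minimal sketch comparing nested calls with a piped chain. The contents of df and the names nested and piped are hypothetical.

```r
library(tidyverse)

# Hypothetical data with year stored as text
df <- tibble(year = c("1955", "1965", "1975", "1985"))

# Nested calls are read from the inside out
nested <- filter(mutate(df, year = as.numeric(year)), 1960 < year & year < 1980)

# The same steps with %>% are read from top to bottom
piped <- df %>%
  mutate(year = as.numeric(year)) %>%
  filter(1960 < year & year < 1980)

# Both approaches should produce the same table
all.equal(nested, piped)
```

Each %>% passes the result of the previous step as the first argument of the next function, which is why df no longer appears inside mutate or filter.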
5.3 Grouping with group_by
Notice that the examples in the function documentation for filter (https://dplyr.tidyverse.org/reference/filter.html) use the function group_by.
group_by groups the rows of the data by one or more columns, so that the function(s) that follow group_by operate within each group rather than on the whole table.
Let us create a simple example to demonstrate… (To be continued)
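As a minimal sketch of what grouping changes, consider a small made-up table. The column names group, value, and avg are hypothetical and are not taken from the data used earlier in this book.

```r
library(tidyverse)

# Hypothetical data: two groups with two values each
df <- tibble(
  group = c("a", "a", "b", "b"),
  value = c(1, 3, 10, 20)
)

# Without grouping, mean(value) is computed over all rows
ungrouped <- df %>% mutate(avg = mean(value))

# With group_by, the same mutate computes the mean within each group
grouped <- df %>%
  group_by(group) %>%
  mutate(avg = mean(value)) %>%
  ungroup()  # drop the grouping so later steps see the whole table again

ungrouped
grouped
```

The only difference between the two pipelines is the group_by step, yet avg holds one overall mean in the first result and one mean per group in the second.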