Pipes with dplyr
Pipes allow us to chain together the dplyr functions just introduced. For example, suppose we want to keep only the flowers whose sepal length is below the overall average, and then count how many of each species remain. We store the result in a variable, less_average, because you will usually not want a long table printing out in your document.
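library(dplyr)
library(knitr)  # provides kable()

# Keep flowers with below-average sepal length, then count them per species
less_average <- iris %>%
  filter(Sepal.Length < mean(Sepal.Length)) %>%
  select(Sepal.Length, Species) %>%
  group_by(Species) %>%
  tally()
kable(head(less_average))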
| Species    |  n |
|------------|---:|
| setosa     | 50 |
| versicolor | 24 |
| virginica  |  6 |
Take a minute to think about what this means. Our original dataset contained 150 observations, 50 from each species, yet all 50 setosa flowers appear here, compared with only 24 versicolor and 6 virginica. In other words, sepal length alone already separates the species to some degree. If you look back at Fisher’s first paragraph, the aim of his work was to combine the measurements as a way to classify a flower without knowing its species to begin with.
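One way to check this observation is to compare, for each species, the total number of flowers with the number that fall below the overall average. The following is a minimal sketch rather than part of the original analysis; the names overall_mean, below_by_species, n_total, n_below and max_sepal_length are chosen here purely for illustration.

```r
library(dplyr)

# Compute the overall mean first: inside summarise() on grouped data,
# mean(Sepal.Length) would be the mean within each species instead.
overall_mean <- mean(iris$Sepal.Length)

below_by_species <- iris %>%
  group_by(Species) %>%
  summarise(
    n_total = n(),                               # flowers per species
    n_below = sum(Sepal.Length < overall_mean),  # flowers below the overall mean
    max_sepal_length = max(Sepal.Length)         # largest sepal length per species
  )

below_by_species
```

The n_below column should reproduce the counts in the table above, and comparing max_sepal_length with overall_mean shows why no setosa flower was filtered out.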
Questions
Can you use pipes together with plots and dplyr functions to determine which measurements, if any, would help a plant scientist discriminate between species?
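As a possible starting point, the sketch below pipes a dplyr result into a plot for the one measurement already examined, sepal length. It assumes base R's boxplot() is acceptable for plotting (the text does not say which plotting system to use) and relies on the `.` placeholder of the `%>%` pipe to pass the piped data to the data argument. The variable name bp is illustrative only.

```r
library(dplyr)

# Draw one box per species for a single measurement.
# `data = .` tells the pipe where to place the piped data frame.
bp <- iris %>%
  select(Species, Sepal.Length) %>%
  boxplot(Sepal.Length ~ Species, data = ., ylab = "Sepal length (cm)")
```

Repeating this for the other three measurements and comparing how much the boxes overlap is one way to approach the question.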