13.12 group_by(): Applying functions across groups
- We can apply the functions above to subgroups within the dataset, for example:
  - Observations that have certain values on a variable
  - Individuals with different levels of education
  - Individuals from different countries
- dplyr lets you use the group_by() function to describe how to break a dataset down into groups of rows
- dplyr functions recognize when a data frame has been grouped with group_by() and then operate on each group separately
- Can be used for aggregating data
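The points above can be tried out without the survey file. The following is a minimal sketch on a made-up data frame; the object and column names (toy, edu, age) are hypothetical, and it assumes that mutate() was among the functions covered earlier.

```r
library(dplyr)

# Hypothetical toy data, not the ESS file used later in this section
toy <- data.frame(
  edu = c("low", "low", "high", "high", "high"),
  age = c(30, 40, 50, NA, 70),
  stringsAsFactors = FALSE
)

by_edu <- group_by(toy, edu)  # same rows, but now marked as grouped
class(by_edu)                 # includes "grouped_df"

# summarise() collapses the grouped data to one row per group
toy.agg <- summarise(by_edu,
                     n = n(),                          # observations per group
                     age.m = mean(age, na.rm = TRUE))  # group mean, NAs dropped
toy.agg

# mutate() keeps all rows and computes the mean within each group
toy.dev <- mutate(by_edu, age.dev = age - mean(age, na.rm = TRUE))
toy.dev
```

summarise() returns one row per group, while mutate() keeps the original number of rows; both respect the grouping set by group_by().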
13.12.1 Example: Applying dplyr functions across groups (aggregation)
# See http://cran.rstudio.com/web/packages/dplyr/vignettes/introduction.html for this example
library(foreign)  # provides read.dta()
library(dplyr)    # provides group_by(), summarise(), n()

essdata <- read.dta("./Material/ESS4e04_de.dta", convert.factors = FALSE)
View(essdata)
nrow(essdata)   # Check number of rows
names(essdata)  # edulvl measures education levels

by_edulvl <- group_by(essdata, edulvl_str)  # convert the data frame into a
                                            # grouped data frame and save in object
                                            # edulvl_str: character variable to aggregate by
by_edulvl         # we can see the group variable and the dimensions
class(by_edulvl)  # we can see the new class

essdata.agg <- summarise(by_edulvl,  # summarise collapses data frame
                         n = n(),    # Add variable with the number of observations in group
                         age.m = mean(age, na.rm = TRUE),                  # Variable containing mean
                         hheinkommen.m = mean(hheinkommen, na.rm = TRUE))  # Variable containing mean
View(essdata.agg)

# Base R alternative; mean() returns NA with a warning for non-numeric columns
old.way <- aggregate(essdata, by = list(essdata$edulvl), mean, na.rm = TRUE)
View(old.way)
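The aggregate() call above averages every column of the data frame. As a rough sketch of a narrower base R alternative, the formula interface of aggregate() can be restricted to the columns of interest. The data frame toy below is made up for illustration and only borrows the column names from the example; na.action = na.pass is used here so that rows with a missing value in one column still count for the other column.

```r
# Hypothetical toy data standing in for essdata
toy <- data.frame(
  edulvl = c(1, 1, 2, 2, 2),
  age = c(30, 40, 50, NA, 70),
  hheinkommen = c(1000, 2000, 3000, 4000, NA)
)

# Average only age and hheinkommen within each level of edulvl.
# na.pass keeps rows containing NAs; na.rm = TRUE then drops the NAs
# column by column inside mean().
old.way.toy <- aggregate(cbind(age, hheinkommen) ~ edulvl, data = toy,
                         FUN = mean, na.rm = TRUE, na.action = na.pass)
old.way.toy
```

Without na.action = na.pass, the formula interface would first drop every row that has a missing value in any of the listed variables, which can change the group means.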
13.12.2 Exercise: Applying dplyr functions across groups (aggregation)
- Execute the following code:
essdata <- read.dta("./Material/ESS4e04_de.dta", convert.factors = FALSE)
Adapt the file path to your setup.
- The variable religion_str contains the religious affiliation of respondents. Using functions from the dplyr package, aggregate the data set so that you obtain group averages of trustparties for the religious affiliations, as well as a variable with the number of observations in each group.