5.3 Barplot: Unsummarized vs. summarized data
geom_bar(): Expects unsummarised data (each observation contributes one unit to the height of its bar).
geom_bar(stat = "identity"): Tells geom_bar() not to aggregate/summarize the data; the bar heights are taken directly from the variable mapped to y.
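library(tidyverse)  # read_csv(), dplyr verbs, %>%, ggplot2
library(gridExtra)  # grid.arrange()

# data_twitter_influence.csv
data <- read_csv(sprintf("https://docs.google.com/uc?id=%s&export=download",
                         "1dLSTUJ5KA-BmAdS-CHmmxzqDFm2xVfv6"),
                 col_types = cols())

# Unsummarized data: geom_bar() counts the observations per party itself
p1 <- ggplot(data, aes(x = party)) +
  geom_bar() +
  theme(axis.text.x = element_text(angle = 45, hjust = 1))

# Or summarize the data first:
data_plot <- data %>% group_by(party) %>% summarize(n = n()) %>% ungroup()

# Summarized data: stat = "identity" uses n directly as the bar height
p2 <- ggplot(data_plot, aes(x = party, y = n)) +
  geom_bar(stat = "identity") +
  theme(axis.text.x = element_text(angle = 45, hjust = 1))

# Show both plots side by side
grid.arrange(p1, p2, ncol = 2)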

Figure 5.4: Two barplots: geom_bar() on unsummarized data (left) vs. geom_bar(stat = "identity") on summarized data (right)
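
The equivalence of the two approaches can be checked on a small example. The following is a minimal sketch, not part of the original analysis: the data frame toy and its party labels are made up for illustration, and geom_col() is used as ggplot2's shorthand for geom_bar(stat = "identity"). It compares the bar heights that ggplot2 computes for each plot.

```r
library(ggplot2)
library(dplyr)

# Hypothetical toy data (NOT the Twitter data set): one row per observation
toy <- data.frame(party = c("A", "A", "B", "C", "C", "C"))

# Unsummarised input: geom_bar() counts the rows per party itself
p_raw <- ggplot(toy, aes(x = party)) +
  geom_bar()

# Summarised input: count first, then draw the stored counts as bar heights
toy_counts <- toy %>% count(party)
p_col <- ggplot(toy_counts, aes(x = party, y = n)) +
  geom_col()  # shorthand for geom_bar(stat = "identity")

# Extract the bar heights ggplot2 computed for each plot and compare them
heights_raw <- as.numeric(layer_data(p_raw)$y)
heights_col <- as.numeric(layer_data(p_col)$y)
all.equal(heights_raw, heights_col)
```

If the comparison holds, both plots show the same bars, and the choice between them depends only on whether the data are already aggregated when they reach ggplot().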