## 6.4 Graph: Coefficient plots with facetting

• Figure 6.2 below provides a coefficient plot for subsets/facets of the data.
• Questions:
• What does the graph show? What are the underlying variables (and data)?
• How many scales/mappings does it use? Could we reduce them?
• What do you like, what do you dislike about the figure? What is good, what is bad?
• What kind of information could we add to the graph (if any)?
• How would you approach a replication of the graph?

Figure 6.2: Coefficient plot: Facetting

### 6.4.1 Lab: Data & code

Data preparation is somewhat more complicated when we want to show facets. We work with nested dataframes, i.e., dataframes in which one list-column holds the subsets of the data and further list-columns hold the model results for each subset.
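
To make the idea of a nested dataframe concrete, here is a minimal sketch. It uses the built-in `mtcars` data and a toy model (`mpg ~ wt`) purely for illustration; neither is part of this lab, and the column names `n_obs` and `fit` are our own choices.

```r
library(dplyr)
library(tidyr)
library(purrr)

# One row per value of cyl; the matching rows of mtcars are stored
# as a dataframe inside the list-column `data`.
nested <- mtcars %>%
  nest(data = -cyl) %>%
  mutate(n_obs = map_int(data, nrow),                  # size of each subset
         fit   = map(data, ~ lm(mpg ~ wt, data = .x))) # one model per subset

nested
```

Each row of `nested` now carries its own subset and its own fitted model, which is the structure the steps below rely on.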

We proceed in several steps:

1. We split the dataset into subsets according to `Examination_cat`, the quartile groups of `Examination` (see `nest(...)`).
2. We estimate the linear model in each of those subsets (see `map(data, ~ lm(...))`).
3. We tidy the estimates so as to obtain a nice dataframe (see `map(fit, tidy)`).
4. We compute 90% and 95% confidence intervals, again obtaining nice dataframes (see `map(fit, conf.level = 0.90, confint_tidy)`).
5. We rename the variables in the confidence-interval dataframes (see `rename_all(...)`).
6. We `unnest()` the data, obtaining one dataframe that contains the estimates and intervals across all subsets.
7. Finally, we filter out the intercepts from those estimations; the result is shown in the table below.

```r
library(tidyverse)   # dplyr, tidyr, purrr
library(broom)       # tidy(), confint_tidy()
library(knitr)       # kable()
library(kableExtra)  # kable_styling()

# Cut Examination into quartile groups. cut() excludes the lowest break by
# default, so observations at the minimum get NA and are filtered out below.
swiss <- swiss %>%
  mutate(Examination_cat = cut(Examination,
                               breaks = quantile(Examination, probs = seq(0, 1, 0.25)),
                               labels = c("lowest", "lower", "higher", "highest")))
# table(swiss$Examination, swiss$Examination_cat)

results <- swiss %>%
  filter(!is.na(Examination_cat)) %>%
  # one row per category; the subset is stored in the list-column `data`
  nest(data = c(Fertility, Agriculture, Examination, Education, Catholic, Infant.Mortality)) %>%
  mutate(fit = map(data, ~ lm(Fertility ~ Catholic + Agriculture + Education, data = .)),
         results = map(fit, tidy),
         results_90 = map(fit, conf.level = 0.90, confint_tidy), # confidence intervals
         results_95 = map(fit, conf.level = 0.95, confint_tidy)) %>%
  mutate(results_90 = map(results_90, ~ rename_all(., function(x) paste(x, "_90", sep = ""))), # renaming
         results_95 = map(results_95, ~ rename_all(., function(x) paste(x, "_95", sep = "")))) %>%
  unnest(c(results, results_90, results_95)) %>%
  rename(Variable = term,
         Coefficient = estimate,
         SE = std.error) %>%
  filter(Variable != "(Intercept)")  # drop the intercepts

# Table of estimates and intervals
results %>%
  select(-data, -fit, -statistic, -p.value, -SE) %>%
  kable("html") %>%
  kable_styling(font_size = 11)
```

| Examination_cat | Variable | Coefficient | conf.low_90 | conf.high_90 | conf.low_95 | conf.high_95 |
|---|---|---|---|---|---|---|
| lower | Catholic | 0.2533255 | 0.1748007 | 0.3318502 | 0.1564218 | 0.3502291 |
| lower | Agriculture | -0.3935888 | -0.6330915 | -0.1540862 | -0.6891476 | -0.0980301 |
| lower | Education | -1.6441445 | -2.1666075 | -1.1216815 | -2.2888912 | -0.9993978 |
| lowest | Catholic | 0.0880331 | -0.0011328 | 0.1771990 | -0.0225405 | 0.1986067 |
| lowest | Agriculture | -0.4003087 | -0.5752715 | -0.2253459 | -0.6172781 | -0.1833393 |
| lowest | Education | -1.5787587 | -2.9797256 | -0.1777918 | -3.3160816 | 0.1585642 |
| higher | Catholic | -0.3684091 | -0.7215743 | -0.0152440 | -0.8063652 | 0.0695469 |
| higher | Agriculture | -0.1691309 | -0.4439441 | 0.1056823 | -0.5099236 | 0.1716618 |
| higher | Education | -0.3300658 | -1.3009259 | 0.6407944 | -1.5340183 | 0.8738867 |
| highest | Catholic | -0.0508162 | -0.8867469 | 0.7851146 | -1.1395041 | 1.0378718 |
| highest | Agriculture | 0.2351947 | -0.3726104 | 0.8429998 | -0.5563901 | 1.0267795 |
| highest | Education | -0.5220767 | -1.1850961 | 0.1409428 | -1.3855708 | 0.3414175 |
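
As far as we are aware, `confint_tidy()` is deprecated in newer releases of broom, so the interval steps may produce warnings there. The following is a sketch of one possible alternative, assuming that `tidy()` for `lm` objects accepts `conf.int = TRUE` and `conf.level`. The helper `tidy_with_ci()` and the object names `swiss_cat` and `results_alt` are hypothetical names introduced here, not part of the original lab.

```r
library(dplyr)
library(tidyr)
library(purrr)
library(broom)

# Same quartile grouping as in the lab code; the minimum of Examination
# falls outside the first interval, becomes NA, and is dropped.
swiss_cat <- datasets::swiss %>%
  mutate(Examination_cat = cut(Examination,
                               breaks = quantile(Examination, probs = seq(0, 1, 0.25)),
                               labels = c("lowest", "lower", "higher", "highest"))) %>%
  filter(!is.na(Examination_cat))

# Hypothetical helper: estimates plus 90% and 95% intervals for one model,
# already carrying the column names used in the plot.
tidy_with_ci <- function(fit) {
  ci90 <- tidy(fit, conf.int = TRUE, conf.level = 0.90)
  ci95 <- tidy(fit, conf.int = TRUE, conf.level = 0.95)
  tibble(Variable     = ci90$term,
         Coefficient  = ci90$estimate,
         conf.low_90  = ci90$conf.low,
         conf.high_90 = ci90$conf.high,
         conf.low_95  = ci95$conf.low,
         conf.high_95 = ci95$conf.high)
}

results_alt <- swiss_cat %>%
  nest(data = -Examination_cat) %>%
  mutate(fit   = map(data, ~ lm(Fertility ~ Catholic + Agriculture + Education, data = .x)),
         coefs = map(fit, tidy_with_ci)) %>%
  select(Examination_cat, coefs) %>%
  unnest(coefs) %>%
  filter(Variable != "(Intercept)")
```

Because the helper returns the final column names directly, the separate renaming step is no longer needed in this variant.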

Plotting the data is straightforward again. We use the same code as above but now specify the facetting with facet_grid(Examination_cat ~ .).

```r
library(ggplot2)

# Coefficient plot with one panel (row) per Examination category
ggplot(results, aes(x = Variable, y = Coefficient)) +
  geom_hline(yintercept = 0, colour = gray(1/2), lty = 2) +  # dashed reference line at zero
  geom_point() +
  geom_linerange(aes(ymin = conf.low_90,
                     ymax = conf.high_90),
                 lwd = 1) +                                  # thick line: 90% interval
  geom_linerange(aes(ymin = conf.low_95,
                     ymax = conf.high_95),
                 lwd = 1/2) +                                # thin line: 95% interval
  ggtitle("Outcome: Fertility (Subsets: Examination)") +
  coord_flip() +
  facet_grid(Examination_cat ~ .)
```
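
One possible answer to the question of what we could change: facet by variable instead of by subset, so that the same coefficient can be compared across the Examination categories within a single panel. The sketch below is only an illustration under a few assumptions. It re-enters the 95% intervals from the table above (rounded to three decimals) so that it runs on its own, it uses the hypothetical object names `coefs` and `p`, and it relies on horizontal `geom_pointrange()`, which we assume requires ggplot2 3.3.0 or later.

```r
library(ggplot2)
library(tibble)

# Estimates and 95% intervals copied from the table above, rounded to 3 decimals
coefs <- tribble(
  ~Examination_cat, ~Variable,     ~Coefficient, ~conf.low_95, ~conf.high_95,
  "lowest",         "Catholic",     0.088,       -0.023,        0.199,
  "lowest",         "Agriculture", -0.400,       -0.617,       -0.183,
  "lowest",         "Education",   -1.579,       -3.316,        0.159,
  "lower",          "Catholic",     0.253,        0.156,        0.350,
  "lower",          "Agriculture", -0.394,       -0.689,       -0.098,
  "lower",          "Education",   -1.644,       -2.289,       -0.999,
  "higher",         "Catholic",    -0.368,       -0.806,        0.070,
  "higher",         "Agriculture", -0.169,       -0.510,        0.172,
  "higher",         "Education",   -0.330,       -1.534,        0.874,
  "highest",        "Catholic",    -0.051,       -1.140,        1.038,
  "highest",        "Agriculture",  0.235,       -0.556,        1.027,
  "highest",        "Education",   -0.522,       -1.386,        0.341
)

# Keep the categories in their natural order on the axis
coefs$Examination_cat <- factor(coefs$Examination_cat,
                                levels = c("lowest", "lower", "higher", "highest"))

# One panel per variable; each panel gets its own coefficient axis
p <- ggplot(coefs, aes(x = Coefficient, y = Examination_cat)) +
  geom_vline(xintercept = 0, colour = gray(1/2), linetype = "dashed") +
  geom_pointrange(aes(xmin = conf.low_95, xmax = conf.high_95)) +
  facet_wrap(~ Variable, scales = "free_x") +
  labs(title = "Outcome: Fertility (Subsets: Examination)",
       y = "Examination category")

p
```

Whether this layout is preferable depends on the comparison of interest: the original figure compares variables within a subset, while this variant compares subsets within a variable.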