10.2 An example from the sex disrimination in the workplace study
Here we load a couple necessary packages, load the data, and take a glimpse()
.
library(tidyverse)
protest <- read_csv("data/protest/protest.csv")
glimpse(protest)
## Observations: 129
## Variables: 6
## $ subnum <int> 209, 44, 124, 232, 30, 140, 27, 64, 67, 182, 85, 109, 122, 69, 45, 28, 170, 66...
## $ protest <int> 2, 0, 2, 2, 2, 1, 2, 0, 0, 0, 2, 2, 0, 1, 1, 0, 1, 2, 2, 1, 2, 1, 1, 2, 2, 0, ...
## $ sexism <dbl> 4.87, 4.25, 5.00, 5.50, 5.62, 5.75, 5.12, 6.62, 5.75, 4.62, 4.75, 6.12, 4.87, ...
## $ angry <int> 2, 1, 3, 1, 1, 1, 2, 1, 6, 1, 2, 5, 2, 1, 1, 1, 2, 1, 3, 4, 1, 1, 1, 5, 1, 5, ...
## $ liking <dbl> 4.83, 4.50, 5.50, 5.66, 6.16, 6.00, 4.66, 6.50, 1.00, 6.83, 5.00, 5.66, 5.83, ...
## $ respappr <dbl> 4.25, 5.75, 4.75, 7.00, 6.75, 5.50, 5.00, 6.25, 3.00, 5.75, 5.25, 7.00, 4.50, ...
With a little ifelse()
, computing the dummies D1
and D2
is easy enough.
protest <-
protest %>%
mutate(D1 = ifelse(protest == 1, 1, 0),
D2 = ifelse(protest == 2, 1, 0))
Load brms.
library(brms)
With model1
and model2
we fit the multicategorical multivariable model and the multicategorical moderation models, respectively.
model1 <-
brm(data = protest, family = gaussian,
liking ~ 1 + D1 + D2 + sexism,
chains = 4, cores = 4)
model2 <-
update(model1,
newdata = protest,
liking ~ 1 + D1 + D2 + sexism + D1:sexism + D2:sexism,
chains = 4, cores = 4)
Behold the \(R^2\)s.
r2s <-
bayes_R2(model1, summary = F) %>%
as_tibble() %>%
rename(`Model 1` = R2) %>%
bind_cols(
bayes_R2(model2, summary = F) %>%
as_tibble() %>%
rename(`Model 2` = R2)
) %>%
mutate(`The R2 difference` = `Model 2` - `Model 1`)
r2s %>%
gather() %>%
# This line isn't necessary, but it sets the order the summaries appear in
mutate(key = factor(key, levels = c("Model 1", "Model 2", "The R2 difference"))) %>%
group_by(key) %>%
summarize(mean = mean(value),
median = median(value),
ll = quantile(value, probs = .025),
ul = quantile(value, probs = .975)) %>%
mutate_if(is.double, round, digits = 3)
## # A tibble: 3 x 5
## key mean median ll ul
## <fct> <dbl> <dbl> <dbl> <dbl>
## 1 Model 1 0.071 0.067 0.012 0.156
## 2 Model 2 0.155 0.154 0.064 0.256
## 3 The R2 difference 0.084 0.084 -0.039 0.202
Interestingly, even though our posterior means and medians for the model-specific \(R^2\) values differed some from the OLS estimates in the text, their difference corresponded quite nicely to the one in the text. Let’s take a look at their distributions.
r2s %>%
gather() %>%
ggplot(aes(x = value)) +
geom_density(size = 0, fill = "grey33") +
scale_y_continuous(NULL, breaks = NULL) +
facet_wrap(~key, scales = "free_y") +
theme_minimal()
The model coefficient summaries cohere well with those in Table 10.1.
print(model1, digits = 3)
## Family: gaussian
## Links: mu = identity; sigma = identity
## Formula: liking ~ 1 + D1 + D2 + sexism
## Data: protest (Number of observations: 129)
## Samples: 4 chains, each with iter = 2000; warmup = 1000; thin = 1;
## total post-warmup samples = 4000
##
## Population-Level Effects:
## Estimate Est.Error l-95% CI u-95% CI Eff.Sample Rhat
## Intercept 4.749 0.619 3.557 5.962 4000 1.000
## D1 0.497 0.233 0.036 0.958 4000 1.000
## D2 0.448 0.227 0.003 0.891 4000 1.000
## sexism 0.110 0.118 -0.118 0.340 4000 1.000
##
## Family Specific Parameters:
## Estimate Est.Error l-95% CI u-95% CI Eff.Sample Rhat
## sigma 1.044 0.067 0.918 1.184 4000 0.999
##
## Samples were drawn using sampling(NUTS). For each parameter, Eff.Sample
## is a crude measure of effective sample size, and Rhat is the potential
## scale reduction factor on split chains (at convergence, Rhat = 1).
print(model2, digits = 3)
## Family: gaussian
## Links: mu = identity; sigma = identity
## Formula: liking ~ D1 + D2 + sexism + D1:sexism + D2:sexism
## Data: protest (Number of observations: 129)
## Samples: 4 chains, each with iter = 2000; warmup = 1000; thin = 1;
## total post-warmup samples = 4000
##
## Population-Level Effects:
## Estimate Est.Error l-95% CI u-95% CI Eff.Sample Rhat
## Intercept 7.709 1.055 5.721 9.828 1502 1.001
## D1 -4.110 1.522 -7.057 -1.093 1551 1.000
## D2 -3.498 1.429 -6.246 -0.722 1415 1.001
## sexism -0.473 0.206 -0.889 -0.077 1485 1.001
## D1:sexism 0.897 0.291 0.319 1.462 1524 1.001
## D2:sexism 0.778 0.279 0.239 1.318 1406 1.001
##
## Family Specific Parameters:
## Estimate Est.Error l-95% CI u-95% CI Eff.Sample Rhat
## sigma 1.006 0.064 0.890 1.143 2865 1.000
##
## Samples were drawn using sampling(NUTS). For each parameter, Eff.Sample
## is a crude measure of effective sample size, and Rhat is the potential
## scale reduction factor on split chains (at convergence, Rhat = 1).