7.6 Artificial categorization and subgroups

There are multiple ways to dichotomize the data by skeptic. A quick simple way is to use ifelse() to make a skeptic_hi dummy.

disaster <-
  disaster %>% 
  mutate(skeptic_hi = ifelse(skeptic >= mean(skeptic), 1, 0))

With our dummy in hand, we’re ready to fit the two models.

model6_low <-
  brm(data = disaster %>% filter(skeptic_hi == 0), 
      family = gaussian,
      justify ~ 1 + frame,
      chains = 4, cores = 4)

model6_high <-
  update(model6_low, newdata = disaster %>% filter(skeptic_hi == 1),
         chains = 4, cores = 4)

The coefficient summaries:

fixef(model6_low) %>% round(digits = 3)
##           Estimate Est.Error   Q2.5 Q97.5
## Intercept    2.626     0.099  2.431 2.824
## frame       -0.108     0.145 -0.390 0.173
fixef(model6_high) %>% round(digits = 3)
##           Estimate Est.Error  Q2.5 Q97.5
## Intercept    3.070     0.144 2.786 3.348
## frame        0.477     0.209 0.071 0.891

You can use fitted() to get the posterior means and other summaries for the two frame groups, by model.

fitted(model6_low, 
       newdata = tibble(frame = 0:1)) %>% 
  round(digits = 3)
##      Estimate Est.Error  Q2.5 Q97.5
## [1,]    2.626     0.099 2.431 2.824
## [2,]    2.518     0.105 2.309 2.724
fitted(model6_high, 
       newdata = tibble(frame = 0:1)) %>% 
  round(digits = 3)
##      Estimate Est.Error  Q2.5 Q97.5
## [1,]    3.070     0.144 2.786 3.348
## [2,]    3.548     0.149 3.256 3.836

Do note that though brms ‘Est.Error’ is the posterior \(SD\) for the coefficient, it is not the same thing as descriptive statistic \(SD\) of a subset of the data. Thus, although our ‘Estimates’ correspond nicely to the mean values Hayes reported in the middle of page 264, his \(SD\)s will not match up with our ‘Est.Error’ values, and nor should they.

Anyway, our results don’t yield \(t\)-tests. But you don’t need those anyway. We’re working within the regression paradigm! But if you’re really interested in the sub-model-implied differences between the two levels of frame by skeptic_hi subgroup, all you need is the frame coefficient of model6_low and model6_high. Here we’ll use bind_rows() to combine their posterior samples and then plot.

posterior_samples(model6_low) %>% 
  select(b_frame) %>% 
  bind_rows(
    posterior_samples(model6_high) %>%
      select(b_frame)
  ) %>% 
  mutate(model = rep(c("model6_low", "model6_high"), each = 4000)) %>% 
    
  ggplot(aes(x = b_frame, fill = model)) +
  geom_density(size = 0, alpha = .8) +
  scale_fill_manual(NULL, 
                    values = dutchmasters$little_street[c(1, 9)] %>% as.character()) +
  scale_y_continuous(NULL, breaks = NULL) +
  labs(subtitle = "The difference score distributions between frame levels,\ncolor coded by mean-split skeptic",
       x = NULL) +
  theme_07

As within the frequentist paradigm, please don’t mean split as a Bayesian. When possible, use all available data and use the regression formula to model theoretically-meaningful variables in your analyses.