# 9 Some Myths and Additional Extensions of Moderation Analysis

In this chapter, our Bayesian approach complicates some of Hayes’s flow. We’ll see how Bayesian HMC estimation can make us reconsider the value in mean centering and we’ll also slip in some missing data talk.

## 9.1 Truths and myths about mean-centering

Mean-centering has been recommended in a few highly regarded books on regression analysis , and several explanations have been offered for why mean-centering should be undertaken prior to computation of the product and model estimation. The explanation that seems to have resulted in the most misunderstanding is that $$X$$ and $$W$$ are likely to be highly correlated with $$XW$$ and this will produce estimation problems caused by collinearity and result in poor or “strange” estimates of regression coefficients, large standard errors, and reduced power of the statistical test of the interaction. But this is, in large part, simply a myth. As I describe later, there are some reasons that mean- centering the focal antecedent or moderator variables can be a beneficial thing to do, which is why it has been recommended by some. However, it is incorrect to claim that it is necessary, that a failure to do so will lead one to incorrect inferences about moderation, or that the resulting regression coefficients are somehow strange or inherently uninterpretable. (Andrew F. Hayes, 2018, pp. 304–305, emphasis in the original)

Here we load a couple necessary packages, load the data, and take a glimpse().

library(tidyverse)

glimpse(glbwarm)
## Rows: 815
## Columns: 7
## $govact <dbl> 3.6, 5.0, 6.6, 1.0, 4.0, 7.0, 6.8, 5.6, 6.0, 2.6, 1.4, 5.6, 7.0, 3.8, 3.4, 4.2, 1.0, 2… ##$ posemot  <dbl> 3.67, 2.00, 2.33, 5.00, 2.33, 1.00, 2.33, 4.00, 5.00, 5.00, 1.00, 4.00, 1.00, 5.67, 3.…
## $negemot <dbl> 4.67, 2.33, 3.67, 5.00, 1.67, 6.00, 4.00, 5.33, 6.00, 2.00, 1.00, 4.00, 5.00, 4.67, 2.… ##$ ideology <dbl> 6, 2, 1, 1, 4, 3, 4, 5, 4, 7, 6, 4, 2, 4, 5, 2, 6, 4, 2, 4, 4, 2, 6, 4, 4, 3, 4, 5, 4,…
## $age <dbl> 61, 55, 85, 59, 22, 34, 47, 65, 50, 60, 71, 60, 71, 59, 32, 36, 69, 70, 41, 48, 38, 63… ##$ sex      <dbl> 0, 0, 1, 0, 1, 0, 1, 1, 1, 1, 1, 0, 1, 0, 1, 1, 1, 0, 0, 0, 0, 1, 1, 1, 1, 1, 1, 0, 0,…
## partyid <dbl> 2, 1, 1, 1, 1, 2, 1, 1, 2, 3, 2, 1, 1, 1, 1, 1, 2, 3, 1, 3, 2, 1, 3, 2, 1, 1, 1, 3, 1,… Before we fit our models, we’ll go ahead and make our mean-centered predictors, negemot_c and age_c. glbwarm <- glbwarm %>% mutate(negemot_c = negemot - mean(negemot), age_c = age - mean(age)) Now we’re ready to fit Models 1 and 2. But before we do, it’s worth repeating part of the text: Let’s load brms. library(brms) As we’ll see in just a bit, there are some important reasons for Bayesians using HMC to mean center that wouldn’t pop up within the OLS paradigm. First let’s fit model9.1 and model9.2. model9.1 follows the conventional moderation equation $Y = i_Y + b_1 X + b_2 W + b_3 XW + e_Y.$ model9.2 is our mean-centered model, which we can express formally as \begin{align*} Y & = i_Y + b_1 \left (X - \overline X \right ) + b_2 \left (W - \overline W \right ) + b_3 \left (X - \overline X \right ) \left (W - \overline W \right ) + e_Y, \; \text{or more simply} \\ & = i_Y + b_1 X' + b_2 W' + b_3 X'W' + e_Y, \end{align*} where $$X' = \left (X - \overline X \right)$$ and so on. model9.1 <- brm(data = glbwarm, family = gaussian, govact ~ 1 + negemot + age + negemot:age, cores = 4, file = "fits/model09.01") model9.2 <- update(model9.1, newdata = glbwarm, govact ~ 1 + negemot_c + age_c + negemot_c:age_c, cores = 4, file = "fits/model09.02") As with Hayes’s OLS models, our HMC models yield the same Bayesian $$R^2$$ distributions, within simulation error. bayes_R2(model9.1) %>% round(digits = 3) ## Estimate Est.Error Q2.5 Q97.5 ## R2 0.354 0.022 0.31 0.395 bayes_R2(model9.2) %>% round(digits = 3) ## Estimate Est.Error Q2.5 Q97.5 ## R2 0.354 0.022 0.31 0.395 Our model summaries also correspond nicely with those at the top of Table 9.1. print(model9.1, digits = 3) ## Family: gaussian ## Links: mu = identity; sigma = identity ## Formula: govact ~ 1 + negemot + age + negemot:age ## Data: glbwarm (Number of observations: 815) ## Samples: 4 chains, each with iter = 2000; warmup = 1000; thin = 1; ## total post-warmup samples = 4000 ## ## Population-Level Effects: ## Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS ## Intercept 4.330 0.331 3.681 4.979 1.001 1587 1803 ## negemot 0.149 0.085 -0.014 0.321 1.001 1614 1942 ## age -0.031 0.006 -0.042 -0.019 1.001 1569 1951 ## negemot:age 0.007 0.002 0.004 0.010 1.001 1587 1935 ## ## Family Specific Parameters: ## Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS ## sigma 1.097 0.026 1.046 1.148 1.004 2537 2319 ## ## Samples were drawn using sampling(NUTS). For each parameter, Bulk_ESS ## and Tail_ESS are effective sample size measures, and Rhat is the potential ## scale reduction factor on split chains (at convergence, Rhat = 1). print(model9.2, digits = 3) ## Family: gaussian ## Links: mu = identity; sigma = identity ## Formula: govact ~ negemot_c + age_c + negemot_c:age_c ## Data: glbwarm (Number of observations: 815) ## Samples: 4 chains, each with iter = 2000; warmup = 1000; thin = 1; ## total post-warmup samples = 4000 ## ## Population-Level Effects: ## Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS ## Intercept 4.597 0.040 4.517 4.676 1.002 3364 2390 ## negemot_c 0.501 0.025 0.451 0.552 1.002 3683 2546 ## age_c -0.005 0.002 -0.010 -0.001 1.002 5159 3161 ## negemot_c:age_c 0.007 0.002 0.004 0.010 1.000 5528 3166 ## ## Family Specific Parameters: ## Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS ## sigma 1.098 0.028 1.045 1.155 1.000 3397 2642 ## ## Samples were drawn using sampling(NUTS). For each parameter, Bulk_ESS ## and Tail_ESS are effective sample size measures, and Rhat is the potential ## scale reduction factor on split chains (at convergence, Rhat = 1). However, notice the ‘Bulk_ESS’ and ‘Tail_ESS’ columns. The values for model9.2 were substantially larger than those for model9.1. These columns denote the number of effective samples (a.k.a. the effective sample size). Versions of brms before 2.10.0 returned a single effective sample size (ESS) value per parameter. We will cover where the change came from in a bit. But first, recall that we’ve been using brms defaults, which results in 4 HMC chains, each of which contains 2,000 draws (iterations), the first 1,000 of which are warmup values. After we discard the warmup values, that leaves 1,000 draws from each chain–4,000 total. As it turns out, Markov chains, and thus HMC chains, are typically autocorrelated, which means that each draw is partially dependent on the previous draw. Ideally, the autocorrelations are near zero. That’s often not the case. The bayesplot package offers a variety of diagnostic plots. Here we’ll use the mcmc_acf() function to make autocorrelation plots for all model parameters. Note that when we add add_chain = T to brms::posterior_samples(), we add an index to the data that allows us to keep track of which iteration comes from which chain. That index will come in handy for our mcmc_acf() plots. But before we get there, we’ll be using an xkcd-inspired theme with help from the xkcd package for our plots in this chapter. # install.packages("xkcd", dependencies = T) library(xkcd) If you haven’t used the xkcd package, before, you might also need to take a few extra steps outlined in the paper by Torres-Manzanera (n.d.), xkcd: Plotting XKCD graphs, part of which requires help from the extrafont package . library(extrafont) download.file("http://simonsoftware.se/other/xkcd.ttf", dest = "xkcd.ttf", mode = "wb") system("mkdir ~/.fonts") system("cp xkcd.ttf ~/.fonts") # this line of code returned an error message # font_import(pattern = "[X/x]kcd", prompt = FALSE) # this line from (https://stackoverflow.com/questions/49221040/error-in-font-import-while-installing-xkcd-font) fixed the problem font_import(path = "~/.fonts", pattern = "[X/x]kcd", prompt=FALSE) fonts() fonttable() if(.PlatformOS.type != "unix") {
## Register fonts for Windows bitmap output
} else {
}

After installing, I still experienced error messages, which were alleviated after I followed these steps outlined by Remi.b. You may or may not need them.

But anyways, here are our mcmc_acf() plots.

library(bayesplot)

post1 <- posterior_samples(model9.1, add_chain = T)
mcmc_acf(post1,
pars = c("b_Intercept", "b_negemot", "b_age", "b_negemot:age", "sigma"),
lags = 4) +
theme_xkcd()

post2 <- posterior_samples(model9.2, add_chain = T)
mcmc_acf(post2,
pars = c("b_Intercept", "b_negemot_c", "b_age_c", "b_negemot_c:age_c", "sigma"),
lags = 4) +
theme_xkcd() 

As it turns out, theme_xkcd() can’t handle special characters like "_", so it returns rectangles instead. So it goes…

But again, high autocorrelations in the HMC chains have consequences for the effective sample size. In their (2020) vignette, Visual MCMC diagnostics using the bayesplot package, Gabry and Modrák wrote:

The effective sample size is an estimate of the number of independent draws from the posterior distribution of the estimand of interest. Because the draws within a Markov chain are not independent if there is autocorrelation, the effective sample size, $$n_{eff}$$, will be smaller than the total sample size, $$N$$. The larger the ratio of $$n_{eff}$$ to $$N$$ the better.

In that quote, they spoke as if there was only one measure of ESS. Though this has been the case for some time, times have changed. In a (2019) paper, Stan-team allstars Vehtari, Gelman, Simpson, Carpenter, and Bürkner proposed two measures of ESS: “bulk-ESS” and “tail-ESS.” From their paper, we read:

if you plan to report quantile estimates or posterior intervals, we strongly suggest assessing the convergence of the chains for these quantiles. In Section 4.3 we show that convergence of Markov chains is not uniform across the parameter space and propose diagnostics and effective sample sizes specifically for extreme quantiles. This is different from the standard ESS estimate (which we refer to as the “bulk-ESS”), which mainly assesses how well the centre of the distribution is resolved. Instead, these “tail-ESS” measures allow the user to estimate the MCSE for interval estimates. (p. 5, emphasis in the original)

You can read the paper for technical details. In short, the Bulk_ESS in the output of brms 2.10.0 and up is what was previously referred to as Eff.Sample, the effective sample size. This indexed the number of effective samples in “the centre of the” posterior distribution (i.e., the posterior mean or median). But since we also care about uncertainty in our models, we also care about how well we have characterized the tails of the posterior distribution in 95% intervals and such. The new Tail_ESS in brms output allows us to gauge the effective sample size for those intervals.

The Bulk_ESS and Tail_ESS values were all well above 2,000 with model9.2 and the autocorrelations were very low, too. model9.1 had higher autocorrelations and lower ESS values. The upshot is that even though we have 4,000 samples for each parameter, those samples don’t necessarily give us the same quality of information fully independent samples would. Bulk_ESS and Tail_ESS values help you determine how concerned you should be. And, as it turns out, things like centering can help increase a models Bulk_ESS and Tail_ESS values.

Wading in further, we can use the neff_ratio() function to collect the $$n_{eff}$$ to $$N$$ ratio for each model parameter and then use mcmc_neff() to make a visual diagnostic. Here we do so for model9.1 and model9.2.

ratios_model9.1 <-
neff_ratio(model9.1,
pars = c("b_Intercept", "b_negemot", "b_age", "b_negemot:age", "sigma"))

ratios_model9.2 <-
neff_ratio(model9.2,
pars = c("b_Intercept", "b_negemot_c", "b_age_c", "b_negemot_c:age_c", "sigma"))

mcmc_neff(ratios_model9.1) +
yaxis_text(hjust = 0) +
theme_xkcd()

mcmc_neff(ratios_model9.2) +
yaxis_text(hjust = 0) +
theme_xkcd()

Although none of the $$n_{eff}$$ to $$N$$ ratios were in the shockingly-low range for either model, there were substantially higher for model9.2.

In addition to autocorrelations and $$n_{eff}$$ to $$N$$ ratios, there is also the issue that the parameters in the model can themselves be correlated. If you like a visual approach, you can use brms::pairs() to retrieve histograms for each parameter along with scatter plots showing the shape of their correlations. Here we’ll use the off_diag_args argument to customize some of the plot settings.

pairs(model9.1,
off_diag_args = list(size = 1/10,
alpha = 1/5))

pairs(model9.2,
off_diag_args = list(size = 1/10,
alpha = 1/5))

When fitting models with HMC, centering can make a difference for the parameter correlations. If you prefer a more numeric approach, vcov() will yield the variance/covariance matrix–or correlation matrix when using correlation = T–for the parameters in a model.

vcov(model9.1, correlation = T) %>% round(digits = 2)
##             Intercept negemot   age negemot:age
## Intercept        1.00   -0.93 -0.96        0.89
## negemot         -0.93    1.00  0.89       -0.96
## age             -0.96    0.89  1.00       -0.93
## negemot:age      0.89   -0.96 -0.93        1.00
vcov(model9.2, correlation = T) %>% round(digits = 2)
##                 Intercept negemot_c age_c negemot_c:age_c
## Intercept            1.00      0.02  0.02            0.05
## negemot_c            0.02      1.00  0.04           -0.10
## age_c                0.02      0.04  1.00            0.02
## negemot_c:age_c      0.05     -0.10  0.02            1.00

And so wait, what does that even mean for a parameter to correlate with another parameter? you might ask. Fair enough. Let’s compute a parameter correlation step by step. The first step requires posterior_samples().

post <- posterior_samples(model9.1)

head(post)
##   b_Intercept b_negemot       b_age b_negemot:age    sigma      lp__
## 1    4.204010 0.1395087 -0.03271575   0.008387642 1.126351 -1236.248
## 2    4.022693 0.2058020 -0.02801013   0.007128736 1.119916 -1235.708
## 3    4.344482 0.1919790 -0.02670543   0.005164702 1.108933 -1236.080
## 4    4.083391 0.2073658 -0.02436273   0.005575253 1.102304 -1232.871
## 5    3.934765 0.2173567 -0.02365274   0.006039011 1.111398 -1233.408
## 6    4.225579 0.1693370 -0.02878213   0.006922315 1.118568 -1232.815

Now we’ve put our posterior iterations into a data object, post, we can make a scatter plot of two parameters. Here we’ll choose b_negemot and the interaction coefficient, b_negemot:age.

post %>%
ggplot(aes(x = b_negemot, y = b_negemot:age)) +
geom_point(size = 1/10, alpha = 1/5) +
labs(subtitle = "Each dot is the parameter pair\nfrom a single iteration. Looking\nacross the 4,000 total posterior\niterations, it becomes clear the\ntwo parameters are highly\nnegatively correlated.") +
theme_xkcd()

And indeed, the Pearson’s correlation coefficient is strong.

cor(post$b_negemot, post$b_negemot:age)
## [1] -0.9561894

And what was that part from the vcov() output, again?

vcov(model9.1, correlation = T)["negemot", "negemot:age"]
## [1] -0.9561894

Boom! That’s where those correlations come from.

This entire topic of HMC diagnostics can seem baffling, especially when compared to the simplicity of OLS. If this is your first introduction, you might want to watch lectures 10 and 11 from McElreath’s Statistical Rethinking Fall 2017 lecture series. Accordingly, you might check out Chapter 8 of his (2015) text, Chapter 9 of his (2020) text, and either of my ebooks translating his work into brms and tidyverse code .

### 9.1.1 The effect of mean-centering on multicollinearity and the standard error of $$b_3$$.

This can be difficult to keep track of, but what we just looked at were the correlations among model parameters. These are not the same as correlations among variables. As such, those correlations are not the same as those in Table 9.2. But we can get those, too. First we’ll have to do a little more data processing to get all the necessary mean-centered variables and standardized variables.

glbwarm <-
glbwarm %>%
mutate(negemot_x_age     = negemot   * age,
negemot_c_x_age_c = negemot_c * age_c,
negemot_z         = (negemot - mean(negemot)) / sd(negemot),
age_z             = (age     - mean(age)    ) / sd(age)) %>%
mutate(negemot_z_x_age_z = negemot_z * age_z)

And recall that to get our sweet Bayesian correlations, we use the multivariate mvbind() syntax to fit an intercepts-only model. Here we do that for all three of the Table 9.2 sections.

model9.3 <-
brm(data = glbwarm,
family = gaussian,
mvbind(negemot, age, negemot_x_age) ~ 1,
cores = 4,
file = "fits/model09.03")

model9.4 <-
brm(data = glbwarm,
family = gaussian,
mvbind(negemot_c, age_c, negemot_c_x_age_c) ~ 1,
cores = 4,
file = "fits/model09.04")

model9.5 <-
brm(data = glbwarm,
family = gaussian,
mvbind(negemot_z, age_z, negemot_z_x_age_z) ~ 1,
cores = 4,
file = "fits/model09.05")

Here are their summaries.

print(model9.3, digits = 3)
##  Family: MV(gaussian, gaussian, gaussian)
##   Links: mu = identity; sigma = identity
##          mu = identity; sigma = identity
##          mu = identity; sigma = identity
## Formula: negemot ~ 1
##          age ~ 1
##          negemot_x_age ~ 1
##    Data: glbwarm (Number of observations: 815)
## Samples: 4 chains, each with iter = 2000; warmup = 1000; thin = 1;
##          total post-warmup samples = 4000
##
## Population-Level Effects:
##                       Estimate Est.Error l-95% CI u-95% CI  Rhat Bulk_ESS Tail_ESS
## negemot_Intercept        3.559     0.052    3.455    3.663 1.000     3469     2851
## age_Intercept           49.537     0.560   48.434   50.624 1.001     3714     2754
## negemotxage_Intercept  174.886     3.339  168.262  181.347 1.000     3182     2753
##
## Family Specific Parameters:
##                   Estimate Est.Error l-95% CI u-95% CI  Rhat Bulk_ESS Tail_ESS
## sigma_negemot        1.530     0.038    1.457    1.605 1.002     3370     2706
## sigma_age           16.356     0.392   15.603   17.143 1.000     3138     2980
## sigma_negemotxage   97.456     2.411   92.871  102.218 1.001     2711     2347
##
## Residual Correlations:
##                             Estimate Est.Error l-95% CI u-95% CI  Rhat Bulk_ESS Tail_ESS
## rescor(negemot,age)           -0.059     0.035   -0.125    0.009 1.001     3118     2775
## rescor(negemot,negemotxage)    0.765     0.014    0.737    0.792 1.001     2726     2673
## rescor(age,negemotxage)        0.547     0.024    0.499    0.593 1.001     3496     2692
##
## Samples were drawn using sampling(NUTS). For each parameter, Bulk_ESS
## and Tail_ESS are effective sample size measures, and Rhat is the potential
## scale reduction factor on split chains (at convergence, Rhat = 1).
print(model9.4, digits = 3)
##  Family: MV(gaussian, gaussian, gaussian)
##   Links: mu = identity; sigma = identity
##          mu = identity; sigma = identity
##          mu = identity; sigma = identity
## Formula: negemot_c ~ 1
##          age_c ~ 1
##          negemot_c_x_age_c ~ 1
##    Data: glbwarm (Number of observations: 815)
## Samples: 4 chains, each with iter = 2000; warmup = 1000; thin = 1;
##          total post-warmup samples = 4000
##
## Population-Level Effects:
##                         Estimate Est.Error l-95% CI u-95% CI  Rhat Bulk_ESS Tail_ESS
## negemotc_Intercept         0.000     0.054   -0.106    0.105 1.001     6636     2853
## agec_Intercept            -0.004     0.578   -1.156    1.160 1.003     7940     2825
## negemotcxagec_Intercept   -1.432     0.850   -3.047    0.230 1.001     7685     3111
##
## Family Specific Parameters:
##                     Estimate Est.Error l-95% CI u-95% CI  Rhat Bulk_ESS Tail_ESS
## sigma_negemotc         1.532     0.038    1.460    1.609 1.001     8012     3014
## sigma_agec            16.368     0.393   15.632   17.156 1.002     6283     3137
## sigma_negemotcxagec   24.244     0.598   23.113   25.426 1.003     7169     2776
##
## Residual Correlations:
##                                Estimate Est.Error l-95% CI u-95% CI  Rhat Bulk_ESS Tail_ESS
## rescor(negemotc,agec)            -0.056     0.035   -0.123    0.011 1.000     6991     2919
## rescor(negemotc,negemotcxagec)    0.092     0.035    0.025    0.159 1.000     8046     3094
## rescor(agec,negemotcxagec)       -0.015     0.034   -0.082    0.052 1.000     7731     2990
##
## Samples were drawn using sampling(NUTS). For each parameter, Bulk_ESS
## and Tail_ESS are effective sample size measures, and Rhat is the potential
## scale reduction factor on split chains (at convergence, Rhat = 1).
print(model9.5, digits = 3)
##  Family: MV(gaussian, gaussian, gaussian)
##   Links: mu = identity; sigma = identity
##          mu = identity; sigma = identity
##          mu = identity; sigma = identity
## Formula: negemot_z ~ 1
##          age_z ~ 1
##          negemot_z_x_age_z ~ 1
##    Data: glbwarm (Number of observations: 815)
## Samples: 4 chains, each with iter = 2000; warmup = 1000; thin = 1;
##          total post-warmup samples = 4000
##
## Population-Level Effects:
##                         Estimate Est.Error l-95% CI u-95% CI  Rhat Bulk_ESS Tail_ESS
## negemotz_Intercept         0.000     0.036   -0.070    0.071 1.001     7483     3056
## agez_Intercept             0.000     0.035   -0.068    0.068 1.001     7461     3072
## negemotzxagez_Intercept   -0.056     0.034   -0.124    0.010 1.001     7117     3106
##
## Family Specific Parameters:
##                     Estimate Est.Error l-95% CI u-95% CI  Rhat Bulk_ESS Tail_ESS
## sigma_negemotz         1.002     0.025    0.955    1.052 1.008     8670     3322
## sigma_agez             1.003     0.025    0.955    1.054 1.001     7666     2614
## sigma_negemotzxagez    0.972     0.023    0.926    1.018 1.001     7455     3242
##
## Residual Correlations:
##                                Estimate Est.Error l-95% CI u-95% CI  Rhat Bulk_ESS Tail_ESS
## rescor(negemotz,agez)            -0.057     0.034   -0.123    0.012 1.000     8085     3080
## rescor(negemotz,negemotzxagez)    0.093     0.035    0.021    0.161 1.000     8273     2684
## rescor(agez,negemotzxagez)       -0.015     0.036   -0.085    0.056 1.002     7400     2550
##
## Samples were drawn using sampling(NUTS). For each parameter, Bulk_ESS
## and Tail_ESS are effective sample size measures, and Rhat is the potential
## scale reduction factor on split chains (at convergence, Rhat = 1).

A more condensed way to get that information might be with the brms::VarCorr() function. Just make sure to tack $residual__$cor onto the end.

VarCorr(model9.3)$residual__$cor %>%
round(digits = 3)
## , , negemot
##
##             Estimate Est.Error   Q2.5 Q97.5
## negemot        1.000     0.000  1.000 1.000
## age           -0.059     0.035 -0.125 0.009
## negemotxage    0.765     0.014  0.737 0.792
##
## , , age
##
##             Estimate Est.Error   Q2.5 Q97.5
## negemot       -0.059     0.035 -0.125 0.009
## age            1.000     0.000  1.000 1.000
## negemotxage    0.547     0.024  0.499 0.593
##
## , , negemotxage
##
##             Estimate Est.Error  Q2.5 Q97.5
## negemot        0.765     0.014 0.737 0.792
## age            0.547     0.024 0.499 0.593
## negemotxage    1.000     0.000 1.000 1.000

For the sake of space, I’ll let you check that out for model9.4 and model9.5. If you’re tricky with your VarCorr() indexing, you can also get the model-implied variances.

VarCorr(model9.3)$residual__$cov[1, , "negemot"] %>% round(digits = 3)
##  Estimate Est.Error      Q2.5     Q97.5
##     2.342     0.117     2.123     2.577
VarCorr(model9.3)$residual__$cov[2, , "age"] %>% round(digits = 3)
##  Estimate Est.Error      Q2.5     Q97.5
##   267.685    12.848   243.455   293.891
VarCorr(model9.3)$residual__$cov[3, , "negemotxage"] %>% round(digits = 3)
##  Estimate Est.Error      Q2.5     Q97.5
##  9503.402   470.181  8624.944 10448.492

And if you’re like totally lost with all this indexing, you might execute VarCorr(correlations1) %>% str() and spend a little time looking at what VarCorr() returns.

On page 309, Hayes explained why the OLS variance for $$b_3$$ is unaffected by mean centering. The story was similar for our HMC model, too.

fixef(model9.1)["negemot:age", "Est.Error"]
## [1] 0.001599196
fixef(model9.2)["negemot_c:age_c", "Est.Error"]
## [1] 0.001600323

For more details, you might also see the Standardizing predictors and outputs subsection of Chapter 22 in the Stan user’s guide Stan, of course, being the computational engine underneath our brms hood.

### 9.1.2 The effect of mean-centering on $$b_1$$, $$b_2$$, and their standard errors posterior $$SD$$s.

A second explanation given for why mean-centering is preferred is that it makes $$b_1$$ and $$b_2$$, the regression coefficients for $$X$$ and $$W$$, more meaningful. This is generally true and thus not a myth, although it is not necessarily true in all circumstances… .

Mean-centering $$X$$ and $$W$$ prior to computation of the product and estimation of the model will produce $$b_1$$ and $$b_2$$ that are always meaningful, rather than meaningful only when $$X$$ and/or $$W$$ are meaningful when equal to zero… [After mean centering,] $$b_1$$ estimates the difference in $$Y$$ between two cases that differ by one unit on $$X$$ among cases that are average on $$W$$. Similarly, $$b_2$$ estimates the difference in $$Y$$ between two cases that differ by one unit on $$W$$ among cases that are average on $$X$$. These will always estimate conditional effects of $$X$$ on $$Y$$ within the range of the data, and they can always be interpreted. (p. 310. emphasis in the original)

If you only care about posterior means, you can use model9.1 to reproduce the results at the bottom of page 310 like this.

fixef(model9.1)["negemot", 1] +
fixef(model9.1)["negemot:age", 1] * mean(glbwarm$age) ## [1] 0.5008785 Here’s the same computation using model9.2. fixef(model9.2)["negemot_c", 1] + fixef(model9.2)["negemot_c:age_c", 1] * mean(glbwarm$age_c)
## [1] 0.5007536

But we’re proper Bayesians and like a summary of the spread in the posterior. So we’ll evoke posterior_samples() and the other usual steps, this time just focusing on model9.1.

post <- posterior_samples(model9.1)

post %>%
transmute(our_contidional_effect_given_W_bar = b_negemot + b_negemot:age * mean(glbwarm$age)) %>% summarize(mean = mean(our_contidional_effect_given_W_bar), sd = sd(our_contidional_effect_given_W_bar)) %>% round(digits = 3) ## mean sd ## 1 0.501 0.025 And note how the standard error Hayes computed at the top of page 311 corresponds nicely with the posterior $$SD$$ we just computed. Hayes employed a fancy formula; we just used sd(). At any rate, the main message is centering did not effect our estimate of the conditional effect of $$X$$. It turns out $$(\theta_{X \rightarrow Y} | W) = \left (\theta_{\overline X \rightarrow Y} | \overline W \right)$$. ### 9.1.3 The centering option in PROCESS. I’m not aware of a similar function in brms. You’ll have to use your data wrangling skills. ## 9.2 The estimation and interpretation of standardized regression coefficients in a moderation analysis Mean-centering does nothing to change the scaling of regression coefficients. Whether or not mean-centering is used when estimating a model of the form $$\hat Y = i_Y + b_1 X + b_2 W + b_3 XW$$, $$b_1$$, $$b_2$$, and $$b_3$$ are interpreted with respect to the measured metrics of $$X$$, $$W$$, and $$Y$$ (i.e., in unstandardized form). Although [Hayes] generally prefer[s] to report and interpret regression analyses based on unstandardized coefficients, it is possible to generate regression coefficients that are analogous to standardized regression coefficients in regression models without a product term as a predictor. However, one must be careful when doing so. (p. 313, emphasis in the original) ### 9.2.1 Variant 1. We’ve already computed standardized predictors. Now we just need to standardize the criterion, govact. glbwarm <- glbwarm %>% mutate(govact_z = (govact - mean(govact)) / sd(govact)) Fit the model. model9.6 <- update(model9.1, newdata = glbwarm, govact_z ~ 1 + negemot_z + age_z + negemot_z:age_z, cores = 4, file = "fits/model09.06") Check the Bayesian $$R^2$$. bayes_R2(model9.6) %>% round(digits = 3) ## Estimate Est.Error Q2.5 Q97.5 ## R2 0.354 0.021 0.311 0.395 Check the parameter summaries. print(model9.6, digits = 3) ## Family: gaussian ## Links: mu = identity; sigma = identity ## Formula: govact_z ~ negemot_z + age_z + negemot_z:age_z ## Data: glbwarm (Number of observations: 815) ## Samples: 4 chains, each with iter = 2000; warmup = 1000; thin = 1; ## total post-warmup samples = 4000 ## ## Population-Level Effects: ## Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS ## Intercept 0.008 0.029 -0.049 0.064 1.001 4866 3126 ## negemot_z 0.562 0.028 0.507 0.617 1.000 4471 3089 ## age_z -0.064 0.028 -0.118 -0.009 1.001 4991 3308 ## negemot_z:age_z 0.131 0.029 0.076 0.188 1.000 5436 3240 ## ## Family Specific Parameters: ## Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS ## sigma 0.807 0.020 0.770 0.847 1.001 5141 3192 ## ## Samples were drawn using sampling(NUTS). For each parameter, Bulk_ESS ## and Tail_ESS are effective sample size measures, and Rhat is the potential ## scale reduction factor on split chains (at convergence, Rhat = 1). If you go all the way up back to Table 9.1, you’ll see our results are pretty similar to those in the text. ### 9.2.2 Variant 2. This time we need to standardize our interaction term, negemot_x_age_z, by hand. glbwarm <- glbwarm %>% mutate(negemot_x_age_z = (negemot_x_age - mean(negemot_x_age)) / sd(negemot_x_age)) Now we’re ready to update(). model9.7 <- update(model9.1, newdata = glbwarm, govact_z ~ 1 + negemot_z + age_z + negemot_x_age_z, cores = 4, file = "fits/model09.07") bayes_R2(model9.7) %>% round(digits = 3) ## Estimate Est.Error Q2.5 Q97.5 ## R2 0.354 0.021 0.31 0.394 print(model9.7, digits = 3) ## Family: gaussian ## Links: mu = identity; sigma = identity ## Formula: govact_z ~ negemot_z + age_z + negemot_x_age_z ## Data: glbwarm (Number of observations: 815) ## Samples: 4 chains, each with iter = 2000; warmup = 1000; thin = 1; ## total post-warmup samples = 4000 ## ## Population-Level Effects: ## Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS ## Intercept 0.000 0.028 -0.054 0.054 1.002 2801 2483 ## negemot_z 0.166 0.095 -0.015 0.355 1.005 1012 1387 ## age_z -0.367 0.072 -0.507 -0.225 1.006 1046 1335 ## negemot_x_age_z 0.510 0.113 0.284 0.729 1.005 1060 1308 ## ## Family Specific Parameters: ## Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS ## sigma 0.807 0.020 0.770 0.848 1.002 2811 2268 ## ## Samples were drawn using sampling(NUTS). For each parameter, Bulk_ESS ## and Tail_ESS are effective sample size measures, and Rhat is the potential ## scale reduction factor on split chains (at convergence, Rhat = 1). The results correspond nicely to those in Table 9.1, too. ## 9.3 A caution on manual centering and standardization [because of missing data] It’s worthwhile considering the issue of listwise deletion when data are partially missing. The brms default is to delete rows with missingness, NA in R, for the predictors. However, brms allows users to perform one-step Bayesian imputation for missing values using the mi() syntax. You can learn all the details in Bürkner’s (2021c) vignette, Handle missing values with brms. First we’ll fit see what happens when you fit a model in brms when some of the negemot_z values are missing, but without using the mi() syntax. And of course before we do that, we’ll make a negemot_z_missing variable, which is identical to negemot_z, but about 10% of the values are missing. set.seed(9) glbwarm <- glbwarm %>% mutate(missing = rbinom(n = n(), size = 1, prob = .1)) %>% mutate(negemot_z_missing = ifelse(missing == 1, NA, negemot_z)) If you’ve never used rbinom() before, execute ?rbinom or look it up in your favorite web search engine. Here’s our listwise deletion model, which corresponds to what you’d get from a typical OLS-based program. model9.8 <- update(model9.6, newdata = glbwarm, govact_z ~ 1 + negemot_z_missing + age_z + negemot_z_missing:age_z, cores = 4, file = "fits/model09.08") Let’s compare the listwise deletion results with the model based on all the data. print(model9.6) ## Family: gaussian ## Links: mu = identity; sigma = identity ## Formula: govact_z ~ negemot_z + age_z + negemot_z:age_z ## Data: glbwarm (Number of observations: 815) ## Samples: 4 chains, each with iter = 2000; warmup = 1000; thin = 1; ## total post-warmup samples = 4000 ## ## Population-Level Effects: ## Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS ## Intercept 0.01 0.03 -0.05 0.06 1.00 4866 3126 ## negemot_z 0.56 0.03 0.51 0.62 1.00 4471 3089 ## age_z -0.06 0.03 -0.12 -0.01 1.00 4991 3308 ## negemot_z:age_z 0.13 0.03 0.08 0.19 1.00 5436 3240 ## ## Family Specific Parameters: ## Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS ## sigma 0.81 0.02 0.77 0.85 1.00 5141 3192 ## ## Samples were drawn using sampling(NUTS). For each parameter, Bulk_ESS ## and Tail_ESS are effective sample size measures, and Rhat is the potential ## scale reduction factor on split chains (at convergence, Rhat = 1). print(model9.8) ## Family: gaussian ## Links: mu = identity; sigma = identity ## Formula: govact_z ~ negemot_z_missing + age_z + negemot_z_missing:age_z ## Data: glbwarm (Number of observations: 731) ## Samples: 4 chains, each with iter = 2000; warmup = 1000; thin = 1; ## total post-warmup samples = 4000 ## ## Population-Level Effects: ## Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS ## Intercept 0.00 0.03 -0.05 0.06 1.00 4880 3042 ## negemot_z_missing 0.57 0.03 0.51 0.63 1.00 4801 2924 ## age_z -0.05 0.03 -0.10 0.01 1.00 5129 3372 ## negemot_z_missing:age_z 0.12 0.03 0.06 0.18 1.00 4392 3052 ## ## Family Specific Parameters: ## Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS ## sigma 0.81 0.02 0.77 0.85 1.00 5420 3168 ## ## Samples were drawn using sampling(NUTS). For each parameter, Bulk_ESS ## and Tail_ESS are effective sample size measures, and Rhat is the potential ## scale reduction factor on split chains (at convergence, Rhat = 1). In this case, the model results were similar to those based on all the data because we used rbinom() to delete the predictor values completely at random. With real data and real-live missing data mechanisms, the situation isn’t often so rosy. But anyway, the real story, here, is the Data: glbwarm (Number of observations: n) line at the top of the print() outputs. The number, $$n$$, was 815 in the model using all the data and 731 for the one based on listwise deletion. That’s a lot of missing information. The mi() syntax will allow us to use all the rows in a model, even if one or more of the predictors contain missing values. The syntax makes the model a multivariate model in that now we’ll be modeling both govact_z and negemot_z_missing. As we’ve covered in prior chapters, there are multiple ways to write a multivariate model in brms (see Bürkner, 2021b). One nice way is to write the model for each criterion separately in a bf() statement. You combine the bf() statements together with the + operator. And for models like the ones in Hayes’s text, you’ll also want to tack on set_rescor(FALSE). You can do this within the brm() function, as usual. But I find that this clutters the code up more than I like. So another approach is to save the combination of bf() statements as an object. my_model <- bf(govact_z ~ 1 + mi(negemot_z_missing) + age_z + mi(negemot_z_missing):age_z) + bf(negemot_z_missing | mi() ~ 1) + set_rescor(FALSE) With our multivariate formula saved as my_model, we’re ready to plug it into brm() and fit. model9.9 <- brm(data = glbwarm, family = gaussian, my_model, cores = 4, file = "fits/model09.09") Let’s see what we’ve done. print(model9.9) ## Family: MV(gaussian, gaussian) ## Links: mu = identity; sigma = identity ## mu = identity; sigma = identity ## Formula: govact_z ~ 1 + mi(negemot_z_missing) + age_z + mi(negemot_z_missing):age_z ## negemot_z_missing | mi() ~ 1 ## Data: glbwarm (Number of observations: 815) ## Samples: 4 chains, each with iter = 2000; warmup = 1000; thin = 1; ## total post-warmup samples = 4000 ## ## Population-Level Effects: ## Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS ## govactz_Intercept 0.01 0.03 -0.05 0.06 1.00 5485 3096 ## negemotzmissing_Intercept -0.00 0.04 -0.07 0.07 1.00 4977 3135 ## govactz_age_z -0.06 0.03 -0.12 -0.00 1.00 5341 3035 ## govactz_minegemot_z_missing 0.57 0.03 0.51 0.63 1.00 6042 2249 ## govactz_minegemot_z_missing:age_z 0.13 0.03 0.07 0.19 1.00 5485 3189 ## ## Family Specific Parameters: ## Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS ## sigma_govactz 0.81 0.02 0.77 0.85 1.00 5655 3133 ## sigma_negemotzmissing 1.00 0.03 0.95 1.05 1.00 5181 2974 ## ## Samples were drawn using sampling(NUTS). For each parameter, Bulk_ESS ## and Tail_ESS are effective sample size measures, and Rhat is the potential ## scale reduction factor on split chains (at convergence, Rhat = 1). When using the multivariate mi() syntax, your print() output becomes more complicated. We got regression sub-models for both govact_z and negemot_z_missing. But look at the line at the top of the output that reads “Data: glbwarm (Number of observations: 815).” The multivariate mi() syntax used all the available data. No listwise deletion necessary. The print() output for our model obscured some of the results. To clarify what the mi() syntax did, let’s peek at the first columns returned by posterior_samples(). post <- posterior_samples(model9.9) post[, 1:20] %>% glimpse() ## Rows: 4,000 ## Columns: 20 ##$ b_govactz_Intercept                     <dbl> -0.0009375754, 0.0102454390, 0.0267679206, 0.0004908605…
## $b_negemotzmissing_Intercept <dbl> 0.0421679586, 0.0028051301, -0.0258047227, 0.0319523969… ##$ b_govactz_age_z                         <dbl> 0.0038097886, 0.0055311310, 0.0101281349, -0.0306100074…
## $bsp_govactz_minegemot_z_missing <dbl> 0.5438757, 0.5536677, 0.5578784, 0.6028770, 0.5446027, … ##$ bsp_govactz_minegemot_z_missing:age_z <dbl> 0.11819682, 0.14527529, 0.13840587, 0.18422013, 0.06941…
## $sigma_govactz <dbl> 0.8580747, 0.8185694, 0.7947336, 0.7994697, 0.8124083, … ##$ sigma_negemotzmissing                   <dbl> 0.9785131, 0.9972329, 1.0021417, 1.0165915, 0.9822670, …
## $Ymi_negemotzmissing[10] <dbl> -1.290512603, -1.043112362, -0.830640971, -0.873447449,… ##$ Ymi_negemotzmissing[18]               <dbl> -0.98932629, -0.79446124, -1.61039075, -1.01502925, -0.…
## $Ymi_negemotzmissing[26] <dbl> 0.334853667, 0.417298092, 0.588396261, 0.899239245, 0.0… ##$ Ymi_negemotzmissing[29]               <dbl> 1.326713180, -0.460480411, -0.370049685, 1.001615849, -…
## $Ymi_negemotzmissing[32] <dbl> -0.33018069, 0.18344865, -1.25270921, -1.82373863, 0.32… ##$ Ymi_negemotzmissing[34]               <dbl> -0.46277395, -0.15870952, -1.82401414, 0.17684719, 0.35…
## $Ymi_negemotzmissing[38] <dbl> -1.7951726, -1.0863270, -0.9174883, -1.1077899, -2.0922… ##$ Ymi_negemotzmissing[41]               <dbl> 1.14102770, -0.68278622, 0.22707544, 1.48681521, 0.2399…
## $Ymi_negemotzmissing[50] <dbl> 0.534109352, -1.669759946, -0.296342997, -0.685547859, … ##$ Ymi_negemotzmissing[51]               <dbl> -0.46777446, 0.06965386, 0.34232446, -0.17051913, -0.62…
## $Ymi_negemotzmissing[56] <dbl> 0.016301128, 0.665334352, 1.242373177, -1.088473074, 0.… ##$ Ymi_negemotzmissing[78]               <dbl> -0.38659424, 0.76295879, 0.51827930, -0.18174219, 0.395…
## $Ymi_negemotzmissing[89] <dbl> 1.02044743, 0.28167339, -0.61432986, -0.38829197, -0.32… Columns b_govactz_Intercept through sigma_negemotzmissing were business as usual. But notice all the Ymi_negemotzmissing[i] columns. In each of these we see 4,000 posterior draws for the missing negemot_z_missing values. The [i] part of the column names indexes which row number the iterations correspond to. Summarizing these columns might help us get a sense of the results. library(tidybayes) post %>% pivot_longer(starts_with("Ymi"), names_to = "row") %>% group_by(row) %>% # yep, that's right, we're summarizing as usual mean_qi(value) %>% mutate_if(is.double, round, digits = 2) %>% select(row:.upper) ## # A tibble: 84 x 4 ## row value .lower .upper ## <chr> <dbl> <dbl> <dbl> ## 1 Ymi_negemotzmissing[10] -0.86 -2.43 0.62 ## 2 Ymi_negemotzmissing[101] 0.15 -1.51 1.82 ## 3 Ymi_negemotzmissing[117] -0.17 -1.85 1.53 ## 4 Ymi_negemotzmissing[135] 0.26 -1.29 1.8 ## 5 Ymi_negemotzmissing[139] -0.34 -1.92 1.22 ## 6 Ymi_negemotzmissing[148] -1.38 -2.81 0.06 ## 7 Ymi_negemotzmissing[150] -0.3 -1.75 1.12 ## 8 Ymi_negemotzmissing[152] 0 -1.57 1.62 ## 9 Ymi_negemotzmissing[153] -1.59 -3.12 -0.02 ## 10 Ymi_negemotzmissing[154] 0.57 -0.83 2.01 ## 11 Ymi_negemotzmissing[162] 0.09 -1.53 1.72 ## 12 Ymi_negemotzmissing[164] 0.39 -1.11 1.89 ## 13 Ymi_negemotzmissing[166] -0.06 -1.52 1.41 ## 14 Ymi_negemotzmissing[167] -0.15 -1.59 1.28 ## 15 Ymi_negemotzmissing[178] -0.51 -1.98 0.98 ## 16 Ymi_negemotzmissing[18] -0.88 -2.33 0.55 ## 17 Ymi_negemotzmissing[186] 0.12 -1.35 1.62 ## 18 Ymi_negemotzmissing[214] 0.95 -0.53 2.46 ## 19 Ymi_negemotzmissing[230] -0.17 -1.79 1.48 ## 20 Ymi_negemotzmissing[246] 0.6 -1.12 2.27 ## 21 Ymi_negemotzmissing[259] -0.98 -2.61 0.74 ## 22 Ymi_negemotzmissing[26] 0.59 -0.87 2.02 ## 23 Ymi_negemotzmissing[260] 0.49 -0.9 1.89 ## 24 Ymi_negemotzmissing[266] 0.23 -1.2 1.62 ## 25 Ymi_negemotzmissing[284] 0.17 -1.6 1.97 ## 26 Ymi_negemotzmissing[29] 0.27 -1.22 1.79 ## 27 Ymi_negemotzmissing[32] -0.48 -2.21 1.23 ## 28 Ymi_negemotzmissing[329] 0.22 -1.54 2 ## 29 Ymi_negemotzmissing[337] 0.25 -1.47 2 ## 30 Ymi_negemotzmissing[338] 0.16 -1.71 2 ## 31 Ymi_negemotzmissing[34] 0.26 -1.32 1.85 ## 32 Ymi_negemotzmissing[350] -0.92 -2.56 0.73 ## 33 Ymi_negemotzmissing[358] 0.96 -0.66 2.64 ## 34 Ymi_negemotzmissing[371] 0.16 -1.31 1.62 ## 35 Ymi_negemotzmissing[38] -1.56 -2.98 -0.1 ## 36 Ymi_negemotzmissing[383] 0.47 -1.1 2.07 ## 37 Ymi_negemotzmissing[394] 0.74 -0.76 2.33 ## 38 Ymi_negemotzmissing[397] 0.33 -1.41 2.03 ## 39 Ymi_negemotzmissing[398] 0.56 -1.07 2.25 ## 40 Ymi_negemotzmissing[401] -0.26 -1.99 1.52 ## 41 Ymi_negemotzmissing[406] 0.62 -0.98 2.21 ## 42 Ymi_negemotzmissing[41] 0.61 -0.97 2.2 ## 43 Ymi_negemotzmissing[424] 0.71 -0.97 2.38 ## 44 Ymi_negemotzmissing[435] -0.1 -1.91 1.69 ## 45 Ymi_negemotzmissing[440] 0.35 -1.16 1.87 ## 46 Ymi_negemotzmissing[447] -0.17 -1.75 1.45 ## 47 Ymi_negemotzmissing[469] -0.05 -1.85 1.77 ## 48 Ymi_negemotzmissing[472] 0.42 -1.02 1.91 ## 49 Ymi_negemotzmissing[476] 0.65 -0.82 2.16 ## 50 Ymi_negemotzmissing[484] -1.39 -2.9 0.11 ## 51 Ymi_negemotzmissing[50] -0.2 -1.57 1.15 ## 52 Ymi_negemotzmissing[507] -0.15 -1.72 1.44 ## 53 Ymi_negemotzmissing[51] -0.51 -2 1 ## 54 Ymi_negemotzmissing[515] 0.21 -1.27 1.65 ## 55 Ymi_negemotzmissing[519] -0.05 -1.54 1.5 ## 56 Ymi_negemotzmissing[524] -0.33 -1.81 1.22 ## 57 Ymi_negemotzmissing[540] 0.48 -1.18 2.17 ## 58 Ymi_negemotzmissing[555] 0.45 -1.09 2.03 ## 59 Ymi_negemotzmissing[56] 0 -1.61 1.61 ## 60 Ymi_negemotzmissing[585] 0.07 -1.36 1.5 ## 61 Ymi_negemotzmissing[593] 0.12 -1.67 1.85 ## 62 Ymi_negemotzmissing[602] -0.58 -2.06 0.93 ## 63 Ymi_negemotzmissing[605] 0.75 -0.89 2.38 ## 64 Ymi_negemotzmissing[612] 0.13 -1.5 1.69 ## 65 Ymi_negemotzmissing[613] 0.71 -0.88 2.36 ## 66 Ymi_negemotzmissing[621] -0.77 -2.25 0.73 ## 67 Ymi_negemotzmissing[629] 0.11 -1.4 1.6 ## 68 Ymi_negemotzmissing[630] 0.39 -1.07 1.88 ## 69 Ymi_negemotzmissing[643] 0.9 -0.75 2.58 ## 70 Ymi_negemotzmissing[653] -0.63 -2.32 1.08 ## 71 Ymi_negemotzmissing[679] -0.91 -2.59 0.78 ## 72 Ymi_negemotzmissing[701] -0.16 -1.72 1.32 ## 73 Ymi_negemotzmissing[742] 0.24 -1.42 1.9 ## 74 Ymi_negemotzmissing[751] 0.29 -1.36 1.96 ## 75 Ymi_negemotzmissing[752] -0.86 -2.25 0.58 ## 76 Ymi_negemotzmissing[760] 0.42 -1.17 2.05 ## 77 Ymi_negemotzmissing[774] 0.82 -0.95 2.6 ## 78 Ymi_negemotzmissing[779] -0.02 -1.77 1.71 ## 79 Ymi_negemotzmissing[78] 0.1 -1.56 1.8 ## 80 Ymi_negemotzmissing[792] -0.11 -2 1.67 ## 81 Ymi_negemotzmissing[794] 0.29 -1.39 1.97 ## 82 Ymi_negemotzmissing[799] -0.18 -1.89 1.53 ## 83 Ymi_negemotzmissing[89] -0.78 -2.28 0.79 ## 84 Ymi_negemotzmissing[94] 0.27 -1.29 1.76 That’s a lot of output. Here’s what those summaries look like in a coefficient plot. # summarize just like before post %>% pivot_longer(starts_with("Ymi"), names_to = "row") %>% group_by(row) %>% mean_qi(value) %>% # plot! ggplot(aes(x = row, y = value, ymin = .lower, ymax = .upper)) + geom_hline(yintercept = 0, color = "grey75") + geom_pointinterval(size = 1/3) + scale_x_discrete("rank-ordered row number", breaks = NULL) + ylab("imputed value") + theme_xkcd() Each missing negemot_z value got an entire posterior distribution. And just as the model is uncertain about what those values might have been, that uncertainty was baked right into the primary submodel predicting govact_z. That’s a good thing. We want our Bayesian models to use as much information as they can and yield results with as much certainty as possible. But we don’t want our models to be more certain than the data–and priors–allow. When you use listwise deletion methods, you leave information on the table, which we don’t want. But when you use old fashioned ad hock methods like mean imputation, you underestimate the uncertainty in the model, which we also don’t want. We want the middle path. Here’s a focused look at two other important new parameters. posterior_summary(model9.9)[c("b_negemotzmissing_Intercept", "sigma_negemotzmissing"),] %>% round(digits = 3) ## Estimate Est.Error Q2.5 Q97.5 ## b_negemotzmissing_Intercept -0.001 0.036 -0.071 0.070 ## sigma_negemotzmissing 0.997 0.026 0.948 1.048 Our model has estimated the mean and standard deviations for our negemot_z_missing variable. Hopefully that isn’t a surprise. This is exactly what we asked brms to do with the negemot_z_missing | mi() ~ 1 part of the model formula. Since that submodel had no predictors, the intercept was just the mean. Correspondingly, the residual variance was the entire variance–but expressed in the usual $$\sigma$$ metric since we’re using brms. And since our negemot_z_missing variable is a subset of the standardized negemot_z variable, naturally the estimates for the mean and standard deviation are about 0 and 1, respectively. Another method we could have used is multiple imputation. With multiple imputation, you create a small number of alternative data sets, typically 5, into which you impute plausible values into the missing value slots. You then fit the model separately for each of the imputed data sets. Because the imputed values will vary across the data sets, that uncertainty will get appropriately transmitted to the model. I know this might sound crazy, but it typically leads to much lower model bias when compared to mean imputation or listwise deletion. But we didn’t quite use multiple imputation. With one-step Bayesian imputation using the mi() syntax, you get an entire posterior distribution for each missing value. And if you have variables in the data set that might help predict what those missing values are, you’d just plug that into the missing data submodel. Improving the imputation model can improve the subsequent substantive model. For more on the mi() approach, see Bürkner’s (2021c) vignette. McElreath lectured on this topic (here and here) within the context of his Statistical rethinking texts , and you can find links to my translations of both of his texts here. For a more general introduction to missing data theory, check out van Burren’s (2018) book, Flexible imputation of missing data, or Enders’ great (2010) text, Applied missing data analysis. You can also find Enders lecturing on missing data here. The take home message is there is no need to ignore missing data or use outdated procedures like listwise deletion. Be a champion and model your missing data with brms. ## 9.4 More than one moderator None of this is a problem for brms. But instead of using the model=i syntax in Hayes’s PROCESS, you just have to specify your model formula in brm(). ### 9.4.1 Additive multiple moderation. It’s trivial to add sex, its interaction with negemot, and the two covariates (i.e., posemot and ideology) to the model. We can even do it within update(). model9.10 <- update(model9.1, newdata = glbwarm, govact ~ 1 + negemot + sex + age + posemot + ideology + negemot:sex + negemot:age, cores = 4, file = "fits/model09.10") Our output matches nicely with the formula at the bottom of page 232 and the PROCESS output in Figure 9.2. print(model9.10, digits = 3) ## Family: gaussian ## Links: mu = identity; sigma = identity ## Formula: govact ~ negemot + sex + age + posemot + ideology + negemot:sex + negemot:age ## Data: glbwarm (Number of observations: 815) ## Samples: 4 chains, each with iter = 2000; warmup = 1000; thin = 1; ## total post-warmup samples = 4000 ## ## Population-Level Effects: ## Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS ## Intercept 5.268 0.329 4.608 5.897 1.001 2565 2597 ## negemot 0.094 0.080 -0.062 0.253 1.002 2270 2435 ## sex -0.748 0.196 -1.136 -0.368 1.000 2116 2496 ## age -0.018 0.006 -0.030 -0.006 1.002 2145 2694 ## posemot -0.023 0.028 -0.077 0.031 1.001 3329 2699 ## ideology -0.207 0.027 -0.259 -0.156 1.000 4130 3007 ## negemot:sex 0.206 0.051 0.106 0.306 1.000 2082 2598 ## negemot:age 0.005 0.002 0.002 0.008 1.002 2120 2469 ## ## Family Specific Parameters: ## Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS ## sigma 1.048 0.026 0.998 1.100 1.000 3829 2909 ## ## Samples were drawn using sampling(NUTS). For each parameter, Bulk_ESS ## and Tail_ESS are effective sample size measures, and Rhat is the potential ## scale reduction factor on split chains (at convergence, Rhat = 1). On page 325, Hayes discussed the unique variance each of the two moderation terms accounted for after controlling for the other covariates. In order to get our Bayesian version of these, we’ll have to fit two additional models, one after removing each of the interaction terms. model9.11 <- update(model9.10, newdata = glbwarm, govact ~ 1 + negemot + sex + age + posemot + ideology + negemot:sex, cores = 4, file = "fits/model09.11") model9.12 <- update(model9.10, newdata = glbwarm, govact ~ 1 + negemot + sex + age + posemot + ideology + negemot:age, cores = 4, file = "fits/model09.12") Here we’ll extract the bayes_R2() iterations for each of the three models, place them all in a single tibble, and then do a little arithmetic to get the difference scores. After all that data wrangling, we’ll summarize() as usual. r2 <- tibble(r2_without_age_interaction = bayes_R2(model9.11, summary = F)[, 1], r2_without_sex_interaction = bayes_R2(model9.12, summary = F)[, 1], r2_with_both_interactions = bayes_R2(model9.10, summary = F)[, 1]) %>% mutate(delta R2 due to age interaction = r2_with_both_interactions - r2_without_age_interaction, delta R2 due to sex interaction = r2_with_both_interactions - r2_without_sex_interaction) r2 %>% pivot_longer(delta R2 due to age interaction:delta R2 due to sex interaction) %>% group_by(name) %>% mean_qi(value) %>% mutate_if(is.double, round, digits = 3) ## # A tibble: 2 x 7 ## name value .lower .upper .width .point .interval ## <chr> <dbl> <dbl> <dbl> <dbl> <chr> <chr> ## 1 delta R2 due to age interaction 0.007 -0.051 0.067 0.95 mean qi ## 2 delta R2 due to sex interaction 0.014 -0.044 0.069 0.95 mean qi Recall that $$R^2$$ is in a 0-to-1 metric. It’s a proportion. If you want to convert that to a percentage, as in percent of variance explained, you’d just multiply by 100. To make it explicit, let’s do that. r2 %>% pivot_longer(delta R2 due to age interaction:delta R2 due to sex interaction) %>% group_by(name) %>% mutate(percent = value * 100) %>% mean_qi(percent) %>% mutate_if(is.double, round, digits = 3) ## # A tibble: 2 x 7 ## name percent .lower .upper .width .point .interval ## <chr> <dbl> <dbl> <dbl> <dbl> <chr> <chr> ## 1 delta R2 due to age interaction 0.747 -5.08 6.73 0.95 mean qi ## 2 delta R2 due to sex interaction 1.37 -4.39 6.88 0.95 mean qi Hopefully it’s clear how our proportions turned percentages correspond to the numbers on page 325. However, note how our 95% credible intervals do not cohere with the $$p$$-values from Hayes’s $$F$$-tests. If we want to prep for our version of Figure 9.3, we’ll need to carefully specify the predictor values we’ll pass through the fitted() function. Here we do so and save them in nd. nd <- crossing(negemot = seq(from = .5, to = 6.5, length.out = 30), sex = 0:1) %>% expand(nesting(negemot, sex), age = c(30, 50, 70)) %>% mutate(posemot = mean(glbwarm$posemot),
ideology = mean(glbwarm$ideology)) str(nd) ## tibble [180 × 5] (S3: tbl_df/tbl/data.frame) ##$ negemot : num [1:180] 0.5 0.5 0.5 0.5 0.5 ...
##  $sex : int [1:180] 0 0 0 1 1 1 0 0 0 1 ... ##$ age     : num [1:180] 30 50 70 30 50 70 30 50 70 30 ...
##  $posemot : num [1:180] 3.13 3.13 3.13 3.13 3.13 ... ##$ ideology: num [1:180] 4.08 4.08 4.08 4.08 4.08 ...

With our nd values in hand, we’re ready to make our version of Figure 9.3.

fitted(model9.10, newdata = nd) %>%
data.frame() %>%
bind_cols(nd) %>%
# these lines will make the strip text match with those with Hayes's Figure
mutate(sex = if_else(sex == 0, str_c("Females, W = ", sex),
str_c("Males, W = ", sex)),
age = str_c("Age, Z, = ", age)) %>%

# finally, we plot!
ggplot(aes(x = negemot, group = sex)) +
geom_ribbon(aes(ymin = Q2.5, ymax = Q97.5, fill = sex),
alpha = 1/3, color = "transparent") +
geom_line(aes(y = Estimate, color = sex),
size = 1) +
scale_x_continuous(breaks = 1:6) +
coord_cartesian(xlim = c(1, 6),
ylim = c(3, 6)) +
labs(x = expression("Negative Emotions about Climate Change, "*italic(X)),
y = expression("Support for Government Action to Mitigate Climate Change, "*italic(Y))) +
theme_xkcd() +
theme(legend.position = "top",
legend.title = element_blank()) +
facet_grid(age ~ .)

Recall that the conditional effect of $$X$$ for various values of $$W$$ and $$Z$$ is

$\theta_{X \rightarrow Y} = b_1 + b_4 W + b_5 Z.$

In the terms of model9.10 where sex = $$W$$ and age = $$Z$$, we can restate that as

$\theta_{\text{negemot} \rightarrow \text{govact}} = b_\text{negemot} + b_{\text{negemot} \times \text{sex}} \text{sex} + b_{\text{negemot} \times \text{age}} \text{age}.$

This is easiest to show with posterior_samples() and a little algebra. As in the middle of page 329, here we solve for both sexes and age == 50.

post <- posterior_samples(model9.10)

post %>%
# algebra
mutate(men   = b_negemot + b_negemot:sex * 1 + b_negemot:age * 50,
women = b_negemot + b_negemot:sex * 0 + b_negemot:age * 50) %>%
# more algebra
mutate(men - women = men - women) %>%
pivot_longer(men:men - women) %>%
# this just orders the output
mutate(name = factor(name, levels = c("men", "women", "men - women"))) %>%
group_by(name) %>%
mean_qi(value) %>%
mutate_if(is.double, round, digits = 3)
## # A tibble: 3 x 7
##   name        value .lower .upper .width .point .interval
##   <fct>       <dbl>  <dbl>  <dbl>  <dbl> <chr>  <chr>
## 1 men         0.537  0.467  0.607   0.95 mean   qi
## 2 women       0.331  0.259  0.401   0.95 mean   qi
## 3 men - women 0.206  0.106  0.306   0.95 mean   qi

Switching our discussion to $$b_5$$ (i.e., $$b_{\text{negemot} \times \text{age}}$$), Hayes showed its value for two groups 10 years apart. Here it is for model9.10.

post %>%
# algebra
transmute(10 * b_5   = b_negemot:age * 10) %>%
mean_qi() %>%
mutate_if(is.double, round, digits = 3)
## # A tibble: 1 x 6
##   10 * b_5 .lower .upper .width .point .interval
##        <dbl>  <dbl>  <dbl>  <dbl> <chr>  <chr>
## 1      0.047  0.016  0.077   0.95 mean   qi

Further down on page 329, Hayes solved for the conditional effect of negemot for women at 50 versus 30.

post %>%
mutate(women_50 = b_negemot + b_negemot:sex * 0 + b_negemot:age * 50,
women_30 = b_negemot + b_negemot:sex * 0 + b_negemot:age * 30) %>%
mutate(women_50 - women_30 = women_50 - women_30) %>%
pivot_longer(women_50:women_50 - women_30) %>%
mutate(name = factor(name, levels = c("women_50", "women_30", "women_50 - women_30"))) %>%
group_by(name) %>%
mean_qi(value) %>%
mutate_if(is.double, round, digits = 3)
## # A tibble: 3 x 7
##   name                value .lower .upper .width .point .interval
##   <fct>               <dbl>  <dbl>  <dbl>  <dbl> <chr>  <chr>
## 1 women_50            0.331  0.259  0.401   0.95 mean   qi
## 2 women_30            0.236  0.15   0.323   0.95 mean   qi
## 3 women_50 - women_30 0.095  0.032  0.154   0.95 mean   qi

Here it is for men.

post %>%
mutate(men_50 = b_negemot + b_negemot:sex * 1 + b_negemot:age * 50,
men_30 = b_negemot + b_negemot:sex * 1 + b_negemot:age * 30) %>%
mutate(men_50 - men_30 = men_50 - men_30) %>%
pivot_longer(men_50:men_50 - men_30) %>%
mutate(name = factor(name, levels = c("men_50", "men_30", "men_50 - men_30"))) %>%
group_by(name) %>%
mean_qi(value) %>%
mutate_if(is.double, round, digits = 3)
## # A tibble: 3 x 7
##   name            value .lower .upper .width .point .interval
##   <fct>           <dbl>  <dbl>  <dbl>  <dbl> <chr>  <chr>
## 1 men_50          0.537  0.467  0.607   0.95 mean   qi
## 2 men_30          0.442  0.341  0.545   0.95 mean   qi
## 3 men_50 - men_30 0.095  0.032  0.154   0.95 mean   qi

If you look closely, you’ll see women_50 - women_30 is the same as men_50 - men_30.

### 9.4.2 Moderated moderation.

To fit the moderated moderation model in brms, just add to two new interaction terms to the formula.

model9.13 <-
update(model9.10,
newdata = glbwarm,
govact ~ 1 + negemot + sex + age + posemot + ideology +
negemot:sex + negemot:age + sex:age +
negemot:sex:age,
cores = 4,
file = "fits/model09.13")

Check the summary.

print(model9.13, digits = 3)
##  Family: gaussian
##   Links: mu = identity; sigma = identity
## Formula: govact ~ negemot + sex + age + posemot + ideology + negemot:sex + negemot:age + sex:age + negemot:sex:age
##    Data: glbwarm (Number of observations: 815)
## Samples: 4 chains, each with iter = 2000; warmup = 1000; thin = 1;
##          total post-warmup samples = 4000
##
## Population-Level Effects:
##                 Estimate Est.Error l-95% CI u-95% CI  Rhat Bulk_ESS Tail_ESS
## Intercept          4.569     0.488    3.599    5.544 1.004     1238     1734
## negemot            0.270     0.119    0.039    0.506 1.004     1256     1608
## sex                0.510     0.648   -0.773    1.767 1.004     1018     1387
## age               -0.004     0.010   -0.022    0.016 1.004     1183     1619
## posemot           -0.021     0.028   -0.077    0.034 1.001     3362     2689
## ideology          -0.205     0.027   -0.257   -0.155 1.001     3377     2708
## negemot:sex       -0.126     0.167   -0.459    0.208 1.004     1072     1447
## negemot:age        0.001     0.002   -0.004    0.006 1.003     1279     1675
## sex:age           -0.025     0.012   -0.049   -0.000 1.004      974     1271
## negemot:sex:age    0.007     0.003    0.000    0.013 1.003     1032     1266
##
## Family Specific Parameters:
##       Estimate Est.Error l-95% CI u-95% CI  Rhat Bulk_ESS Tail_ESS
## sigma    1.047     0.025    0.999    1.097 1.001     3335     2894
##
## Samples were drawn using sampling(NUTS). For each parameter, Bulk_ESS
## and Tail_ESS are effective sample size measures, and Rhat is the potential
## scale reduction factor on split chains (at convergence, Rhat = 1).

Our print() output matches fairly well with the OLS results on pages 332 and 333. Here’s our new Bayesian $$R^2$$.

bayes_R2(model9.13) %>% round(digits = 3)
##    Estimate Est.Error  Q2.5 Q97.5
## R2    0.417      0.02 0.377 0.453

Because we haven’t changed the predictor variables in the model–just added interactions among them–there’s no need to redo our nd values. Rather, all we need to do is pass them through fitted() based on our new model9.13 and plot. Without further ado, here’s our Figure 9.6.

fitted(model9.13, newdata = nd) %>%
data.frame() %>%
bind_cols(nd) %>%
# these lines will make the strip text match with those with Hayes's Figure
mutate(sex = if_else(sex == 0, str_c("Females, W = ", sex),
str_c("Males, W = ", sex)),
age = str_c("Age, Z, = ", age)) %>%

# behold, Figure 9.6!
ggplot(aes(x = negemot, group = sex)) +
geom_ribbon(aes(ymin = Q2.5, ymax = Q97.5, fill = sex),
alpha = 1/3, color = "transparent") +
geom_line(aes(y = Estimate, color = sex),
size = 1) +
scale_x_continuous(breaks = 1:6) +
coord_cartesian(xlim = c(1, 6),
ylim = c(3, 6)) +
labs(x = expression("Negative Emotions about Climate Change, "*italic(X)),
y = expression("Support for Government Action to Mitigate Climate Change, "*italic(Y))) +
theme_xkcd() +
theme(legend.position = "top",
legend.title = element_blank()) +
facet_grid(age ~ .)

For the pick-a-point values Hayes covered on page 338, recall that when using posterior_sample(), our $$b_4$$ is b_negemot:sex and our $$b_7$$ is b_negemot:sex:age.

post <- posterior_samples(model9.13)

post %>%
mutate(age = 30 = b_negemot:sex + b_negemot:sex:age * 30,
age = 50 = b_negemot:sex + b_negemot:sex:age * 50,
age = 70 = b_negemot:sex + b_negemot:sex:age * 70) %>%
pivot_longer(contains("="), names_to = "theta_XW_on_Y_given") %>%
group_by(theta_XW_on_Y_given) %>%
mean_qi(value) %>%
mutate_if(is.double, round, digits = 3)
## # A tibble: 3 x 7
##   theta_XW_on_Y_given value .lower .upper .width .point .interval
##   <chr>               <dbl>  <dbl>  <dbl>  <dbl> <chr>  <chr>
## 1 age = 30            0.072 -0.087  0.235   0.95 mean   qi
## 2 age = 50            0.204  0.105  0.301   0.95 mean   qi
## 3 age = 70            0.335  0.171  0.494   0.95 mean   qi

Our method for making a JN technique plot with fitted() way back in Chapter 7 isn’t going to work, here. At least not as far as I can see. Rather, we’re going to have to skillfully manipulate our post object. For those new to R, this might be a little confusing at first. So I’m going to make a crude attempt first and then get more sophisticated.

Crude attempt:

post %>%
mutate(age = 30 = b_negemot:sex + b_negemot:sex:age * 30,
age = 50 = b_negemot:sex + b_negemot:sex:age * 50,
age = 70 = b_negemot:sex + b_negemot:sex:age * 70) %>%
pivot_longer(contains("="), names_to = "theta_XW_on_Y_given") %>%
mutate(theta XW on Y given = str_extract(theta_XW_on_Y_given, "\\d+") %>% as.double()) %>%
group_by(theta XW on Y given) %>%
mean_qi(value) %>%

# the plot
ggplot(aes(x = theta XW on Y given)) +
geom_hline(yintercept = 0) +
geom_vline(xintercept = 38.114) +
geom_ribbon(aes(ymin = .lower, ymax = .upper),
alpha = 1/2) +
geom_line(aes(y = value),
size = 1) +
coord_cartesian(xlim = c(20, 85),
ylim = c(-.25, .75)) +
theme_xkcd()

Notice how we just took the code from our pick-a-point analysis and dumped it into a plot. So one obvious approach would be to pick like 30 or 50 age values to plug into transmute() and just do the same thing. If you’re super afraid of coding, that’d be one intuitive but extremely verbose attempt. I have done stuff like that earlier in my R career. There’s no shame in being extremely verbose and redundant if that’s what makes sense. Another way is to think in terms of functions. When we made age = 30 within transmute(), we took a specific age value (i.e., 30) and plugged it into the formula b_negemot:sex + b_negemot:sex:age * i where $$i$$ = 30. And when we made age = 50 we did exactly the same thing but switched out the 30 for a 50. So what we need is a function that will take a range of values for $$i$$, plug them into our b_negemot:sex + b_negemot:sex:age * i formula, and then neatly return the output. A nice base R function for that is sapply().

sapply(15:90, function(i) {
post$b_negemot:sex + post$b_negemot:sex:age * i
}
) %>%
data.frame() %>%
str()
## 'data.frame':    4000 obs. of  76 variables:
##  $X1 : num -0.3971 -0.3357 -0.0531 -0.112 -0.0531 ... ##$ X2 : num  -0.384 -0.3219 -0.0448 -0.1036 -0.0439 ...
##  $X3 : num -0.371 -0.3081 -0.0365 -0.0953 -0.0348 ... ##$ X4 : num  -0.3579 -0.2944 -0.0281 -0.0869 -0.0257 ...
##  $X5 : num -0.3449 -0.2806 -0.0198 -0.0786 -0.0166 ... ##$ X6 : num  -0.33184 -0.26682 -0.01148 -0.07023 -0.00745 ...
##  $X7 : num -0.31879 -0.25306 -0.00316 -0.06187 0.00167 ... ##$ X8 : num  -0.30575 -0.23929 0.00517 -0.05352 0.01079 ...
##  $X9 : num -0.2927 -0.2255 0.0135 -0.0452 0.0199 ... ##$ X10: num  -0.2797 -0.2118 0.0218 -0.0368 0.029 ...
##  $X11: num -0.2666 -0.198 0.0301 -0.0285 0.0382 ... ##$ X12: num  -0.2536 -0.1842 0.0385 -0.0201 0.0473 ...
##  $X13: num -0.2405 -0.1705 0.0468 -0.0118 0.0564 ... ##$ X14: num  -0.22748 -0.1567 0.05511 -0.00341 0.06552 ...
##  $X15: num -0.21443 -0.14293 0.06344 0.00494 0.07464 ... ##$ X16: num  -0.2014 -0.1292 0.0718 0.0133 0.0838 ...
##  $X17: num -0.1883 -0.1154 0.0801 0.0216 0.0929 ... ##$ X18: num  -0.1753 -0.1016 0.0884 0.03 0.102 ...
##  $X19: num -0.1623 -0.0879 0.0967 0.0384 0.1111 ... ##$ X20: num  -0.1492 -0.0741 0.1051 0.0467 0.1202 ...
##  $X21: num -0.1362 -0.0603 0.1134 0.0551 0.1294 ... ##$ X22: num  -0.1231 -0.0466 0.1217 0.0634 0.1385 ...
##  $X23: num -0.1101 -0.0328 0.13 0.0718 0.1476 ... ##$ X24: num  -0.097 -0.019 0.1384 0.0801 0.1567 ...
##  $X25: num -0.08399 -0.00528 0.14668 0.08847 0.16585 ... ##$ X26: num  -0.07094 0.00849 0.15501 0.09682 0.17497 ...
##  $X27: num -0.0579 0.0223 0.1633 0.1052 0.1841 ... ##$ X28: num  -0.0449 0.036 0.1717 0.1135 0.1932 ...
##  $X29: num -0.0318 0.0498 0.18 0.1219 0.2023 ... ##$ X30: num  -0.0188 0.0636 0.1883 0.1302 0.2115 ...
##  $X31: num -0.00572 0.07732 0.19663 0.13858 0.22057 ... ##$ X32: num  0.00733 0.09108 0.20495 0.14693 0.22969 ...
##  $X33: num 0.0204 0.1048 0.2133 0.1553 0.2388 ... ##$ X34: num  0.0334 0.1186 0.2216 0.1636 0.2479 ...
##  $X35: num 0.0465 0.1324 0.2299 0.172 0.2571 ... ##$ X36: num  0.0595 0.1461 0.2383 0.1803 0.2662 ...
##  $X37: num 0.0726 0.1599 0.2466 0.1887 0.2753 ... ##$ X38: num  0.0856 0.1737 0.2549 0.197 0.2844 ...
##  $X39: num 0.0986 0.1874 0.2632 0.2054 0.2935 ... ##$ X40: num  0.112 0.201 0.272 0.214 0.303 ...
##  $X41: num 0.125 0.215 0.28 0.222 0.312 ... ##$ X42: num  0.138 0.229 0.288 0.23 0.321 ...
##  $X43: num 0.151 0.243 0.297 0.239 0.33 ... ##$ X44: num  0.164 0.256 0.305 0.247 0.339 ...
##  $X45: num 0.177 0.27 0.313 0.256 0.348 ... ##$ X46: num  0.19 0.284 0.321 0.264 0.357 ...
##  $X47: num 0.203 0.298 0.33 0.272 0.367 ... ##$ X48: num  0.216 0.311 0.338 0.281 0.376 ...
##  $X49: num 0.229 0.325 0.346 0.289 0.385 ... ##$ X50: num  0.242 0.339 0.355 0.297 0.394 ...
##  $X51: num 0.255 0.353 0.363 0.306 0.403 ... ##$ X52: num  0.268 0.366 0.371 0.314 0.412 ...
##  $X53: num 0.281 0.38 0.38 0.322 0.421 ... ##$ X54: num  0.294 0.394 0.388 0.331 0.43 ...
##  $X55: num 0.307 0.408 0.396 0.339 0.439 ... ##$ X56: num  0.32 0.421 0.405 0.347 0.449 ...
##  $X57: num 0.333 0.435 0.413 0.356 0.458 ... ##$ X58: num  0.346 0.449 0.421 0.364 0.467 ...
##  $X59: num 0.36 0.463 0.43 0.372 0.476 ... ##$ X60: num  0.373 0.477 0.438 0.381 0.485 ...
##  $X61: num 0.386 0.49 0.446 0.389 0.494 ... ##$ X62: num  0.399 0.504 0.455 0.397 0.503 ...
##  $X63: num 0.412 0.518 0.463 0.406 0.512 ... ##$ X64: num  0.425 0.532 0.471 0.414 0.522 ...
##  $X65: num 0.438 0.545 0.48 0.423 0.531 ... ##$ X66: num  0.451 0.559 0.488 0.431 0.54 ...
##  $X67: num 0.464 0.573 0.496 0.439 0.549 ... ##$ X68: num  0.477 0.587 0.505 0.448 0.558 ...
##  $X69: num 0.49 0.6 0.513 0.456 0.567 ... ##$ X70: num  0.503 0.614 0.521 0.464 0.576 ...
##  $X71: num 0.516 0.628 0.53 0.473 0.585 ... ##$ X72: num  0.529 0.642 0.538 0.481 0.595 ...
##  $X73: num 0.542 0.655 0.546 0.489 0.604 ... ##$ X74: num  0.555 0.669 0.555 0.498 0.613 ...
##  $X75: num 0.568 0.683 0.563 0.506 0.622 ... ##$ X76: num  0.581 0.697 0.571 0.514 0.631 ...

Okay, so that looks a little monstrous. What we did in the first argument was tell sapply() which values we’d like to use in some function. We chose each integer ranging from 15 to 90–which, if you do the math, is 76 values. We then told sapply() to plug those values into a custom function, which we defined as function(i) {post$b_negemot:sex + post$b_negemot:sex:age * i}. In our custom function, i was a placeholder for each of those 76 integers. But remember that post has 4,000 rows, each one corresponding to one of the 4,000 posterior iterations. Thus, for each of our 76 i-values, we got 4,000 results. After all that sapply() returned a matrix. Since we like to work within the tidyverse and use ggplot2, we just went ahead and put those results in a tibble.

With our sapply() output in hand, all we need to do is a little more indexing and summarizing and we’re ready to plot. The result is our very own version of Figure 9.7.

sapply(15:90, function(i) post$b_negemot:sex + post$b_negemot:sex:age * i) %>%
data.frame() %>%
set_names(15:90) %>%
pivot_longer(everything()) %>%
mutate(age = as.double(name)) %>%
group_by(age) %>%
mean_qi(value) %>%

ggplot(aes(x = age)) +
geom_hline(yintercept = 0, color = "grey75") +
geom_vline(xintercept = 38.114, color = "grey75") +
geom_ribbon(aes(ymin = .lower, ymax = .upper),
alpha = 1/2) +
geom_line(aes(y = value),
size = 1) +
coord_cartesian(xlim = c(20, 85),
ylim = c(-.25, .75)) +
labs(x = expression(paste("Age, ", italic(Z))),
y = "Conditional Two-way Interaction Between\nNegative Emotions and Sex") +
theme_xkcd()

Or for kicks and giggles, another way to get a clearer sense of how our data informed the shape of the plot, here we replace our geom_ribbon() + geom_line() code with geom_pointrange().

sapply(15:90, function(i) {
post$b_negemot:sex + post$b_negemot:sex:age * i
}) %>%
data.frame() %>%
set_names(15:90) %>%
pivot_longer(everything()) %>%
mutate(age = as.double(name)) %>%
group_by(age) %>%
mean_qi(value) %>%

ggplot(aes(x = age)) +
geom_hline(yintercept = 0, color = "grey75") +
geom_vline(xintercept = 38.114, color = "grey75") +
geom_pointrange(aes(y = value, ymin = .lower, ymax = .upper),
shape = 16, size = 1/3) +
coord_cartesian(xlim = c(20, 85),
ylim = c(-.25, .75)) +
labs(x = expression("Age, "*italic(Z)),
y = "Conditional Two-way Interaction Between\nNegative Emotions and Sex") +
theme_xkcd()

Although I probably wouldn’t try to use a plot like this in a manuscript, I hope it makes clear how the way we’ve been implementing the JN technique is just the pick-a-point approach in bulk. No magic, here.

For all you tidyverse fanatics out there, don’t worry. There are more tidyverse-centric ways to get the plot values than with sapply(). We’ll get to them soon enough. It’s advantageous to have good old base R sapply() up your sleeve, too. And new R users, it’s helpful to know that sapply() is one part of the apply() family of base R functions, which you might learn more about here or here or here.

Now the conditional effect of $$X$$ on $$Y$$ given $$W$$ and $$Z$$ is

$\theta_{X \rightarrow Y} = b_1 + b_4 W + b_5 Z + b_7 WZ.$

In the terms of model9.13 where sex = $$W$$ and age = $$Z$$, we can restate that as

$\theta_{\text{negemot} \rightarrow \text{govact}} = b_\text{negemot} + b_{\text{negemot} \times \text{sex}} \text{sex} + b_{\text{negemot} \times \text{age}} \text{age} + b_{\text{negemot} \times \text{sex} \times \text{age}} \text{sex} \times \text{age}.$

Following Hayes at the bottom of page 341, here is the difference in the effect of negative emotions between men and women among 30 year olds, $$\theta_{XW \rightarrow Y} | (Z = 30)$$.

post %>%
# algebra
mutate(men   = b_negemot + b_negemot:sex * 1 + b_negemot:age * 30 + b_negemot:sex:age * 1 * 30,
women = b_negemot + b_negemot:sex * 0 + b_negemot:age * 30 + b_negemot:sex:age * 0 * 30) %>%
# more algebra
mutate(men - women = men - women) %>%
pivot_longer(men:men - women) %>%
# this just orders the output
mutate(name = factor(name, levels = c("men", "women", "men - women"))) %>%
group_by(name) %>%
mean_qi(value) %>%
mutate_if(is.double, round, digits = 3)
## # A tibble: 3 x 7
##   name        value .lower .upper .width .point .interval
##   <fct>       <dbl>  <dbl>  <dbl>  <dbl> <chr>  <chr>
## 1 men         0.371  0.253  0.492   0.95 mean   qi
## 2 women       0.299  0.195  0.405   0.95 mean   qi
## 3 men - women 0.072 -0.087  0.235   0.95 mean   qi

In contrast, here is $$\theta_{XW \rightarrow Y} | (Z = 50)$$.

post %>%
mutate(men   = b_negemot + b_negemot:sex * 1 + b_negemot:age * 50 + b_negemot:sex:age * 1 * 50,
women = b_negemot + b_negemot:sex * 0 + b_negemot:age * 50 + b_negemot:sex:age * 0 * 50) %>%
mutate(men - women = men - women) %>%
pivot_longer(men:men - women) %>%
mutate(name = factor(name, levels = c("men", "women", "men - women"))) %>%
group_by(name) %>%
mean_qi(value) %>%
mutate_if(is.double, round, digits = 3)
## # A tibble: 3 x 7
##   name        value .lower .upper .width .point .interval
##   <fct>       <dbl>  <dbl>  <dbl>  <dbl> <chr>  <chr>
## 1 men         0.522  0.454  0.592   0.95 mean   qi
## 2 women       0.319  0.243  0.393   0.95 mean   qi
## 3 men - women 0.204  0.105  0.301   0.95 mean   qi

Finally, here we compute $$\theta_{XW \rightarrow Y} | (Z = 70)$$.

post %>%
transmute(men   = b_negemot + b_negemot:sex * 1 + b_negemot:age * 70 + b_negemot:sex:age * 1 * 70,
women = b_negemot + b_negemot:sex * 0 + b_negemot:age * 70 + b_negemot:sex:age * 0 * 70) %>%
mutate(men - women = men - women) %>%
pivot_longer(men:men - women) %>%
mutate(name = factor(name, levels = c("men", "women", "men - women"))) %>%
group_by(name) %>%
mean_qi(value) %>%
mutate_if(is.double, round, digits = 3)
## # A tibble: 3 x 7
##   name        value .lower .upper .width .point .interval
##   <fct>       <dbl>  <dbl>  <dbl>  <dbl> <chr>  <chr>
## 1 men         0.674  0.581  0.766   0.95 mean   qi
## 2 women       0.338  0.202  0.475   0.95 mean   qi
## 3 men - women 0.335  0.171  0.494   0.95 mean   qi

## 9.5 Comparing conditional effects

In this section, [Hayes discussed] a test of the difference between two conditional effects in a moderation model. The method [Hayes described] is called a “slope difference test” by Dawson , who offers some guidelines on how to conduct this test for the moderated moderation model. (p. 342)

As usual, we will follow along, but with alternative methods to the statistical testing paradigm. We will steadfastly continue summarizing and plotting the posterior distribution from various vantage points.

### 9.5.1 Comparing conditional effects in the additive multiple moderation model.

When we’re interested in estimating the difference in the conditional effect of $$X$$ on $$Y$$ (i.e., $$\theta_{X \rightarrow Y}$$) for $$W = w_1$$ and $$Z = z_1$$ versus when $$W = w_2$$ and $$Z = z_2$$, that follows the formula

\begin{align*} \Delta \theta_{X \rightarrow Y} & = (b_1 + b_4 w_1 + b_5 z_1) - (b_1 + b_4 w_2 + b_5 z_2) \\ & = b_4 (w_1 - w_2) + b_5 (z_1 - z_2). \end{align*}

As in other cases, we don’t have to worry about special considerations for computing the standard errors for out Bayesian models. All we need to do is follow the simple algebraic manipulations of the posterior distribution. Because of the correlation structure within the parameters, the uncertainty in the conditional distribution will work itself out.

### 9.5.2 Comparing conditional effects in the moderated moderation model.

We’ll update our formula from last section to

\begin{align*} \Delta \theta_{X \rightarrow Y} & = (b_1 + b_4 w_1 + b_5 z_1 + b_7 w_1 z_1) - (b_1 + b_4 w_2 + b_5 z_2 + b_7 w_2 z_2) \\ & = b_4 (w_1 - w_2) + b_5 (z_1 - z_2) + b_7 (w_1 z_1 - w_2 z_2). \end{align*}

### 9.5.3 Implementation in PROCESS brms.

Since we don’t have the contrast feature automated like in PROCESS, we’ll have to carefully follow the equations above to specify the values properly in R. Here we’ll use the equation in the first line,

$\Delta \theta_{X \rightarrow Y} = (b_1 + b_4 w_1 + b_5 z_1 + b_7 w_1 z_1) - (b_1 + b_4 w_2 + b_5 z_2 + b_7 w_2 z_2).$

w1 <- 1
z1 <- 30
w2 <- 0
z2 <- 50

post %>%
mutate(30-year-old men   = b_negemot + b_negemot:sex * w1 + b_negemot:age * z1 + b_negemot:sex:age * w1 * z1,
50-year-old women = b_negemot + b_negemot:sex * w2 + b_negemot:age * z2 + b_negemot:sex:age * w2 * z2) %>%
mutate(contrast = 30-year-old men - 50-year-old women) %>%
pivot_longer(30-year-old men:contrast) %>%
group_by(name) %>%
summarize(mean = mean(value),
sd   = sd(value),
ll   = quantile(value, .025),
ul   = quantile(value, .975)) %>%
mutate_if(is.double, round, digits = 4)
## # A tibble: 3 x 5
##   name                mean     sd      ll    ul
##   <chr>              <dbl>  <dbl>   <dbl> <dbl>
## 1 30-year-old men   0.371  0.0607  0.253  0.492
## 2 50-year-old women 0.319  0.0375  0.243  0.393
## 3 contrast          0.0523 0.07   -0.0832 0.188

Notice how our posterior $$SD$$ corresponded nicely to the standard error in Hayes’s contrast test. And we didn’t even have to worry about using the frightening Formula 9.21 on page 345. That information was contained in the posterior distribution all along. All we had to do was combine the parameter iterations with a little algebra and then summarize().

For good measure, we’ll compute using the equation in the second line,

$\Delta \theta_{X \rightarrow Y} = b_4 (w_1 - w_2) + b_5 (z_1 - z_2) + b_7 (w_1 z_1 - w_2 z_2).$

post %>%
mutate(contrast = b_negemot:sex * (w1 - w2) + b_negemot:age * (z1 - z2) + b_negemot:sex:age * (w1 * z1 - w2 * z2)) %>%
pivot_longer(contrast) %>%
group_by(name) %>%
summarize(mean = mean(value),
sd   = sd(value),
ll   = quantile(value, .025),
ul   = quantile(value, .975)) %>%
mutate_if(is.double, round, digits = 4)
## # A tibble: 1 x 5
##   name       mean    sd      ll    ul
##   <chr>     <dbl> <dbl>   <dbl> <dbl>
## 1 contrast 0.0523  0.07 -0.0832 0.188

Same results.

## Session info

sessionInfo()
## R version 4.0.4 (2021-02-15)
## Platform: x86_64-apple-darwin17.0 (64-bit)
## Running under: macOS Catalina 10.15.7
##
## Matrix products: default
## BLAS:   /Library/Frameworks/R.framework/Versions/4.0/Resources/lib/libRblas.dylib
## LAPACK: /Library/Frameworks/R.framework/Versions/4.0/Resources/lib/libRlapack.dylib
##
## locale:
## [1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8
##
## attached base packages:
## [1] stats     graphics  grDevices utils     datasets  methods   base
##
## other attached packages:
##  [1] tidybayes_3.0.0 bayesplot_1.8.0 xkcd_0.0.6      extrafont_0.17  brms_2.15.0     Rcpp_1.0.6
##  [7] forcats_0.5.1   stringr_1.4.0   dplyr_1.0.6     purrr_0.3.4     readr_1.4.0     tidyr_1.1.3
## [13] tibble_3.1.2    ggplot2_3.3.5   tidyverse_1.3.1
##
## loaded via a namespace (and not attached):
##   [1] readxl_1.3.1         backports_1.2.1      Hmisc_4.4-2          plyr_1.8.6
##   [5] igraph_1.2.6         svUnit_1.0.3         sp_1.4-4             splines_4.0.4
##   [9] crosstalk_1.1.0.1    TH.data_1.0-10       rstantools_2.1.1     inline_0.3.17
##  [13] digest_0.6.27        htmltools_0.5.1.1    rsconnect_0.8.16     fansi_0.4.2
##  [17] checkmate_2.0.0      magrittr_2.0.1       cluster_2.1.0        modelr_0.1.8
##  [21] RcppParallel_5.0.2   matrixStats_0.57.0   extrafontdb_1.0      xts_0.12.1
##  [25] sandwich_3.0-0       prettyunits_1.1.1    jpeg_0.1-8.1         colorspace_2.0-0
##  [29] rvest_1.0.1          ggdist_3.0.0         haven_2.3.1          xfun_0.23
##  [33] callr_3.7.0          crayon_1.4.1         jsonlite_1.7.2       lme4_1.1-25
##  [37] survival_3.2-10      zoo_1.8-8            glue_1.4.2           gtable_0.3.0
##  [41] emmeans_1.5.2-1      V8_3.4.0             distributional_0.2.2 pkgbuild_1.2.0
##  [45] Rttf2pt1_1.3.8       rstan_2.21.2         abind_1.4-5          scales_1.1.1
##  [49] mvtnorm_1.1-1        DBI_1.1.0            miniUI_0.1.1.1       htmlTable_2.1.0
##  [53] xtable_1.8-4         foreign_0.8-81       Formula_1.2-4        stats4_4.0.4
##  [57] StanHeaders_2.21.0-7 DT_0.16              htmlwidgets_1.5.3    httr_1.4.2
##  [61] threejs_0.3.3        arrayhelpers_1.1-0   RColorBrewer_1.1-2   posterior_1.0.1
##  [65] ellipsis_0.3.2       farver_2.1.0         pkgconfig_2.0.3      loo_2.4.1
##  [69] nnet_7.3-15          sass_0.3.1           dbplyr_2.1.1         utf8_1.2.1
##  [73] labeling_0.4.2       tidyselect_1.1.1     rlang_0.4.11         reshape2_1.4.4
##  [77] later_1.2.0          munsell_0.5.0        cellranger_1.1.0     tools_4.0.4
##  [81] cli_3.0.1            generics_0.1.0       broom_0.7.6          ggridges_0.5.3
##  [85] evaluate_0.14        fastmap_1.1.0        processx_3.5.2       knitr_1.33
##  [89] fs_1.5.0             nlme_3.1-152         mime_0.10            projpred_2.0.2
##  [93] xml2_1.3.2           compiler_4.0.4       shinythemes_1.1.2    rstudioapi_0.13
##  [97] png_0.1-7            gamm4_0.2-6          curl_4.3             reprex_2.0.0
## [101] statmod_1.4.35       bslib_0.2.4          stringi_1.6.2        highr_0.9
## [105] ps_1.6.0             Brobdingnag_1.2-6    lattice_0.20-41      Matrix_1.3-2
## [109] nloptr_1.2.2.2       markdown_1.1         tensorA_0.36.2       shinyjs_2.0.0
## [113] vctrs_0.3.8          pillar_1.6.1         lifecycle_1.0.0      jquerylib_0.1.4
## [117] bridgesampling_1.0-0 estimability_1.3     data.table_1.14.0    raster_3.4-5
## [121] httpuv_1.6.0         latticeExtra_0.6-29  R6_2.5.0             bookdown_0.22
## [125] promises_1.2.0.1     gridExtra_2.3        codetools_0.2-18     boot_1.3-26
## [129] colourpicker_1.1.0   MASS_7.3-53          gtools_3.8.2         assertthat_0.2.1
## [133] withr_2.4.2          shinystan_2.5.0      multcomp_1.4-16      mgcv_1.8-33
## [137] parallel_4.0.4       hms_1.1.0            rpart_4.1-15         grid_4.0.4
## [141] coda_0.19-4          minqa_1.2.4          rmarkdown_2.8        shiny_1.6.0
## [145] lubridate_1.7.10     base64enc_0.1-3      dygraphs_1.1.1.6

### References

Aiken, L. S., & West, S. G. (1991). Multiple regression: Testing and interpreting interactions. SAGE.
Bürkner, P.-C. (2021b). Estimating multivariate models with brms. https://CRAN.R-project.org/package=brms/vignettes/brms_multivariate.html
Bürkner, P.-C. (2021c). Handle missing values with brms. https://CRAN.R-project.org/package=brms/vignettes/brms_missings.html
Chang, W. (2014). Extrafont: Tools for using fonts. https://cran.r-project.org/package=extrafont/
Cohen, J., Cohen, P., West, S. G., & Aiken, L. S. (2003). Applied multiple regression/correlation analysis for the behavioral sciences (3rd edition). Lawrence Erlbaum Associates. https://doi.org/10.4324/9780203774441
Dawson, J. F. (2014). Moderation in management research: What, why, when, and how. Journal of Business and Psychology, 29(1), 1–19. https://doi.org/10.1007/s10869-013-9308-7
Dawson, J. F., & Richter, A. W. (2006). Probing three-way interactions in moderated multiple regression: Development and application of a slope difference test. Journal of Applied Psychology, 91(4), 917–926. https://doi.org/10.1037/0021-9010.91.4.917
Enders, C. K. (2010). Applied missing data analysis. Guilford press. http://www.appliedmissingdata.com/
Gabry, J., & Modrák, M. (2020). Visual MCMC diagnostics using the bayesplot package. https://CRAN.R-project.org/package=bayesplot/vignettes/visual-mcmc-diagnostics.html
Hayes, Andrew F. (2018). Introduction to mediation, moderation, and conditional process analysis: A regression-based approach (Second edition). The Guilford Press. https://www.guilford.com/books/Introduction-to-Mediation-Moderation-and-Conditional-Process-Analysis/Andrew-Hayes/9781462534654
Kurz, A. S. (2021). Statistical rethinking with brms, ggplot2, and the tidyverse: Second Edition (version 0.2.0). https://bookdown.org/content/4857/
Kurz, A. S. (2020b). Statistical rethinking with brms, ggplot2, and the tidyverse (version 1.2.0). https://doi.org/10.5281/zenodo.3693202
McElreath, R. (2020). Statistical rethinking: A Bayesian course with examples in R and Stan (Second Edition). CRC Press. https://xcelab.net/rm/statistical-rethinking/
McElreath, R. (2015). Statistical rethinking: A Bayesian course with examples in R and Stan. CRC press. https://xcelab.net/rm/statistical-rethinking/
Stan Development Team. (2021). Stan user’s guide, Version 2.26. https://mc-stan.org/docs/2_26/stan-users-guide/index.html
Torres-Manzanera, E. (n.d.). Xkcd: Plotting XKCD graphs. Journal of Statistical Software, 12. https://cran.r-project.org/package=xkcd/vignettes/xkcd-intro.pdf
Torres-Manzanera, E. (2018). Xkcd: Plotting ggplot2 graphics in an XKCD style. https://CRAN.R-project.org/package=xkcd
van Buuren, S. (2018). Flexible imputation of missing data (Second Edition). CRC Press. https://stefvanbuuren.name/fimd/
Vehtari, A., Gelman, A., Simpson, D., Carpenter, B., & Bürkner, P.-C. (2019). Rank-normalization, folding, and localization: An improved $$\widehat R$$ for assessing convergence of MCMC. arXiv Preprint arXiv:1903.08008. https://arxiv.org/abs/1903.08008?