# 3 The Simple Mediation Model

Hayes closed the opening with:

Whereas answering questions about

whenorfor whomis the domain of moderation analysis, questions that ask abouthowpertain tomediation, the focus of this and the next three chapters. In this chapter, [we’ll explore] thesimple mediation modeland illustrate using~~OLS regression-based~~Bayesian path analysis how the effect of an antecedent variable \(X\) on some final consequent \(Y\) can be partitioned into two paths of influence,directandindirect. (Andrew F. Hayes, 2018, p. 78,emphasisin the original)

## 3.1 The simple mediation model

Mediation analysis is a statistical method used to evaluate evidence from studies designed to test hypotheses about how some causal antecedent variable \(X\) transmits its effect on a consequent variable \(Y\).

When thinking about whether a phenomenon or theory you are studying could be conceptualized as a mediation process, it is important to keep in mind that mediation is ultimately a causal explanation. It is assumed that the relationships in the system are causal, and, importantly, that \(M\) is causally located

between\(X\) and \(Y\). It must be assumed, if not also empirically substantiated, that \(X\) causes \(M\), which in turn causes \(Y\). \(M\) cannot possibly carry \(X\)’s effect on \(Y\) if \(M\) is not located causally between \(X\) and \(Y\). (pp. 78–81,emphasisin the original)

## 3.2 Estimation of the direct, indirect, and total effects of \(X\)

Given the simple three-term mediation model, the statistical model is expressed in the two equations

\[\begin{align*} M & = i_M + a X + e_M \\ Y & = i_Y + c' X + b M + e_Y. \end{align*}\]

When using OLS software, as Hayes promotes throughout the text, these equations are estimated sequentially. However, the **brms** package has multivariate capabilities. As such, our results will be from a Bayesian multivariate model that simultaneously computes both equations at once. They are both part of a joint model. And when we consider more advanced models later in the text, our multivariate models will fit even more than two equations at once. None of this is a problem for **brms**.

## 3.3 Example with dichotomous \(X\): The influence of presumed media influence

Here we load a couple necessary packages, load the data, and take a peek.

```
library(tidyverse)
<- read_csv("data/pmi/pmi.csv")
pmi
glimpse(pmi)
```

```
## Rows: 123
## Columns: 6
## $ cond <dbl> 1, 0, 1, 0, 0, 0, 0, 1, 0, 0, 1, 0, 1, 0, 1, 1, 0, 0, 1, 0, 1, 1, 1, 1, 1, 0, 1, 0, 1, 0, 0…
## $ pmi <dbl> 7.0, 6.0, 5.5, 6.5, 6.0, 5.5, 3.5, 6.0, 4.5, 7.0, 1.0, 6.0, 5.0, 7.0, 7.0, 7.0, 4.5, 3.5, 7…
## $ import <dbl> 6, 1, 6, 6, 5, 1, 1, 6, 6, 6, 3, 3, 4, 7, 1, 6, 3, 3, 2, 4, 4, 6, 7, 4, 5, 4, 6, 5, 5, 7, 4…
## $ reaction <dbl> 5.25, 1.25, 5.00, 2.75, 2.50, 1.25, 1.50, 4.75, 4.25, 6.25, 1.25, 2.75, 3.75, 5.00, 4.00, 5…
## $ gender <dbl> 1, 1, 1, 0, 1, 1, 0, 1, 1, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 1, 1, 0, 1, 0, 0, 0, 1, 0…
## $ age <dbl> 51.0, 40.0, 26.0, 21.0, 27.0, 25.0, 23.0, 25.0, 22.0, 24.0, 22.0, 21.0, 23.0, 21.0, 22.0, 2…
```

You can get the male/female split like so.

```
%>%
pmi count(gender)
```

```
## # A tibble: 2 x 2
## gender n
## <dbl> <int>
## 1 0 80
## 2 1 43
```

Here is the split by `condition`

.

```
%>%
pmi count(cond)
```

```
## # A tibble: 2 x 2
## cond n
## <dbl> <int>
## 1 0 65
## 2 1 58
```

Here is how to get the ungrouped mean and \(SD\) values for `reaction`

and `pmi`

, as presented in Table 3.1.

```
%>%
pmi pivot_longer(c(reaction, pmi)) %>%
group_by(name) %>%
summarise(mean = mean(value),
sd = sd(value)) %>%
mutate_if(is.double, round, digits = 3)
```

```
## # A tibble: 2 x 3
## name mean sd
## <chr> <dbl> <dbl>
## 1 pmi 5.60 1.32
## 2 reaction 3.48 1.55
```

You might get the mean and \(SD\) values for `reaction`

and `pmi`

grouped by `cond`

like this.

```
%>%
pmi pivot_longer(c(reaction, pmi)) %>%
group_by(cond, name) %>%
summarise(mean = mean(value),
sd = sd(value)) %>%
mutate_if(is.double, round, digits = 3)
```

```
## # A tibble: 4 x 4
## # Groups: cond [2]
## cond name mean sd
## <dbl> <chr> <dbl> <dbl>
## 1 0 pmi 5.38 1.34
## 2 0 reaction 3.25 1.61
## 3 1 pmi 5.85 1.27
## 4 1 reaction 3.75 1.45
```

Let’s load our primary statistical package, **brms**.

`library(brms)`

Before we begin, I should acknowledge that I greatly benefited by this great blog post on path analysis in brms by Jarrett Byrnes. With **brms**, we handle mediation models using the multivariate syntax (Bürkner, 2021b). There are a few ways to do this. Let’s start simple.

If you look at the path model in Figure 3.3, you’ll note that `reaction`

is predicted by `pmi`

and `cond`

. `pmi`

, in turn, is predicted solely by `cond`

. So we have two regression models, which is just the kind of thing the **brms** multivariate syntax is for. So first let’s specify both models, which we’ll nest in `bf()`

functions and save as objects.

```
<- bf(reaction ~ 1 + pmi + cond)
y_model <- bf(pmi ~ 1 + cond) m_model
```

Now we have our `bf()`

objects in hand, we’ll combine them with the `+`

operator within the `brm()`

function. We’ll also specify `set_rescor(FALSE)`

–we’re not interested in adding a residual correlation between `reaction`

and `pmi`

.

```
.1 <-
model3brm(data = pmi,
family = gaussian,
+ m_model + set_rescor(FALSE),
y_model cores = 4,
file = "fits/model03.01")
```

Here are our results.

`print(model3.1)`

```
## Family: MV(gaussian, gaussian)
## Links: mu = identity; sigma = identity
## mu = identity; sigma = identity
## Formula: reaction ~ 1 + pmi + cond
## pmi ~ 1 + cond
## Data: pmi (Number of observations: 123)
## Samples: 4 chains, each with iter = 2000; warmup = 1000; thin = 1;
## total post-warmup samples = 4000
##
## Population-Level Effects:
## Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS
## reaction_Intercept 0.52 0.56 -0.58 1.62 1.00 4994 2990
## pmi_Intercept 5.38 0.16 5.05 5.68 1.00 6001 2896
## reaction_pmi 0.51 0.10 0.31 0.70 1.00 5281 2959
## reaction_cond 0.25 0.26 -0.26 0.77 1.00 6398 3090
## pmi_cond 0.48 0.23 0.03 0.94 1.00 6409 2711
##
## Family Specific Parameters:
## Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS
## sigma_reaction 1.41 0.09 1.24 1.60 1.00 4435 2908
## sigma_pmi 1.32 0.09 1.16 1.50 1.00 6602 3028
##
## Samples were drawn using sampling(NUTS). For each parameter, Bulk_ESS
## and Tail_ESS are effective sample size measures, and Rhat is the potential
## scale reduction factor on split chains (at convergence, Rhat = 1).
```

If you compare our model summary with the coefficients in the path model in Figure 3.3, you’ll see our coefficients are the same. The **brms** summary also includes intercepts and residual variances, which are typically omitted in path diagrams even though they’re still part of the model.

If you’re getting lost in all the model output, try taking out the constant and error terms.

`fixef(model3.1)[3:5, ] %>% round(digits = 3)`

```
## Estimate Est.Error Q2.5 Q97.5
## reaction_pmi 0.507 0.100 0.308 0.703
## reaction_cond 0.253 0.261 -0.257 0.772
## pmi_cond 0.481 0.234 0.025 0.942
```

In his Table 3.2, Hayes included the \(R^2\) values. Here are ours.

`bayes_R2(model3.1) %>% round(digits = 3)`

```
## Estimate Est.Error Q2.5 Q97.5
## R2reaction 0.210 0.057 0.099 0.319
## R2pmi 0.039 0.031 0.000 0.113
```

It’s worth it to actually plot the \(R^2\) distributions. We’ll take our color palette from the **ggthemes** package (Arnold, 2021).

```
library(ggthemes)
bayes_R2(model3.1, summary = F) %>%
data.frame() %>%
gather() %>%
ggplot(aes(x = value, fill = key)) +
geom_density(color = "transparent", alpha = 2/3) +
scale_fill_colorblind() + # we got this color palette from the ggthemes package
coord_cartesian(xlim = 0:1) +
labs(title = expression(paste("The ", italic("R")^{2}, " distributions for model3.1")),
x = NULL) +
theme_classic()
```

We went through the trouble of plotting the \(R^2\) distributions because it’s useful to understand that they won’t often be symmetric when they’re near their logical boundaries (i.e., 0 and 1). This is where asymmetric Bayesian credible intervals can really shine.

Let’s get down to business and examine the indirect effect, the \(ab\) pathway. In our model,

- \(a\) =
`pmi_cond`

and - \(b\) =
`reaction_pmi`

.

You can isolate them with `fixef()[i, ]`

.

`fixef(model3.1)[5, ]`

```
## Estimate Est.Error Q2.5 Q97.5
## 0.48067590 0.23440092 0.02533352 0.94154760
```

`fixef(model3.1)[3, ]`

```
## Estimate Est.Error Q2.5 Q97.5
## 0.5071028 0.1000656 0.3084225 0.7034410
```

So the naive approach would be to just multiply them.

`fixef(model3.1)[5, ] * fixef(model3.1)[3, ]) %>% round(digits = 3) (`

```
## Estimate Est.Error Q2.5 Q97.5
## 0.244 0.023 0.008 0.662
```

Now, this does get us the correct ‘Estimate’ (i.e., posterior mean). However, the posterior \(SD\) and 95% intervals are off. If you want to do this properly, you need to work with the poster samples themselves. We do that with the `posterior_samples()`

function.

```
<- posterior_samples(model3.1)
post
glimpse(post)
```

```
## Rows: 4,000
## Columns: 8
## $ b_reaction_Intercept <dbl> 0.36559647, 0.59759057, 0.76109870, 0.79578161, -0.51777597, 1.62493958, 0.7192…
## $ b_pmi_Intercept <dbl> 5.312075, 5.366714, 5.313554, 5.351762, 5.505363, 5.583854, 5.617829, 5.117164,…
## $ b_reaction_pmi <dbl> 0.5131299, 0.4871675, 0.4702092, 0.4459453, 0.6967265, 0.2983430, 0.3938866, 0.…
## $ b_reaction_cond <dbl> 0.28194792, 0.13524704, 0.32367895, 0.23891725, 0.35143446, 0.19791049, 0.87464…
## $ b_pmi_cond <dbl> 0.42154009, 0.53567033, 0.96725973, 0.63031509, 0.29318933, 0.09468811, 0.04072…
## $ sigma_reaction <dbl> 1.415145, 1.379133, 1.381677, 1.294263, 1.603548, 1.173935, 1.415440, 1.358086,…
## $ sigma_pmi <dbl> 1.355619, 1.208127, 1.403995, 1.394962, 1.181342, 1.481303, 1.284731, 1.330499,…
## $ lp__ <dbl> -427.1174, -427.2585, -429.8131, -427.9630, -432.0138, -436.5891, -431.7727, -4…
```

Here we compute the indirect effect, `ab`

.

```
<-
post %>%
post mutate(ab = b_pmi_cond * b_reaction_pmi)
```

Now we have `ab`

as a properly computed vector, we can summarize it with the `quantile()`

function.

```
quantile(post$ab, probs = c(.5, .025, .975)) %>%
round(digits = 3)
```

```
## 50% 2.5% 97.5%
## 0.234 0.011 0.519
```

And we can even visualize it as a density.

```
%>%
post ggplot(aes(x = ab)) +
geom_density(color = "transparent",
fill = colorblind_pal()(3)[3]) +
scale_y_continuous(NULL, breaks = NULL) +
labs(title = expression(paste("Our indirect effect, the ", italic("ab"), " pathway")),
x = NULL) +
theme_classic()
```

It’s also worth pointing out that as the indirect effect isn’t perfectly symmetric, its mean and median aren’t quite the same.

```
%>%
post summarize(mean = mean(ab),
median = median(ab)) %>%
round(digits = 3)
```

```
## mean median
## 1 0.243 0.234
```

Their magnitudes are similar, but this asymmetry will be a source of contrast to our estimates and the OLS estimates Hayes reported in the text. This is also something to consider when reporting on central tendency. When the indirect effect–or any other parameter, for that matter–is quite asymmetric, you might prefer reporting the median rather than the mean.

On page 90, Hayes computed the *adjusted means* for \(Y\). For both `cond == 1`

and `cond == 0`

, he computed the expected values for `reaction`

when `pmi`

was at its mean. A natural way to do that in **brms** is with `fitted()`

. First we’ll put our input values for `cond`

and `pmi`

in a tibble, which we’ll call `nd`

. Then we’ll feed `nd`

into the `newdata`

argument within the `fitted()`

function.

```
<-
nd tibble(cond = 1:0,
pmi = mean(pmi$pmi))
fitted(model3.1, newdata = nd)
```

```
## , , reaction
##
## Estimate Est.Error Q2.5 Q97.5
## [1,] 3.617955 0.1860720 3.269416 3.991500
## [2,] 3.365036 0.1778951 3.014780 3.729252
##
## , , pmi
##
## Estimate Est.Error Q2.5 Q97.5
## [1,] 5.856277 0.1750840 5.514630 6.196228
## [2,] 5.375601 0.1617578 5.051166 5.683241
```

Because `model3.1`

is a multivariate model, `fitted()`

returned the model-implied summaries for both `reaction`

and `pmi`

. If you just want the adjusted means for `reaction`

, you can use the `resp`

argument within `fitted()`

.

`fitted(model3.1, newdata = nd, resp = "reaction") %>% round(digits = 3)`

```
## Estimate Est.Error Q2.5 Q97.5
## [1,] 3.618 0.186 3.269 3.991
## [2,] 3.365 0.178 3.015 3.729
```

Note how this is where the two values in the \(Y\) adjusted column in Table 3.1 came from.

However, if we want to reproduce how Hayes computed the total effect (i.e., \(c' + ab\)), we’ll need to work with the posterior draws themselves, `post`

. Recall, we’ve already saved the indirect effect as a vector, `ab`

. The direct effect, \(c'\), is labeled `b_reaction_cond`

within `post`

. To get the total effect, \(c\), all we need to is add those vectors together.

```
<-
post %>%
post mutate(total_effect = b_reaction_cond + ab)
```

Here are the posterior mean with its quantile-based 95% intervals.

```
%>%
post summarize(mean = mean(total_effect),
ll = quantile(total_effect, prob = .025),
ul = quantile(total_effect, prob = .975))
```

```
## mean ll ul
## 1 0.4963361 -0.05526613 1.044788
```

### 3.3.1 ~~Estimation of the model in PROCESS for SPSS and SAS.~~

Nothing new for us, here.

## 3.4 Statistical inference

Our approach will not match up neatly with Hayes’s on this topic.

### 3.4.1 Inference about the total effect of \(X\) on \(Y\).

As we mentioned in Chapter 2, we can indeed focus on rejecting \(H_0\) when using Bayesian statistics. I, however, am not a fan of that approach and I will not be focusing on Bayesian \(p\)-values. But throughout this project, we will make great efforts to express the (un)certainty in our models with various plots of posterior distributions and summary statistics, such as measures of central tendency (e.g., means) and spread (e.g., percentile-based 95% intervals).

So instead of \(t\)- and \(p\)-values for \(c'\), we are going to focus on the distribution. We already gave the mean and 95% intervals, above. Here’s a look at the density.

```
%>%
post ggplot(aes(x = total_effect)) +
geom_density(color = "transparent",
fill = colorblind_pal()(3)[2]) +
scale_y_continuous(NULL, breaks = NULL) +
xlab(expression(paste(italic(c)," (i.e., the total effect)"))) +
theme_classic()
```

### 3.4.2 Inference about the direct effect of \(X\) on \(Y\).

Like in the last section, we will just look at the posterior distribution for the direct effect (i.e., \(c'\), `b_reaction_cond`

).

```
%>%
post ggplot(aes(x = b_reaction_cond)) +
geom_density(color = "transparent",
fill = colorblind_pal()(4)[4]) +
geom_vline(xintercept = 0, color = "white", linetype = 2) +
scale_y_continuous(NULL, breaks = NULL) +
labs(title = expression(paste("Yep, 0 is a credible value for ", italic("c"), ".")),
x = NULL) +
theme_classic()
```

If you wanted to quantify what proportion of the density was less than 0, you could execute something like this.

```
%>%
post summarize(proportion_below_zero = mean(b_reaction_cond < 0))
```

```
## proportion_below_zero
## 1 0.1605
```

This is something like a Bayesian \(p\)-value. But of course, you could always just look at the posterior intervals.

`posterior_interval(model3.1)["b_reaction_cond", ]`

```
## 2.5% 97.5%
## -0.2574458 0.7719065
```

### 3.4.3 Inference about the indirect of \(X\) on \(Y\) through \(M\).

The indirect effect quantifies how much two cases that differ by a unit on \(X\) are estimated to differ on \(Y\) as a result of \(X\)’s influence on \(M\), which in turn influences \(Y\). The indirect effect is relevant as to [what extent] \(X\)’s effect on \(Y\) can be said to be transmitted through the mechanism represented by the \(X \rightarrow M \rightarrow Y\) causal chain of events. (p. 95)

#### 3.4.3.1 The normal theory approach.

This is not our approach.

#### 3.4.3.2 Bootstrap confidence interval.

This is not our approach.

However, Markov chain Monte Carlo (MCMC) methods are iterative and share some characteristics with boostrapping. On page 98, Hayes outlined 6 steps for constructing the \(ab\) bootstrap confidence interval. Here are our responses to those steps w/r/t Bayes with MCMC–or in our case HMC (i.e., Hamiltonian Monte Carlo).

If HMC or MCMC, in general, are new to you, you might check out this lecture or the *Stan Reference Manual* if you’re more technically oriented.

Anyway, Hayes’s 6 steps:

##### 3.4.3.2.1 Step 1.

With HMC we do not take random samples of the data themselves. Rather, we take random draws from the posterior distribution. The posterior distribution is the joint probability distribution of our model.

##### 3.4.3.2.2 Step 2.

After we fit our model with the `brm()`

function and save our posterior draws in a data frame (i.e., `post <- posterior_samples(my_model_fit)`

), we then make a new column (a.k.a. vector, variable) that is the product of our coefficients for the \(a\) and \(b\) pathways. In the example above, this looked like `post %>% mutate(ab = b_pmi_cond * b_reaction_pmi)`

. Let’s take a look at those columns.

```
%>%
post select(b_pmi_cond, b_reaction_pmi, ab) %>%
slice(1:10)
```

```
## b_pmi_cond b_reaction_pmi ab
## 1 0.42154009 0.5131299 0.21630483
## 2 0.53567033 0.4871675 0.26096120
## 3 0.96725973 0.4702092 0.45481446
## 4 0.63031509 0.4459453 0.28108606
## 5 0.29318933 0.6967265 0.20427277
## 6 0.09468811 0.2983430 0.02824953
## 7 0.04072477 0.3938866 0.01604094
## 8 0.90712988 0.6157761 0.55858887
## 9 0.34774045 0.3385149 0.11771532
## 10 0.33193994 0.4354419 0.14454057
```

Our data frame, `post`

, has 4,000 rows. Why 4,000? By default, **brms** runs 4 HMC chains. Each chain has 2,000 iterations, 1,000 of which are warmups, which we always discard. As such, there are 1,000 good iterations left in each chain and \(1{,}000 \times 4 = 4{,}000\). We can change these defaults as needed.

Each row in `post`

contains the parameter values based on one of those draws. And again, these are draws from the posterior distribution. They are not draws from the data.

##### 3.4.3.2.3 Step 3.

We don’t refit the model \(k\) times based on the samples from the data. We take a number of draws from the posterior distribution. Hayes likes to take 5,000 samples when he bootstraps. Happily, that number is quite similar to our default 4,000 HMC draws. Whether 5,000, 4,000 or 10,000, these are all large enough numbers that the distributions become fairly stable. With HMC, however, you might want to increase the number of iterations if either measure of effective sample size, ‘Bulk_ESS’ and ‘Tail_ESS’ in the `print()`

output, are substantially smaller than the number of iterations.

##### 3.4.3.2.4 Step 4.

When we use the `quantile()`

function to compute our Bayesian credible intervals, we’ve sorted. Conceptually, we’ve done this.

```
%>%
post select(ab) %>%
arrange(ab) %>%
slice(1:10)
```

```
## ab
## 1 -0.17350242
## 2 -0.14797175
## 3 -0.11911678
## 4 -0.11519001
## 5 -0.10195975
## 6 -0.09700267
## 7 -0.09497020
## 8 -0.09417739
## 9 -0.08930287
## 10 -0.07417596
```

##### 3.4.3.2.5 Step 5.

Yes, this is what we do, too.

```
<- 95
ci
0.5 * (100 - ci)
```

`## [1] 2.5`

##### 3.4.3.2.6 Step 6.

This is also what we do.

```
<- 95
ci
100 - 0.5 * (100 - ci)) (
```

`## [1] 97.5`

Also, notice the headers in the rightmost two columns in our `posterior_summary()`

output:

`posterior_summary(model3.1)`

```
## Estimate Est.Error Q2.5 Q97.5
## b_reaction_Intercept 0.5244362 0.56344986 -0.58443361 1.6188593
## b_pmi_Intercept 5.3756008 0.16175784 5.05116562 5.6832413
## b_reaction_pmi 0.5071028 0.10006562 0.30842255 0.7034410
## b_reaction_cond 0.2529186 0.26139163 -0.25744582 0.7719065
## b_pmi_cond 0.4806759 0.23440092 0.02533352 0.9415476
## sigma_reaction 1.4058526 0.09254048 1.24092563 1.6038985
## sigma_pmi 1.3155903 0.08528014 1.16071637 1.4959238
## lp__ -429.7015030 1.85596449 -434.10783413 -426.9886428
```

Those .025 and .975 quantiles from above are just what **brms** is giving us in our 95% Bayesian credible intervals.

Here’s our version of Figure 3.5.

```
# these will come in handy in the subtitle
<- quantile(post$ab, probs = .025) %>% round(digits = 3)
ll <- quantile(post$ab, probs = .975) %>% round(digits = 3)
ul
%>%
post
ggplot(aes(x = ab)) +
geom_histogram(color = "white", size = .25,
fill = colorblind_pal()(5)[5],
binwidth = .025, boundary = 0) +
geom_vline(xintercept = quantile(post$ab, probs = c(.025, .975)),
linetype = 3, color = colorblind_pal()(6)[6]) +
labs(subtitle = str_c("95% of the posterior draws are between ", ll, " and ", ul),
x = expression(Indirect~effect~(italic(ab))),
y = "Frequency in 4,000 HMC posterior draws") +
theme_classic()
```

Again, as Hayes discussed how to specify different types of intervals in PROCESS on page 102, you can ask for different kinds of intervals in your `print()`

or `summary()`

output with the `probs`

argument, just as you can with `quantile()`

when working directly with the posterior draws.

Hayes discussed setting the seed in PROCESS on page 104. You can do this with the `seed`

argument in the `brm()`

function, too.

#### 3.4.3.3 Alternative “asymmetric” confidence interval approaches.

This section does not quite refer to us. I’m a little surprised Hayes didn’t at least dedicate a paragraph or two on Bayesian estimation. Sure, he mentioned Monte Carlo, but not within the context of Bayes. So it goes. But if you’re interested, you can read about Bayesian intervals for mediation models in Yuan and MacKinnon’s (2009) *Bayesian mediation analysis*. And yes, Hayes is aware of this. He has cited it in his work (e.g., Andrew F. Hayes, 2015).

## 3.5 An example with continuous \(X\): Economic stress among small-business owners

Here’s the `estress`

data.

```
<- read_csv("data/estress/estress.csv")
estress
glimpse(estress)
```

```
## Rows: 262
## Columns: 7
## $ tenure <dbl> 1.67, 0.58, 0.58, 2.00, 5.00, 9.00, 0.00, 2.50, 0.50, 0.58, 9.00, 1.92, 2.00, 1.42, 0.92, 2…
## $ estress <dbl> 6.0, 5.0, 5.5, 3.0, 4.5, 6.0, 5.5, 3.0, 5.5, 6.0, 5.5, 4.0, 3.0, 2.5, 3.5, 6.0, 4.0, 6.0, 3…
## $ affect <dbl> 2.60, 1.00, 2.40, 1.16, 1.00, 1.50, 1.00, 1.16, 1.33, 3.00, 3.00, 2.00, 1.83, 1.16, 1.16, 1…
## $ withdraw <dbl> 3.00, 1.00, 3.66, 4.66, 4.33, 3.00, 1.00, 1.00, 2.00, 4.00, 4.33, 1.00, 5.00, 1.66, 4.00, 1…
## $ sex <dbl> 1, 0, 1, 1, 1, 1, 0, 0, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 0, 0, 0, 1, 1, 1, 0, 1, 0, 0, 0, 1, 0…
## $ age <dbl> 51, 45, 42, 50, 48, 48, 51, 47, 40, 43, 57, 36, 33, 29, 33, 48, 40, 45, 37, 42, 54, 57, 37,…
## $ ese <dbl> 5.33, 6.05, 5.26, 4.35, 4.86, 5.05, 3.66, 6.13, 5.26, 4.00, 2.53, 6.60, 5.20, 5.66, 5.66, 5…
```

The model set up is just like before. There are no complications switching from a binary \(X\) variable to a continuous one.

```
<- bf(withdraw ~ 1 + estress + affect)
y_model <- bf(affect ~ 1 + estress) m_model
```

With our `y_model`

and `m_model`

defined, we’re ready to fit.

```
.2 <-
model3brm(data = estress,
family = gaussian,
+ m_model + set_rescor(FALSE),
y_model cores = 4,
file = "fits/model03.02")
```

Let’s take a look at the results.

`print(model3.2, digits = 3)`

```
## Family: MV(gaussian, gaussian)
## Links: mu = identity; sigma = identity
## mu = identity; sigma = identity
## Formula: withdraw ~ 1 + estress + affect
## affect ~ 1 + estress
## Data: estress (Number of observations: 262)
## Samples: 4 chains, each with iter = 2000; warmup = 1000; thin = 1;
## total post-warmup samples = 4000
##
## Population-Level Effects:
## Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS
## withdraw_Intercept 1.453 0.256 0.958 1.961 1.002 6901 2868
## affect_Intercept 0.802 0.145 0.520 1.088 1.000 6500 2875
## withdraw_estress -0.079 0.053 -0.180 0.025 1.001 5371 3164
## withdraw_affect 0.771 0.105 0.571 0.976 1.002 6153 2923
## affect_estress 0.172 0.030 0.115 0.231 1.000 6316 2920
##
## Family Specific Parameters:
## Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS
## sigma_withdraw 1.139 0.051 1.046 1.245 1.002 6161 2605
## sigma_affect 0.685 0.030 0.628 0.747 1.000 7455 2883
##
## Samples were drawn using sampling(NUTS). For each parameter, Bulk_ESS
## and Tail_ESS are effective sample size measures, and Rhat is the potential
## scale reduction factor on split chains (at convergence, Rhat = 1).
```

The ‘Rhat,’ ‘Bulk_ESS,’ and ‘Tail_ESS’ values look great. Happily, the values in our summary cohere well with those Hayes reported in Table 3.5. Here are our \(R^2\) values.

`bayes_R2(model3.2)`

```
## Estimate Est.Error Q2.5 Q97.5
## R2withdraw 0.1830694 0.03888225 0.10989488 0.2607779
## R2affect 0.1161616 0.03467365 0.05349846 0.1866342
```

These are also quite similar to those in the text. Here’s our indirect effect.

```
# putting the posterior draws into a data frame
<- posterior_samples(model3.2)
post
# computing the ab coefficient with multiplication
<-
post %>%
post mutate(ab = b_affect_estress*b_withdraw_affect)
# getting the posterior median and 95% intervals with `quantile()`
quantile(post$ab, probs = c(.5, .025, .975)) %>% round(digits = 3)
```

```
## 50% 2.5% 97.5%
## 0.131 0.080 0.195
```

We can visualize its shape, median, and 95% intervals in a density plot.

```
%>%
post ggplot(aes(x = ab)) +
geom_density(color = "transparent",
fill = colorblind_pal()(7)[7]) +
geom_vline(xintercept = quantile(post$ab, probs = c(.025, .5, .975)),
color = "white", linetype = c(2, 1, 2), size = c(.5, .8, .5)) +
scale_x_continuous(breaks = quantile(post$ab, probs = c(.025, .5, .975)),
labels = quantile(post$ab, probs = c(.025, .5, .975)) %>% round(2) %>% as.character()) +
scale_y_continuous(NULL, breaks = NULL) +
labs(title = expression(Behold~our~italic("ab")*'!'),
x = NULL) +
theme_classic()
```

Here’s \(c'\), the direct effect of `esterss`

predicting `withdraw`

.

`posterior_summary(model3.2)["b_withdraw_estress", ]`

```
## Estimate Est.Error Q2.5 Q97.5
## -0.07905861 0.05277480 -0.18040443 0.02491451
```

It has wide flapping intervals which do straddle zero. A little addition will give us the direct effect, \(c\).

```
<-
post %>%
post mutate(c = b_withdraw_estress + ab)
quantile(post$c, probs = c(.5, .025, .975)) %>% round(digits = 3)
```

```
## 50% 2.5% 97.5%
## 0.054 -0.054 0.162
```

## Session info

`sessionInfo()`

```
## R version 4.0.4 (2021-02-15)
## Platform: x86_64-apple-darwin17.0 (64-bit)
## Running under: macOS Catalina 10.15.7
##
## Matrix products: default
## BLAS: /Library/Frameworks/R.framework/Versions/4.0/Resources/lib/libRblas.dylib
## LAPACK: /Library/Frameworks/R.framework/Versions/4.0/Resources/lib/libRlapack.dylib
##
## locale:
## [1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8
##
## attached base packages:
## [1] stats graphics grDevices utils datasets methods base
##
## other attached packages:
## [1] ggthemes_4.2.4 brms_2.15.0 Rcpp_1.0.6 forcats_0.5.1 stringr_1.4.0 dplyr_1.0.6
## [7] purrr_0.3.4 readr_1.4.0 tidyr_1.1.3 tibble_3.1.2 ggplot2_3.3.5 tidyverse_1.3.1
##
## loaded via a namespace (and not attached):
## [1] readxl_1.3.1 backports_1.2.1 plyr_1.8.6 igraph_1.2.6 splines_4.0.4
## [6] crosstalk_1.1.0.1 TH.data_1.0-10 rstantools_2.1.1 inline_0.3.17 digest_0.6.27
## [11] htmltools_0.5.1.1 rsconnect_0.8.16 fansi_0.4.2 magrittr_2.0.1 modelr_0.1.8
## [16] RcppParallel_5.0.2 matrixStats_0.57.0 xts_0.12.1 sandwich_3.0-0 prettyunits_1.1.1
## [21] colorspace_2.0-0 rvest_1.0.1 haven_2.3.1 xfun_0.23 callr_3.7.0
## [26] crayon_1.4.1 jsonlite_1.7.2 lme4_1.1-25 survival_3.2-10 zoo_1.8-8
## [31] glue_1.4.2 gtable_0.3.0 emmeans_1.5.2-1 V8_3.4.0 pkgbuild_1.2.0
## [36] rstan_2.21.2 abind_1.4-5 scales_1.1.1 mvtnorm_1.1-1 DBI_1.1.0
## [41] miniUI_0.1.1.1 xtable_1.8-4 stats4_4.0.4 StanHeaders_2.21.0-7 DT_0.16
## [46] htmlwidgets_1.5.3 httr_1.4.2 threejs_0.3.3 ellipsis_0.3.2 pkgconfig_2.0.3
## [51] loo_2.4.1 farver_2.1.0 sass_0.3.1 dbplyr_2.1.1 utf8_1.2.1
## [56] tidyselect_1.1.1 labeling_0.4.2 rlang_0.4.11 reshape2_1.4.4 later_1.2.0
## [61] munsell_0.5.0 cellranger_1.1.0 tools_4.0.4 cli_3.0.1 generics_0.1.0
## [66] broom_0.7.6 ggridges_0.5.3 evaluate_0.14 fastmap_1.1.0 processx_3.5.2
## [71] knitr_1.33 fs_1.5.0 nlme_3.1-152 mime_0.10 projpred_2.0.2
## [76] xml2_1.3.2 compiler_4.0.4 bayesplot_1.8.0 shinythemes_1.1.2 rstudioapi_0.13
## [81] gamm4_0.2-6 curl_4.3 reprex_2.0.0 statmod_1.4.35 bslib_0.2.4
## [86] stringi_1.6.2 highr_0.9 ps_1.6.0 Brobdingnag_1.2-6 lattice_0.20-41
## [91] Matrix_1.3-2 nloptr_1.2.2.2 markdown_1.1 shinyjs_2.0.0 vctrs_0.3.8
## [96] pillar_1.6.1 lifecycle_1.0.0 jquerylib_0.1.4 bridgesampling_1.0-0 estimability_1.3
## [101] httpuv_1.6.0 R6_2.5.0 bookdown_0.22 promises_1.2.0.1 gridExtra_2.3
## [106] codetools_0.2-18 boot_1.3-26 colourpicker_1.1.0 MASS_7.3-53 gtools_3.8.2
## [111] assertthat_0.2.1 withr_2.4.2 shinystan_2.5.0 multcomp_1.4-16 mgcv_1.8-33
## [116] parallel_4.0.4 hms_1.1.0 grid_4.0.4 coda_0.19-4 minqa_1.2.4
## [121] rmarkdown_2.8 shiny_1.6.0 lubridate_1.7.10 base64enc_0.1-3 dygraphs_1.1.1.6
```

### References

*ggthemes: Extra themes, scales and geoms for ’ggplot2’*. https://CRAN.R-project.org/package=ggthemes

*Estimating multivariate models with brms*. https://CRAN.R-project.org/package=brms/vignettes/brms_multivariate.html

*Multivariate Behavioral Research*,

*50*(1), 1–22. https://doi.org/10.1080/00273171.2014.962683

*Introduction to mediation, moderation, and conditional process analysis: A regression-based approach*(Second edition). The Guilford Press. https://www.guilford.com/books/Introduction-to-Mediation-Moderation-and-Conditional-Process-Analysis/Andrew-Hayes/9781462534654

*Psychological Methods*,

*14*(4), 301–322. https://doi.org/10.1037/a0016972