# 2 Fundamentals of Linear Regression

Linear regression is the foundation of most of the methods [Hayes described in his] book, so a solid understanding of the fundamentals of linear regression is essential. I assume that most readers have been exposed to linear regression in some form before discovering this book, and so some of the material will be review. Even so, I encourage everyone to read this chapter. (Andrew F. Hayes, 2018, p. 30)

Since we’re adding Bayes and the **tidyverse** into the mix, walking through this chapter will be double important, for us.

## 2.1 Correlation and prediction

Here we load a couple necessary packages, load the data, and take a peek.

```
library(tidyverse)
<- read_csv("data/glbwarm/glbwarm.csv")
glbwarm
glimpse(glbwarm)
```

```
## Rows: 815
## Columns: 7
## $ govact <dbl> 3.6, 5.0, 6.6, 1.0, 4.0, 7.0, 6.8, 5.6, 6.0, 2.6, 1.4, 5.6, 7.0, 3.8, 3.4, 4.2, 1.0, 2.6, 5…
## $ posemot <dbl> 3.67, 2.00, 2.33, 5.00, 2.33, 1.00, 2.33, 4.00, 5.00, 5.00, 1.00, 4.00, 1.00, 5.67, 3.00, 1…
## $ negemot <dbl> 4.67, 2.33, 3.67, 5.00, 1.67, 6.00, 4.00, 5.33, 6.00, 2.00, 1.00, 4.00, 5.00, 4.67, 2.00, 1…
## $ ideology <dbl> 6, 2, 1, 1, 4, 3, 4, 5, 4, 7, 6, 4, 2, 4, 5, 2, 6, 4, 2, 4, 4, 2, 6, 4, 4, 3, 4, 5, 4, 5, 4…
## $ age <dbl> 61, 55, 85, 59, 22, 34, 47, 65, 50, 60, 71, 60, 71, 59, 32, 36, 69, 70, 41, 48, 38, 63, 71,…
## $ sex <dbl> 0, 0, 1, 0, 1, 0, 1, 1, 1, 1, 1, 0, 1, 0, 1, 1, 1, 0, 0, 0, 0, 1, 1, 1, 1, 1, 1, 0, 0, 1, 0…
## $ partyid <dbl> 2, 1, 1, 1, 1, 2, 1, 1, 2, 3, 2, 1, 1, 1, 1, 1, 2, 3, 1, 3, 2, 1, 3, 2, 1, 1, 1, 3, 1, 1, 1…
```

If you are new to **tidyverse**-style syntax, possibly the oddest component is the pipe (i.e., `%>%`

). I’m not going to explain the `%>%`

in this project, but you might learn more about in this brief clip, starting around minute 21:25 in this talk by Wickham, or in Section 5.6.1 from Grolemund & Wickham (2017). Really, all of Chapter 5 of *R4DS* is great for new **R** and new **tidyverse** users, and Chapter 3 is a nice introduction to plotting with **ggplot2**.

Here is our version of Figure 2.1.

```
%>%
glbwarm group_by(negemot, govact) %>%
count() %>%
ggplot(aes(x = negemot, y = govact, size = n)) +
geom_point(show.legend = F) +
labs(x = expression(paste("NEGEMOT: Negative emotions about climate change (", italic("X"), ")")),
y = expression(paste("GOVACT: Support for governmentaction (", italic("Y"), ")"))) +
theme_bw()
```

There are other ways to handle the overplotting issue, such as jittering.

```
%>%
glbwarm ggplot(aes(x = negemot, y = govact)) +
geom_jitter(height = .05, width = .05,
alpha = 1/2, size = 1/3) +
labs(x = expression(paste("NEGEMOT: Negative emotions about climate change (", italic("X"), ")")),
y = expression(paste("GOVACT: Support for governmentaction (", italic("Y"), ")"))) +
theme_bw()
```

The `cor()`

function is a simple way to compute a Pearson’s correlation coefficient.

`cor(glbwarm$negemot, glbwarm$govact)`

`## [1] 0.5777458`

If you want more plentiful output, the `cor.test()`

function returns a \(t\)-value, the degrees of freedom, the corresponding \(p\)-value and the 95% confidence intervals, in addition to the Pearson’s correlation coefficient.

`cor.test(glbwarm$negemot, glbwarm$govact)`

```
##
## Pearson's product-moment correlation
##
## data: glbwarm$negemot and glbwarm$govact
## t = 20.183, df = 813, p-value < 2.2e-16
## alternative hypothesis: true correlation is not equal to 0
## 95 percent confidence interval:
## 0.5301050 0.6217505
## sample estimates:
## cor
## 0.5777458
```

To get the Bayesian version, we’ll open our focal statistical package, Bürkner’s **brms**. I should briefly note that you could also do many of these analyses with other packages, such as **blavaan** (Merkle et al., 2020; Merkle & Rosseel, 2018) or **rstanarm** (Brilleman et al., 2018; Gabry & Goodrich, 2020). I just prefer **brms**.

`library(brms)`

We’ll start simple and just use the default priors and settings, but with the addition of parallel sampling via `cores = 4`

and telling **brms** to save our output as an extermal file with the `file`

argument.

```
.1 <-
model2brm(data = glbwarm,
family = gaussian,
mvbind(negemot, govact) ~ 1,
cores = 4,
file = "fits/model02.01")
```

We can examine a summary of the output with the `print()`

function.

`print(model2.1)`

```
## Family: MV(gaussian, gaussian)
## Links: mu = identity; sigma = identity
## mu = identity; sigma = identity
## Formula: negemot ~ 1
## govact ~ 1
## Data: glbwarm (Number of observations: 815)
## Samples: 4 chains, each with iter = 2000; warmup = 1000; thin = 1;
## total post-warmup samples = 4000
##
## Population-Level Effects:
## Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS
## negemot_Intercept 3.56 0.05 3.46 3.67 1.00 3847 2915
## govact_Intercept 4.59 0.05 4.49 4.68 1.00 3860 3202
##
## Family Specific Parameters:
## Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS
## sigma_negemot 1.53 0.04 1.46 1.60 1.00 3646 3217
## sigma_govact 1.36 0.03 1.30 1.43 1.00 3536 3062
##
## Residual Correlations:
## Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS
## rescor(negemot,govact) 0.58 0.02 0.53 0.62 1.00 3606 3334
##
## Samples were drawn using sampling(NUTS). For each parameter, Bulk_ESS
## and Tail_ESS are effective sample size measures, and Rhat is the potential
## scale reduction factor on split chains (at convergence, Rhat = 1).
```

In regression models, we typically use predictor variables to explain variation in our criterion variables. It’s pretty much never the case that our predictors explain all the variation. The variation that’s left over is often called residual variation, residual variance, residuals, error, or even \(\epsilon\). Throughout the text, Hayes typically referred to it as \(e\).

More formally, when we use likelihood-based estimators, such as maximum likelihood (popular with multilevel modeling and structural equation modeling) or Bayesian estimators, we express the models for our criterion variables in terms of likelihood functions. Probably the most common likelihood function, and the one consistent with the models in Hayes’s text, is the Gaussian likelihood. With that likelihood we say our criterion variable \(y_i\) is normally distributed with a mean \(\mu\) and standard deviation \(\sigma\). We might express that formally as

\[ y_i \sim \operatorname{Normal} (\mu, \sigma), \]

where \(\sim\) stands for “is distributed as” and \(i\) indexes the \(i\)th row in the data. When we add predictors to the model, they are typically used to model the mean \(\mu\). Thus, in the case there we have a sole predictor \(x_i\) which varies across participants, we’d expand our model formula to

\[\begin{align*} y_i & \sim \operatorname{Normal} (\mu_i, \sigma) \\ \mu_i & = \beta_0 + \beta_1 x_i, \end{align*}\]

where \(\beta_0\) is the intercept and \(\beta_1\) is the slope for predictor \(x\), which varies across cases. In this formulation, \(\sigma\) is the standard deviation after accounting for the systemic variation explained by \(x_i\). That is, it’s the residual variance (i.e., \(\epsilon\)), but in a standard-deviation metric. *Why in a standard deviation metric?*, you ask. There are technical reasons why **brms** expresses it as a standard deviation which I’m not going to go into, here. Just beware that whereas many frequentist software packages express the residual variation in a variance metric, it’s expressed in a standard-deviation metric in **brms**. Just go with it and move on.

Within the **brms** framework, \(\sigma\) of the Gaussian likelihood is considered a “family-specific” parameter. As it turns out, there are many other fine likelihood functions in addition to the Gaussian and not all of them have a \(\sigma\) parameter. For example, there is no \(\sigma\) for the Poisson^{1} distribution, which is popular likelihood function for count variables. Because Bürkner made the **brms** package capable of a variety of likelihood functions, it behooved him to give this section of the output a generic name.

When you have a regression model with multiple Gaussian criterion variables, those variables will typically have some degree of covariation. It’s often termed residual covariance, particularly within the structural equation modeling paradigm.^{2} But when you have an intercept-only regression model with multiple variables, that residual covariance is just a covariance. And when you express those variation parameters in terms of standard deviations \(\sigma\), their covariance is expressed as a correlation \(\rho\). Since our multivariate model is of two variables, we have one \(\rho\) parameter for the \(\sigma\)s, `rescor(negemot,govact)`

, which is our Bayesian analogue to the Pearson’s correlation.

To learn more about the multivariate syntax in **brms**, check out Bürkner’s (2021b) vignette, *Estimating multivariate models with brms*, or execute `vignette("brms_multivariate")`

. Or just hold your horses until we get into mediation. All of our mediation models will use one version of the multivariate syntax or another.

But to clarify the output, above:

- ‘Estimate’ = the posterior mean, analogous to the frequentist point estimate,
- ‘Est.Error’ = the posterior \(SD\), analogous to the frequentist standard error,
- ‘l-95% CI’ = the lower-level of the percentile-based 95% Bayesian credible interval, and
- ‘u-95% CI’ = the upper-level of the same.

## 2.2 The simple linear regression model

Here is how one might get the simple OLS coefficients in base **R** with the `lm()`

function.

`.2 <- lm(data = glbwarm, govact ~ 1 + negemot)) (model2`

```
##
## Call:
## lm(formula = govact ~ 1 + negemot, data = glbwarm)
##
## Coefficients:
## (Intercept) negemot
## 2.7573 0.5142
```

For more detailed output, put the model object `model2.2`

into the `summary()`

function.

`summary(model2.2)`

```
##
## Call:
## lm(formula = govact ~ 1 + negemot, data = glbwarm)
##
## Residuals:
## Min 1Q Median 3Q Max
## -4.3285 -0.6731 0.1018 0.7554 3.2142
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 2.75732 0.09866 27.95 <2e-16 ***
## negemot 0.51424 0.02548 20.18 <2e-16 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 1.111 on 813 degrees of freedom
## Multiple R-squared: 0.3338, Adjusted R-squared: 0.333
## F-statistic: 407.3 on 1 and 813 DF, p-value: < 2.2e-16
```

Here’s the Bayesian model in **brms**.

```
.3 <-
model2brm(data = glbwarm,
family = gaussian,
~ 1 + negemot,
govact cores = 4,
file = "fits/model02.03")
```

There are several ways to get a **brms** model summary. A go-to is with the `summary()`

function, just like we did with our OLS model.

`summary(model2.3)`

```
## Family: gaussian
## Links: mu = identity; sigma = identity
## Formula: govact ~ 1 + negemot
## Data: glbwarm (Number of observations: 815)
## Samples: 4 chains, each with iter = 2000; warmup = 1000; thin = 1;
## total post-warmup samples = 4000
##
## Population-Level Effects:
## Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS
## Intercept 2.75 0.10 2.56 2.95 1.00 4151 2948
## negemot 0.51 0.03 0.47 0.56 1.00 4033 3164
##
## Family Specific Parameters:
## Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS
## sigma 1.11 0.03 1.06 1.17 1.00 4139 2902
##
## Samples were drawn using sampling(NUTS). For each parameter, Bulk_ESS
## and Tail_ESS are effective sample size measures, and Rhat is the potential
## scale reduction factor on split chains (at convergence, Rhat = 1).
```

The `print()`

function works very much the same way. To get a more focused look, you can use the `posterior_summary()`

function:

`posterior_summary(model2.3)`

```
## Estimate Est.Error Q2.5 Q97.5
## b_Intercept 2.7540133 0.09928784 2.563757 2.9547102
## b_negemot 0.5149687 0.02542118 0.465310 0.5645472
## sigma 1.1127290 0.02734424 1.060580 1.1670321
## lp__ -1245.9491225 1.22875310 -1249.141482 -1244.5570274
```

That also yields the log posterior, `lp__`

, which you can learn more about in this section of the (2020) vignette by the Stan Development Team, *RStan: the R interface to Stan* or this brief blog post by Xulong Wang. We won’t focus on the `lp__`

directly in this project. But its influence will be lurking in the shadows.

But anyways, the `Q2.5`

and `Q97.5`

, are the lower- and upper-levels of the 95% credible intervals. The `Q`

prefix stands for quantile (see this thread). In this case, these are a renamed version of the `l-95% CI`

and `u-95% CI`

columns from our `summary()`

output.

To make a quick plot of the regression line, one can use the convenient `brms::conditional_effects()`

function.

`conditional_effects(model2.3)`

If you want to customize that output, you might nest it in `plot()`

.

```
plot(conditional_effects(model2.3),
points = T,
point_args = c(height = .05, width = .05,
alpha = 1/2, size = 1/3))
```

It’s also useful to be able to work with the output of a **brms** model directly. For our first step, we’ll put our posterior draws into a data frame with the `posterior_samples()`

function.

```
<- posterior_samples(model2.3)
post
head(post)
```

```
## b_Intercept b_negemot sigma lp__
## 1 2.767022 0.5145552 1.094959 -1244.656
## 2 2.867891 0.4886331 1.087075 -1245.484
## 3 2.623920 0.5498508 1.126685 -1245.598
## 4 2.499840 0.5814869 1.082886 -1248.757
## 5 2.609229 0.5429867 1.117016 -1245.807
## 6 2.889080 0.4979026 1.085575 -1246.963
```

Next, we’ll use the `fitted()`

function to simulate model-implied summaries for the expected `govact`

value, given particular predictor values. Our first model only has `negemot`

as a predictor, and we’ll ask for the expected `govact`

values for `negemot`

ranging from 0 to 7.

```
<- tibble(negemot = seq(from = 0, to = 7, length.out = 30))
nd
<-
f fitted(model2.3,
newdata = nd) %>%
data.frame() %>%
bind_cols(nd)
head(f)
```

```
## Estimate Est.Error Q2.5 Q97.5 negemot
## 1 2.754013 0.09928784 2.563757 2.954710 0.0000000
## 2 2.878316 0.09367707 2.698105 3.067791 0.2413793
## 3 3.002619 0.08813636 2.833752 3.181095 0.4827586
## 4 3.126922 0.08267978 2.968901 3.294023 0.7241379
## 5 3.251224 0.07732515 3.103658 3.407770 0.9655172
## 6 3.375527 0.07209519 3.236972 3.520894 1.2068966
```

The first two columns should look familiar to the output from `summary(model2.3)`

, above. The next two columns, `Q2.5`

and `Q97.5`

, are the lower- and upper-levels of the 95% credible intervals, like we got from `posterior_samples()`

. The final column is the result of the `bind_cols(nd)`

code.

Here’s our bespoke version of Figure 2.4.

```
%>%
glbwarm group_by(negemot, govact) %>%
count() %>%
ggplot(aes(x = negemot)) +
geom_point(aes(y = govact, size = n),
show.legend = F) +
geom_ribbon(data = f,
aes(ymin = Q2.5, ymax = Q97.5),
fill = "grey75", alpha = 3/4) +
geom_line(data = f,
aes(y = Estimate)) +
annotate("text", x = 2.2, y = 7.5, label = "Cases with positive residuals", color = "red3") +
annotate("text", x = 4.75, y = .8, label = "Cases with negative residuals", color = "blue3") +
labs(x = expression("NEGEMOT: Negative emotions about climate change ("*italic("X")*")"),
y = expression("GOVACT: Support for governmentaction ("*italic("Y")*")")) +
coord_cartesian(xlim = range(glbwarm$negemot)) +
theme_bw()
```

Note how that figure is a combination of the original `glbwarm`

data and our `f`

output.

### 2.2.1 Interpretation of the constant and regression coefficient.

“The regression constant is conceptually equivalent to the \(Y\)-intercept in the equation for a line. It quantifies the estimated value of \(Y\) when \(X = 0\)” (p. 39).

### 2.2.2 The standardized regression model.

Thus far, the interpretation of the regression coefficients in a regression model has been couched in

unstandardizedorraw metricform. Many regression routines will also produce a version of the model instandardizedform. The standardized regression model is what results when all variables are first standardized prior to estimation of the model by expressing each measurement in units of standard deviations from the sample mean. (p. 40,emphasisin the original)

**brms** will not produce standardized solutions on the fly. To get them, you will have to manually standardize the variables before entering them into the model.

### 2.2.3 Simple linear regression with a dichotomous antecedent variable.

In the `glbwarm`

data, `sex`

is a dichotomous variable.

```
%>%
glbwarm ggplot(aes(x = sex)) +
geom_bar() +
theme_bw()
```

In these data, `sex`

is coded females = 0, males = 1. Here we add `sex`

to the model.

```
.4 <-
model2brm(data = glbwarm,
family = gaussian,
~ 1 + sex,
govact cores = 4,
file = "fits/model02.04")
```

Check the summary.

`print(model2.4)`

```
## Family: gaussian
## Links: mu = identity; sigma = identity
## Formula: govact ~ 1 + sex
## Data: glbwarm (Number of observations: 815)
## Samples: 4 chains, each with iter = 2000; warmup = 1000; thin = 1;
## total post-warmup samples = 4000
##
## Population-Level Effects:
## Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS
## Intercept 4.72 0.07 4.58 4.84 1.00 4426 3139
## sex -0.27 0.10 -0.45 -0.09 1.00 4394 3241
##
## Family Specific Parameters:
## Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS
## sigma 1.36 0.03 1.30 1.42 1.00 4813 2607
##
## Samples were drawn using sampling(NUTS). For each parameter, Bulk_ESS
## and Tail_ESS are effective sample size measures, and Rhat is the potential
## scale reduction factor on split chains (at convergence, Rhat = 1).
```

Our model summary is very close to that in the text. If you just wanted the coefficients, you might use the `fixef()`

function.

`fixef(model2.4) %>% round(digits = 3)`

```
## Estimate Est.Error Q2.5 Q97.5
## Intercept 4.718 0.067 4.584 4.844
## sex -0.267 0.095 -0.452 -0.086
```

Though not necessary, we used the `round()`

function to reduce the number of significant digits in the output. You can get a little more information with the `posterior_summary()`

function.

But since Bayesian estimation yields an entire posterior distribution, you can visualize that distribution in any number of ways. Because we’ll be using **ggplot2**, we’ll need to put the posterior draws into a data frame before plotting.

`<- posterior_samples(model2.4) post `

We could summarize the posterior with box plots

```
%>%
post rename(female = b_Intercept) %>%
mutate(male = female + b_sex) %>%
pivot_longer(cols = c(male, female)) %>%
ggplot(aes(x = name, y = value)) +
geom_boxplot(aes(fill = name)) +
theme_bw() +
theme(legend.position = "none")
```

or with overlapping density plots

```
%>%
post rename(female = b_Intercept) %>%
mutate(male = female + b_sex) %>%
pivot_longer(cols = c(male, female)) %>%
ggplot(aes(x = value, group = name, fill = name)) +
geom_density(color = "transparent", alpha = 3/4) +
theme_bw()
```

or even with violin plots with superimposed posterior medians and 95% intervals.

```
%>%
post rename(female = b_Intercept) %>%
mutate(male = female + b_sex) %>%
pivot_longer(cols = c(male, female)) %>%
ggplot(aes(x = name, y = value)) +
geom_violin(aes(fill = name), color = "transparent", alpha = 3/4) +
stat_summary(fun = median,
fun.min = function(i){quantile(i, probs = .025)},
fun.max = function(i){quantile(i, probs = .975)}) +
theme_bw() +
theme(legend.position = "none")
```

For even more ideas, see Matthew Kay’s **tidybayes** and **ggdist** packages.

As Hayes discussed on page 42, you can also get a sense of the model estimates for women and men with a little addition. Here we continue to use the `round()`

function to simplify the output.

```
# for women
round(fixef(model2.4)[1, ], digits = 2)
```

```
## Estimate Est.Error Q2.5 Q97.5
## 4.72 0.07 4.58 4.84
```

```
# for men
round(fixef(model2.4)[1, ] + fixef(model2.4)[2, ], digits = 2)
```

```
## Estimate Est.Error Q2.5 Q97.5
## 4.45 0.16 4.13 4.76
```

Hayes then considered that

although the model will always generate the group means, the regression coefficient and regression constant will depend on how the two groups are coded. For instance, suppose females were coded \(X = −1\) and males were coded \(X = 1\). (p. 42)

To follow along, we’ll first recode `sex`

, saving it as `sex_recode`

.

```
<-
glbwarm %>%
glbwarm mutate(sex_recode = if_else(sex == 0, -1, 1))
```

Now fit the new model.

```
.5 <-
model2brm(data = glbwarm,
family = gaussian,
~ 1 + sex_recode,
govact cores = 4,
file = "fits/model02.05")
```

Check the primary coefficients.

`fixef(model2.5)`

```
## Estimate Est.Error Q2.5 Q97.5
## Intercept 4.5836570 0.04712046 4.4894107 4.67767899
## sex_recode -0.1341268 0.04747433 -0.2239347 -0.04134183
```

They match up well with the results in the middle of page 42. Now the `Intercept`

in the output, what Hayes called \(i_Y\) is the unweighted mean of the means, \(\big (\overline Y_\text{male} + \overline Y_\text{female} \big ) \big / 2\), and the coefficient `sex_recode`

(i.e., what Hayes called \(b\)) is one half the difference between those means. Here’s how to work with the posterior draws from this model to reproduce the group mean estimates.

```
posterior_samples(model2.5) %>%
transmute(male = b_Intercept + b_sex_recode * 1,
female = b_Intercept + b_sex_recode * -1) %>%
pivot_longer(everything()) %>%
ggplot(aes(x = value, group = name, fill = name)) +
geom_density(color = "transparent", alpha = 3/4) +
theme_bw()
```

You’ll see it looks just like the plot from above.

#### 2.2.3.1 A caution about the standardized regression coefficient.

The standardized regression coefficient is a function of both the mean difference and the distribution of the cases across the groups. This is an undesirable property of \(\tilde b\) when \(X\) is dichotomous.

I recommend that the standardized regression coefficient for a dichotomous antecedent variable not be interpreted or reported.If you desire an index of a mean difference in standard deviation units, I recommend standardizing \(Y\) but not the dichotomous \(X\) and then interpreting the

unstandardizedregression coefficient in a model estimating \(Z_Y\) from \(X\). In such a model, \(b\) is apartiallystandardized regression coefficient. (pp. 43–44,emphasisin the original)

Here’s how to fit the partially-standardized model, first with `lm()`

.

```
# standardize the Y
<-
glbwarm %>%
glbwarm mutate(govact_z = (govact - mean(govact)) / sd(govact))
# fit the model
lm(data = glbwarm, govact_z ~ 1 + sex)
```

```
##
## Call:
## lm(formula = govact_z ~ 1 + sex, data = glbwarm)
##
## Coefficients:
## (Intercept) sex
## 0.09629 -0.19717
```

Now we’ll fit the model as Bayesians with `brms::brm()`

.

```
.6 <-
model2brm(data = glbwarm,
family = gaussian,
~ 1 + sex,
govact_z cores = 4,
file = "fits/model02.06")
```

`fixef(model2.6)`

```
## Estimate Est.Error Q2.5 Q97.5
## Intercept 0.09806048 0.04916014 0.002503287 0.19352628
## sex -0.19819950 0.07238316 -0.338236658 -0.05537654
```

The constant \(i_Y\) is the mean standardized \(Y\) for females, and \(b\) is the mean difference between males and females in standard deviations of \(Y\). So men are estimated to differ from women by 0.197 standard deviations in their support for government action. The negative sign for \(b\) means males are lower than females, on average, in their support. (p. 44)

The result from our Bayesian model was the same as the OLS results up to two decimal places. This will often be the case. One is not more correct. They are the results of using different procedures.

### 2.2.4 A note on symbolic representations.

This section is worthy of repeating in full.

A brief digression is in order at this point. It is important when reporting the results of an analysis to define the symbols you use unless there is a strong convention, for a failure to do so can invite confusion. Different uses of \(b\) and \(\beta\) in regression analysis are an important case in point. There is much inconsistency in the substantive and methodology literature as to how regression coefficients are symbolized in unstandardized versus standardized form. Some use \(b\) or \(B\) to refer to the unstandardized regression coefficient and \(\beta\) to refer to the standardized regression coefficient. Others, rather than using \(\beta\), spell it out by referencing “beta weights” or just talk about the “betas.” Some use \(\beta\) to refer to a population regression coefficient, to distinguish it from a sample estimate, others use \(\beta\) as the unstandardized regression weight, and there are still others who use \(\hat \beta\) to refer to a sample unstandardized regression coefficient and leave the hat off for its corresponding population or “true” value. In this book, I use \(\tilde \beta\) for the standardized regression weight.

Ultimately, the symbols we use are for the most part arbitrary. We can use any symbols we want. My point is that you should not assume others will know what symbols you use mean, for your familiar symbols to represent certain concepts may not be understood as representing those concepts by all. The same applies to terms such as “beta coefficient” or other verbalizations of symbols. Best to define your symbols in advance, or otherwise let your reader know what your symbols mean when used in text and tables. This will help others better understand and interpret your work. (p. 44)

## 2.3 Alternative explanations for association

That “correlation does not imply causation” is etched into the brains of all scientists. If variables \(X\) and \(Y\) are correlated, that doesn’t mean that \(X\) causes \(Y\) or that \(Y\) causes \(X\). The ability to infer cause–effect is not even a statistical matter in the end. Rather, it is the design of one’s study, the data collection procedures one employs, and theoretical plausibility that most directly influence whether a cause–effect claim can be made and with what degree of confidence, not the size or sign of a statistical index of association. (p. 45)

On page 46, Hayes produced a couple correlations. Here’s how to get them from base **R**.

`cor(glbwarm$sex, glbwarm$negemot)`

`## [1] -0.1173564`

`cor(glbwarm$sex, glbwarm$govact)`

`## [1] -0.09861854`

Again, if we wanted to get full Bayesian estimates, we’d fit an intercept-only multivariate model.

```
.7 <-
model2brm(data = glbwarm,
family = gaussian,
mvbind(negemot, govact, sex) ~ 1,
cores = 4,
file = "fits/model02.07")
```

`print(model2.7, digits = 3)`

```
## Family: MV(gaussian, gaussian, gaussian)
## Links: mu = identity; sigma = identity
## mu = identity; sigma = identity
## mu = identity; sigma = identity
## Formula: negemot ~ 1
## govact ~ 1
## sex ~ 1
## Data: glbwarm (Number of observations: 815)
## Samples: 4 chains, each with iter = 2000; warmup = 1000; thin = 1;
## total post-warmup samples = 4000
##
## Population-Level Effects:
## Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS
## negemot_Intercept 3.559 0.053 3.459 3.662 1.000 4218 3306
## govact_Intercept 4.588 0.048 4.496 4.683 1.001 4700 3239
## sex_Intercept 0.488 0.017 0.455 0.521 1.000 5659 3163
##
## Family Specific Parameters:
## Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS
## sigma_negemot 1.531 0.037 1.459 1.605 1.001 4469 3391
## sigma_govact 1.362 0.033 1.298 1.430 1.000 4338 3168
## sigma_sex 0.502 0.012 0.479 0.526 1.001 6077 2916
##
## Residual Correlations:
## Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS
## rescor(negemot,govact) 0.576 0.024 0.528 0.620 1.002 4178 3378
## rescor(negemot,sex) -0.117 0.034 -0.186 -0.051 1.001 6999 3105
## rescor(govact,sex) -0.099 0.033 -0.163 -0.033 1.000 6400 3478
##
## Samples were drawn using sampling(NUTS). For each parameter, Bulk_ESS
## and Tail_ESS are effective sample size measures, and Rhat is the potential
## scale reduction factor on split chains (at convergence, Rhat = 1).
```

For our purposes, the action is in the ‘rescor(\(i\), \(j\))’ portions of the ‘Family Specific Parameters’ section.

Anyway, if you wanted to get all the Pearson’s correlations among the `glbwarm`

variables, rather than piecewise `cor()`

approach, you could use the `lowerCor()`

function from the **psych** package (Revelle, 2021).

`::lowerCor(glbwarm[, 1:7], digits = 3) psych`

```
## govct posmt negmt idlgy age sex prtyd
## govact 1.000
## posemot 0.043 1.000
## negemot 0.578 0.128 1.000
## ideology -0.418 -0.029 -0.349 1.000
## age -0.097 0.042 -0.057 0.212 1.000
## sex -0.099 0.074 -0.117 0.133 0.166 1.000
## partyid -0.360 -0.036 -0.324 0.619 0.154 0.109 1.000
```

## 2.4 Multiple linear regression

The simple linear regression model is easily extended to the estimation of a consequent variable using more than one antecedent variable. Including more than one antecedent in a regression model allows you to simultaneously investigate the role of multiple influences on a consequent variable. An additional and important benefit of the multiple regression model is that it provides various measures of

partial associationthat quantify the component of the association between an antecedent and a consequent that is unique to that antecedent relative to other antecedent variables in the model. (p. 48,emphasisin the original)

Using Hayes’ notation, the model we’re about follows the basic equation

\[\hat Y = i_Y + b_1 X_1 + b_2 X_2 + b_3 X_3 + b_4 X_4 + b_4 X_5.\]

For us, there’s technically more involved because our Bayesian paradigm includes priors, which we’re not focusing on at the moment. Anyway, there’s nothing particularly special about jumping from univariable to multivariable models with **brms**. You just keep tacking on predictors with the `+`

operator.

```
.8 <-
model2brm(data = glbwarm,
family = gaussian,
~ 1 + negemot + posemot + ideology + sex + age,
govact cores = 4,
file = "fits/model02.08")
```

`print(model2.8)`

```
## Family: gaussian
## Links: mu = identity; sigma = identity
## Formula: govact ~ 1 + negemot + posemot + ideology + sex + age
## Data: glbwarm (Number of observations: 815)
## Samples: 4 chains, each with iter = 2000; warmup = 1000; thin = 1;
## total post-warmup samples = 4000
##
## Population-Level Effects:
## Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS
## Intercept 4.06 0.20 3.67 4.45 1.00 5516 3355
## negemot 0.44 0.03 0.39 0.49 1.00 5423 3142
## posemot -0.03 0.03 -0.08 0.03 1.00 5623 2992
## ideology -0.22 0.03 -0.27 -0.17 1.00 5696 2899
## sex -0.01 0.08 -0.16 0.14 1.00 5693 2963
## age -0.00 0.00 -0.01 0.00 1.00 5491 3287
##
## Family Specific Parameters:
## Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS
## sigma 1.07 0.03 1.02 1.12 1.00 6250 2795
##
## Samples were drawn using sampling(NUTS). For each parameter, Bulk_ESS
## and Tail_ESS are effective sample size measures, and Rhat is the potential
## scale reduction factor on split chains (at convergence, Rhat = 1).
```

Following Hayes on pages 50 and 51, here is the posterior mean (i.e., what you might call the Bayesian point estimate) for someone with

- negative emotions = 3,
- positive emotions = 4,
`ideology`

= 2,- is male (i.e.,
`sex`

= 1), and - is 30 years of
`age`

.

```
fixef(model2.8)[1] +
fixef(model2.8)[2] * 3 +
fixef(model2.8)[3] * 4 +
fixef(model2.8)[4] * 2 +
fixef(model2.8)[5] * 1 +
fixef(model2.8)[6] * 30
```

`## [1] 4.794097`

Here’s the same deal for a man of the same profile, but with one point higher on `negemot`

.

```
fixef(model2.8)[1] +
fixef(model2.8)[2] * 4 +
fixef(model2.8)[3] * 4 +
fixef(model2.8)[4] * 2 +
fixef(model2.8)[5] * 1 +
fixef(model2.8)[6] * 30
```

`## [1] 5.234862`

If you want a full expression of the model uncertainty in terms of the shape of the posterior distribution and the 95% intervals, you’ll probably just want to use `posterior_samples()`

and do a little data processing.

```
<- posterior_samples(model2.8)
post
<-
post %>%
post mutate(our_posterior = b_Intercept + b_negemot * 4 + b_posemot * 4 + b_ideology * 2 + b_sex * 1 + b_age * 30)
# this intermediary step will make it easier to specify the break points and their labels for the x-axis
<-
post_summary quantile(post$our_posterior, probs = c(.025, .5, .975)) %>%
as_tibble() %>%
mutate(labels = value %>%
round(digits = 3) %>%
as.character())
# plot!
ggplot(data = post,
aes(x = our_posterior)) +
geom_density(fill = "black") +
geom_vline(xintercept = post_summary$value,
size = c(.5, .75, .5), linetype = c(2, 1, 2), color = "white") +
scale_x_continuous(NULL,
breaks = post_summary$value,
labels = post_summary$labels) +
scale_y_continuous(NULL, breaks = NULL) +
labs(subtitle = "The expected govact score for a 30-year-old man for\nwhom negemot and posemot both equal 4 and ideology\nequals 2. The solid and dashed white vertical lines are the\nposterior median and 95% intervals, respectively.") +
theme_bw()
```

In the text, Hayes showed that individuals based on these two profiles would be expected to differ by 0.441 (i.e., \(5.244 - 4.803 = 0.441\)). That’s fine if you’re only working with OLS point estimates. But a proper Bayesian approach would express the difference in terms of an entire poster distribution, or at least a point estimate accompanied by some sort of intervals. Here we’ll just work with the posterior to create a difference distribution. You could do that with a little deft `posterior_samples()`

wrangling. Here we’ll employ `fitted()`

.

```
<-
nd tibble(negemot = c(3, 4),
posemot = 4,
ideology = 2,
sex = 1,
age = 30)
fitted(model2.8,
newdata = nd,
summary = F) %>%
data.frame() %>%
set_names(str_c("condition_", letters[1:2])) %>%
mutate(difference = condition_b - condition_a) %>%
ggplot(aes(x = difference)) +
geom_density(fill = "black", color = "transparent") +
scale_y_continuous(NULL, breaks = NULL) +
ggtitle("The posterior density for the difference between\nthe two conditions.") +
theme_bw()
```

### 2.4.1 The standardized regression model.

As **brms** doesn’t automatically give us the standardized coefficients the way OLS output often does, we’ll have to be proactive. One solution is just to standardized the data themselves and then re-fit the model with those standardized variables. That leads us to the issue of how one standardized variables to begin with. Recall that standardizing entails subtracting the mean of a variable from that variable and then dividing that value by the standard deviation. We don’t want to do that by hand. So one handy way is to make a custom function to do that work for us.

```
<- function(x) {
sandardize - mean(x)) / sd(x)
(x }
```

To learn more about making custom functions in **R**, check our Chapter 19 of *R4DS*.

Here we’ll employ our custom `standardize()`

function to make standardized versions of our variables.

```
<-
glbwarm %>%
glbwarm mutate(posemot_z = sandardize(posemot),
negemot_z = sandardize(negemot),
ideology_z = sandardize(ideology),
sex_z = sandardize(sex),
age_z = sandardize(age))
```

Now we’ve got us our standardized variables, let’s fit a standardized model.

```
.9 <-
model2brm(data = glbwarm,
family = gaussian,
~ 1 + negemot_z + posemot_z + ideology_z + sex_z + age_z,
govact_z cores = 4,
file = "fits/model02.09")
```

Here are the newly standardized coefficient summaries, minus the `Intercept`

.

`fixef(model2.9)[-1, ] %>% round(3)`

```
## Estimate Est.Error Q2.5 Q97.5
## negemot_z 0.495 0.030 0.436 0.554
## posemot_z -0.026 0.027 -0.079 0.029
## ideology_z -0.242 0.030 -0.300 -0.184
## sex_z -0.004 0.028 -0.059 0.050
## age_z -0.016 0.029 -0.071 0.040
```

Our coefficients match up nicely with those on page 52 in the text. Just as with Hayes’s OLS estimates, we should not attempt to interpret the standardized `sex_z`

coefficient from our Bayesian model.

Here’s how we’d fit a *partially*-standardized model–a model in which all variables except for `sex`

are standardized.

```
.10 <-
model2update(model2.9,
newdata = glbwarm,
formula = govact_z ~ 1 + negemot_z + posemot_z + ideology_z + sex + age_z,
cores = 4,
file = "fits/model02.10")
```

Notice our use of the `update()`

function. If you want to hastily fit a **brms** model of the same basic form of a prior model, you can just update some of the parameters of that original fit. In this case, all we did was swap out one of the predictors. To accommodate that change, we used the `newdata`

argument so the model had access to the new variable and then we fed in the new formula.

Anyway, here are the coefficient summaries, including the `Intercept`

, for the partially-standardized model.

`fixef(model2.10) %>% round(digits = 3)`

```
## Estimate Est.Error Q2.5 Q97.5
## Intercept 0.004 0.040 -0.073 0.081
## negemot_z 0.494 0.030 0.434 0.553
## posemot_z -0.026 0.028 -0.079 0.027
## ideology_z -0.242 0.031 -0.304 -0.180
## sex -0.007 0.057 -0.124 0.106
## age_z -0.016 0.028 -0.071 0.039
```

As Hayes wrote, now `sex`

= -0.007 has a sensible interpretation. “We can say that men and women differ by [-0.007] standard deviations in their support for government action when all other variables in the model are held constant (p. 53).”

On page 54, Hayes gave us the equation to transform unstandardized coefficients to standardized ones:

\[ \tilde b_i = b_i \left ( \frac{SD_{X_{i}}}{SD_{Y}} \right ) \]

Let’s give it a whirl with `negemot`

.

```
# here's the coefficient for `negemot` from the standardized model, `model2.9`
fixef(model2.9)["negemot_z", "Estimate"]
```

`## [1] 0.4950501`

```
# here's the coefficient for `negemot` from the unstandardized model, `model2.8`
fixef(model2.8)["negemot", "Estimate"]
```

`## [1] 0.4407648`

```
# and here we use Hayes' formula to standardize the unstandardized coefficient
fixef(model2.8)["negemot", "Estimate"] * (sd(glbwarm$negemot) / sd(glbwarm$govact))
```

`## [1] 0.4951932`

Looks like we got it within rounding error–pretty good! However, that was just the posterior mean, the Bayesian point estimate. If we want to more fully express the uncertainty around the mean–and we do–, we’ll need to work with the posterior draws.

```
# the posterior draws from the unstandardized model
posterior_samples(model2.8) %>%
# using Hayes' formula to standardize `b_negemot`
mutate(`hand-made b_negemot_z` = b_negemot * (sd(glbwarm$negemot) / sd(glbwarm$govact))) %>%
# tacking on the `b_negemot_z` column from the standardized `model2.9` models posterior draws
bind_cols(posterior_samples(model2.9) %>%
select(b_negemot_z)) %>%
# converting the data to the long format and grouping by `name`
pivot_longer(c(`hand-made b_negemot_z`, b_negemot_z)) %>%
group_by(name) %>%
# here we summarize the results
summarise(mean = mean(value),
sd = sd(value),
ll = quantile(value, probs = .025),
ul = quantile(value, probs = .975)) %>%
mutate_if(is.double, round, digits = 3)
```

```
## # A tibble: 2 x 5
## name mean sd ll ul
## <chr> <dbl> <dbl> <dbl> <dbl>
## 1 b_negemot_z 0.495 0.03 0.436 0.554
## 2 hand-made b_negemot_z 0.495 0.03 0.436 0.552
```

Our summary confirms that we can apply Hayes’s formula to a `posterior_samples()`

column in order to get fuller summary statistics for a hand-converted standardized coefficient. This would be in full compliance with, say, APA recommendations to include 95% intervals with all effect sizes–the standardized regression coefficient being the effect size, here.

## 2.5 Measures of model fit

In the Bayesian world, we don’t tend to appeal to the \(SS_{residual}\), the \(MS_{residual}\), or the standard error of estimate. We do sometimes, however, appeal to the \(R^2\). I’m not going to go into the technical details here, but you should be aware that the Bayesian \(R^2\) is not calculated the same as the OLS \(R^2\). If you want to dive in, check out the (2019) paper by Gelman, Goodrich, Gabry, and Vehtari, *R-squared for Bayesian regression models*. Here’s how to get it with **brms**.

`bayes_R2(model2.8, summary = T) %>% round(digits = 3)`

```
## Estimate Est.Error Q2.5 Q97.5
## R2 0.389 0.021 0.347 0.428
```

It even comes with 95% intervals, which will make the editors at APA journals happy. If you want to go beyond summary statistics and take a look at the full posterior, just set `summary = F`

and data wrangle and plot as usual.

```
bayes_R2(model2.8, summary = F) %>%
data.frame() %>%
ggplot(aes(x = R2)) +
geom_density(fill = "black", color = "transparent") +
scale_y_continuous(NULL, breaks = NULL) +
labs(title = expression(paste("Behold: The Bayesian ", italic("R")^{2}, " distribution for model 2.8")),
x = NULL) +
coord_cartesian(xlim = 0:1) +
theme_bw()
```

Another way we examine model fit is with graphical posterior predictive checks. Posterior predictive checking is a very general approach, which you might learn more about from Gabry et al. (2019) or with a few keyword searches in on Gelman’s blog. One basic way is to use the model in order to simulate data and then compare those data with the original data–the basic idea being that good fitting models should produce data similar to the original data.

Recall how we’ve used `fitted()`

to make regression lines and expected values? We’ll, now we’ll use `predict()`

to simulate data based on our models.

```
predict(model2.8,
summary = F,
nsamples = 3) %>%
data.frame() %>%
set_names(1:nrow(glbwarm)) %>%
mutate(simulation = str_c("simulation: ", 1:n())) %>%
pivot_longer(-simulation,
names_to = "row",
values_to = "govact") %>%
bind_cols(
bind_rows(
%>% select(-govact),
glbwarm %>% select(-govact),
glbwarm %>% select(-govact))
glbwarm %>%
)
ggplot(aes(x = negemot, y = govact)) +
geom_jitter(height = .05, width = .05,
alpha = 1/2, size = 1/3) +
coord_cartesian(ylim = c(0, 9)) +
theme_bw() +
facet_wrap(~ simulation, ncol = 3)
```

The question is, do these simulated data sets look like the original data? Let’s see.

```
%>%
glbwarm ggplot(aes(x = negemot, y = govact)) +
geom_jitter(height = .05, width = .05,
alpha = 1/2, size = 1/3) +
coord_cartesian(ylim = c(0, 9)) +
theme_bw()
```

Overall, the simulations aren’t bad. But in all three `govact`

tends to veer above 7.5, which is where the original data appear to be bounded. But otherwise the overall shape is pretty close, at least with respect to `negemot`

.

There’s nothing special about three simulations. Three is just more than one and gives you a sense of the variance across simulations. Also, we only examined the model fit with respect to `negemot`

. Since there are other variables in the model, we might also assess the model based on them.

Another method is with the `brms::pp_check()`

function, which allows users to access a variety of convenience functions from the **bayesplot** package (Gabry et al., 2019; Gabry & Mahr, 2021). Here we’ll use the default settings and just tack on `theme_bw()`

for aesthetics.

```
pp_check(model2.8) +
theme_bw()
```

What we did was simulate 10 data sets worth of `govact`

values, plot their densities (i.e., the thin blue lines) and compare them with the density of the original `govact`

values. What we want is for the thin blue lines to largely align with the thick blue line. Though not perfect, the simulations from our `model2.8`

did a pretty okay job of reproducing the original `govact`

distribution. For more ideas on this method, see the **brms** reference manual (Bürkner, 2021e) and Gabry’s (2019) vignette, *Graphical posterior predictive checks using the bayesplot package*.

## 2.6 Statistical inference

Here’s a **tidyverse** way to do Hayes’ simulation from page 57. We’re just using OLS regression with the `lm()`

function. You could do this with Bayesian HMC estimation, but man would it take a while. For our first step, we’ll define two custom functions.

```
# this first one will use the `slice_sample()` function to randomly sample from `glbwarm`
<- function(i) {
make_sample set.seed(i)
slice_sample(glbwarm, n = 50, replace = F)
}
# this second function will fit our model, the same one from `model2.8`, to each of our subsamples
<- function(df) {
glbwarm_model lm(govact ~ 1 + negemot + posemot + ideology + sex + age, data = df)
}
```

Now we’ll run the simulation, wrangle the output, and plot in one fell swoop.

```
# we need an iteration index, which will double as the values we set our seed with in our `make_sample()` function
tibble(iter = 1:1e4) %>%
group_by(iter) %>%
# inserting our subsamples
mutate(sample = map(iter, make_sample)) %>%
# fitting our models
mutate(model = map(sample, glbwarm_model)) %>%
# taking those model results and tidying them with the broom package
mutate(broom = map(model, broom::tidy)) %>%
# unnesting allows us to access our model results
unnest(broom) %>%
# we're only going to focus on the estimates for `negemot`
filter(term == "negemot") %>%
# here it is, Figure 2.7
ggplot(aes(x = estimate)) +
geom_histogram(binwidth = .025, boundary = 0) +
labs(x = "Unstandardized regression coefficient for negemot",
y = "Frequency in 1e4 samples of size 50") +
theme_bw()
```

To learn more about this approach to simulations, see Section 25.2.1 in *R4DS*.

### 2.6.1 Testing a null hypothesis.

As Bayesians, we don’t need to wed ourselves to the null hypothesis. We’re not interested in the probability of the data given the null hypothesis. Bayes’ rule,

\[p(\theta | D) = \frac{p(D | \theta) p(\theta)}{p(D)},\]

gives us the probability of the model parameters, given the data. Though I acknowledge different researchers might set out to ask different things of their data, I propose we’re generally more interested in determining the most probable parameter values than we are the probabilities tested within the NHST paradigm.

```
posterior_samples(model2.8) %>%
ggplot(aes(x = b_negemot)) +
geom_density(fill = "black", color = "transparent") +
geom_vline(xintercept = posterior_interval(model2.8)["b_negemot", ],
color = "white", linetype = 2) +
scale_x_continuous(breaks = posterior_interval(model2.8)["b_negemot", ] %>% as.double(),
labels = posterior_interval(model2.8)["b_negemot", ] %>% as.double() %>% round(digits = 2) %>% as.character()) +
scale_y_continuous(NULL, breaks = NULL) +
labs(subtitle = "The most probable values for our b_negemot parameter are the ones around the peak\nof the density. For convenience, the dashed lines denote the 95% credible intervals.\nSure, you could ask yourself, 'Is zero within those intervals?' But with such rich output,\nthat seems like an impoverished question to ask.") +
theme_bw()
```

For more discussion the NHST paradigm and how it compares with variations of Bayesian statistics, check out Kruschke’s (2015) text, *Doing Bayesian data analysis: A tutorial with R, JAGS, and Stan*, particularly chapters 10 through 13. You can find my (2020a) **brms** + **tidyverse** translation of that text at https://bookdown.org/content/3686/.

### 2.6.2 Interval estimation.

Within the Bayesian paradigm, we don’t use 95% intervals based on the typical frequentist formula. With the **brms** package, we typically use percentile-based intervals. Take the 95% credible intervals for the `negemot`

coefficient from model `model2.8`

:

`posterior_interval(model2.8)["b_negemot", ]`

```
## 2.5% 97.5%
## 0.3882826 0.4911189
```

We can actually get those intervals with the simple use of the base **R** `quantile()`

function.

```
posterior_samples(model2.8) %>%
summarize(the_2.5_percentile = quantile(b_negemot, probs = .025),
the_97.5_percentile = quantile(b_negemot, probs = .975))
```

```
## the_2.5_percentile the_97.5_percentile
## 1 0.3882826 0.4911189
```

The consequence of this is that our Bayesian credible intervals aren’t necessarily symmetric, which is fine because the posterior distribution for a given parameter isn’t always symmetric. But not all Bayesian intervals are percentile based. John Kruschke, for example, often recommends highest posterior density intervals in his work. The **brms** package doesn’t have a convenience function for these, but you can compute them with help from the **HDInterval** package (Meredith & Kruschke, 2018).

```
library(HDInterval)
hdi(posterior_samples(model2.8)[ , "b_negemot"], credMass = .95)
```

```
## lower upper
## 0.3895321 0.4920354
## attr(,"credMass")
## [1] 0.95
```

Finally, because Bayesians aren’t bound to the NHST paradigm, we aren’t bound to 95% intervals, either. For example, in both his excellent (2020) text and as a default in its accompanying **rethinking** package, Richard McElreath often uses 89% intervals. Alternatively, Andrew Gelman has publicly advocated for 50% intervals. The most important thing is to express the uncertainty in the posterior in a clearly-specified way. If you’d like, say, 80% intervals in your model summary, you can insert a `prob`

argument into either `print()`

or `summary()`

.

`print(model2.8, prob = .8)`

```
## Family: gaussian
## Links: mu = identity; sigma = identity
## Formula: govact ~ 1 + negemot + posemot + ideology + sex + age
## Data: glbwarm (Number of observations: 815)
## Samples: 4 chains, each with iter = 2000; warmup = 1000; thin = 1;
## total post-warmup samples = 4000
##
## Population-Level Effects:
## Estimate Est.Error l-80% CI u-80% CI Rhat Bulk_ESS Tail_ESS
## Intercept 4.06 0.20 3.80 4.33 1.00 5516 3355
## negemot 0.44 0.03 0.41 0.47 1.00 5423 3142
## posemot -0.03 0.03 -0.06 0.01 1.00 5623 2992
## ideology -0.22 0.03 -0.25 -0.18 1.00 5696 2899
## sex -0.01 0.08 -0.11 0.09 1.00 5693 2963
## age -0.00 0.00 -0.00 0.00 1.00 5491 3287
##
## Family Specific Parameters:
## Estimate Est.Error l-80% CI u-80% CI Rhat Bulk_ESS Tail_ESS
## sigma 1.07 0.03 1.03 1.10 1.00 6250 2795
##
## Samples were drawn using sampling(NUTS). For each parameter, Bulk_ESS
## and Tail_ESS are effective sample size measures, and Rhat is the potential
## scale reduction factor on split chains (at convergence, Rhat = 1).
```

Note how two of our columns changed to ‘l-80% CI’ and ‘u-80% CI.’

You can specify custom percentile levels with `posterior_summary()`

:

```
posterior_summary(model2.8, probs = c(.9, .8, .7, .6, .4, .3, .2, .1)) %>%
round(digits = 2)
```

```
## Estimate Est.Error Q90 Q80 Q70 Q60 Q40 Q30 Q20 Q10
## b_Intercept 4.06 0.20 4.33 4.24 4.17 4.11 4.01 3.96 3.90 3.80
## b_negemot 0.44 0.03 0.47 0.46 0.45 0.45 0.43 0.43 0.42 0.41
## b_posemot -0.03 0.03 0.01 0.00 -0.01 -0.02 -0.03 -0.04 -0.05 -0.06
## b_ideology -0.22 0.03 -0.18 -0.20 -0.21 -0.21 -0.23 -0.23 -0.24 -0.25
## b_sex -0.01 0.08 0.09 0.05 0.03 0.01 -0.03 -0.05 -0.07 -0.11
## b_age 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00
## sigma 1.07 0.03 1.10 1.09 1.08 1.08 1.06 1.05 1.05 1.03
## lp__ -1213.20 1.81 -1211.16 -1211.65 -1212.07 -1212.44 -1213.36 -1213.91 -1214.61 -1215.75
```

And of course, you can use multiple interval summaries when you `summarize()`

the output from `posterior_samples()`

. E.g.,

```
posterior_samples(model2.8) %>%
select(b_Intercept:b_age) %>%
gather() %>%
group_by(key) %>%
summarize(`l-95% CI` = quantile(value, probs = .025),
`u-95% CI` = quantile(value, probs = .975),
`l-50% CI` = quantile(value, probs = .25),
`u-50% CI` = quantile(value, probs = .75)) %>%
mutate_if(is.double, round, digits = 2)
```

```
## # A tibble: 6 x 5
## key `l-95% CI` `u-95% CI` `l-50% CI` `u-50% CI`
## <chr> <dbl> <dbl> <dbl> <dbl>
## 1 b_age -0.01 0 0 0
## 2 b_ideology -0.27 -0.17 -0.24 -0.2
## 3 b_Intercept 3.67 4.45 3.93 4.2
## 4 b_negemot 0.39 0.49 0.42 0.46
## 5 b_posemot -0.08 0.03 -0.05 -0.01
## 6 b_sex -0.16 0.14 -0.06 0.04
```

Throughout this project, I’m going to be lazy and default to conventional 95% intervals, with occasional appearances of 50% intervals.

### 2.6.3 Testing a hypothesis about a set of antecedent variables.

Here we’ll make use of the `update()`

function to hastily fit our reduced model, `model2.11`

.

```
.11 <-
model2update(model2.8,
~ 1 + ideology + sex + age,
govact cores = 4,
file = "fits/model02.11")
```

We can get a look at the \(R^2\) summaries for our competing models like this.

`bayes_R2(model2.8) %>% round(digits = 2)`

```
## Estimate Est.Error Q2.5 Q97.5
## R2 0.39 0.02 0.35 0.43
```

`bayes_R2(model2.11) %>% round(digits = 2)`

```
## Estimate Est.Error Q2.5 Q97.5
## R2 0.18 0.02 0.14 0.22
```

So far it looks like our fuller model, `model2.8`

, explains more variation in the data. If we wanted to look at their distributions, we’d set `summary = F`

in the `bayes_R2()`

function and convert the results to a data frame or tibble. Here we use `bind_cols()`

to put the \(R^2\) results for both in the same tibble.

```
<-
r2 cbind(bayes_R2(model2.8, summary = F),
bayes_R2(model2.11, summary = F)) %>%
data.frame() %>%
set_names(str_c("model2.", c(8, 11)))
head(r2)
```

```
## model2.8 model2.11
## 1 0.3790275 0.1827538
## 2 0.4143660 0.1721092
## 3 0.4014884 0.2068029
## 4 0.3846665 0.1727220
## 5 0.4179583 0.1710993
## 6 0.3378314 0.1592348
```

With our `r2`

tibble in hand, we’re ready to plot.

```
%>%
r2 gather() %>%
ggplot(aes(x = value, fill = key)) +
geom_density(color = "transparent") +
scale_y_continuous(NULL, breaks = NULL) +
xlab(expression(paste(italic(R)^2))) +
coord_cartesian(xlim = 0:1) +
theme_bw()
```

Yep, the \(R^2\) distribution for `model2.8`

, the one including the emotion variables, is clearly larger than that for the more parsimonious `model2.11`

. And it’d just take a little more data wrangling to get a formal \(R^2\) difference score.

```
%>%
r2 mutate(difference = model2.11 - model2.8) %>%
ggplot(aes(x = difference)) +
geom_density(fill = "black", color = "transparent") +
scale_y_continuous(NULL, breaks = NULL) +
coord_cartesian(xlim = -1:0) +
labs(title = expression(paste(Delta, italic(R)^{2})),
subtitle = expression(paste("This is the amount the ", italic(R)^{2}, " dropped after pruning the emotion variables from the model.")),
x = NULL) +
theme_bw()
```

The \(R^2\) approach is popular within the social sciences. But it has its limitations, the first of which is that it doesn’t correct for model complexity. The second is it’s not applicable to a range of models, such as those that do not use the Gaussian likelihood (e.g., logistic regression) or to multilevel models.

Happily, information criteria offer a more general framework. The AIC is the most popular information criteria among frequentists. Within the Bayesian world, we have the DIC, the WAIC, and the LOO. The DIC is quickly falling out of favor and is not immediately available with the **brms** package. However, we can use the WAIC and the LOO, both of which are computed in **brms** via the **loo** package (Vehtari et al., 2017; Vehtari, Gabry, et al., 2019; Yao et al., 2018).

With **brms**, you can compute the WAIC or LOO values and add them to your model fit object with the `add_criterion()`

function.

```
.8 <- add_criterion(model2.8, c("loo", "waic"))
model2.11 <- add_criterion(model2.11, c("loo", "waic")) model2
```

Here’s the main loo-summary output for `model4`

.

`.8$criteria$loo model2`

```
##
## Computed from 4000 by 815 log-likelihood matrix
##
## Estimate SE
## elpd_loo -1213.9 22.7
## p_loo 7.7 0.6
## looic 2427.8 45.5
## ------
## Monte Carlo SE of elpd_loo is 0.0.
##
## All Pareto k estimates are good (k < 0.5).
## See help('pareto-k-diagnostic') for details.
```

You get a wealth of output, more of which can be seen with `str(model2.8$loo)`

. For now, notice the “All pareto k estimates are good (k < 0.5).” Pareto \(k\) values (see Vehtari & Gabry, 2020). Each case in the data gets its own \(k\) value and we like it when those \(k\)s are low. The makers of the **loo** package get worried when those \(k\)s exceed 0.7 and as a result you will get a warning message when they do. Happily, we have no such warning messages in this example.

If you want to work with the \(k\) values directly, you can extract them and place them into a data frame like so.

```
.8$criteria$loo$diagnostics %>%
model2data.frame() %>%
head()
```

```
## pareto_k n_eff
## 1 -0.04436335 5230.019
## 2 -0.01404872 6028.208
## 3 0.17757442 5683.164
## 4 0.26825902 3154.180
## 5 -0.08003852 5829.860
## 6 0.07320749 5704.676
```

The `pareto_k`

values can be used to examine cases that are overly-influential on the model parameters, something akin to a Cook’s \(D_i\). See, for example this discussion on stackoverflow.com in which several members of the Stan team weighed in. The issue is also discussed in Vehtari et al. (2017) and in this lecture by Aki Vehtari.

But anyway, we’re getting ahead of ourselves. Back to the LOO.

Like other information criteria, the LOO values aren’t of interest in and of themselves. However, the values of one model’s LOO relative to that of another is of great interest. We generally prefer models with lower information criteria. With `compare_ic()`

, we can compute a formal difference score between multiple loo objects.

`loo_compare(model2.8, model2.11) %>% print(simplify = F)`

```
## elpd_diff se_diff elpd_loo se_elpd_loo p_loo se_p_loo looic se_looic
## model2.8 0.0 0.0 -1213.9 22.7 7.7 0.6 2427.8 45.5
## model2.11 -118.7 15.9 -1332.6 20.6 5.2 0.4 2665.2 41.1
```

We also get a standard error. Here it looks like `model2.8`

was substantially better, in terms of LOO-values, than `model2.11`

.

For more on the LOO, see the **loo** reference manual (Gabry, 2020), Vehtari and Gabry’s handy (2020) vignette, or the scholarly papers referenced therein. Also, McElreath discussed the LOO in the second (2020) version of his text, as well as in this lecture.

## 2.7 Multicategorical antecedent variables

“To include a multicategorical antecedent variable representing \(g\) groups in a regression model, it must be represented with \(g − 1\) variables using one of a variety of different group coding systems.” (p. 66) This isn’t strictly true, but we digress. Hayes went on…

One popular system for coding groups is

indicator coding, also known asdummy coding. With indicator coding, \(g − 1\)indicator variablescontaining either a zero or one represent which of the \(g\) groups a case belongs in, and these indicator variables are used as antecedents in a regression model. To construct indicator codes, create \(g − 1\) variables \(D_i\) for each case set to 1 if the case is in group \(i\), otherwise set \(D_i\) to zero. (p. 66,emphasisin the original)

Before we get to that, we should examine our multicategorical antecedent variable, `partyid`

. It’s coded 1 = Democrat 2 = Independent 3 = Republican. You can get a count of the cases within a give `partyid`

like this.

```
%>%
glbwarm count(partyid)
```

```
## # A tibble: 3 x 2
## partyid n
## <dbl> <int>
## 1 1 359
## 2 2 192
## 3 3 264
```

We can get grouped means for `govact`

like this.

```
%>%
glbwarm group_by(partyid) %>%
summarize(mean_support_for_governmental_action = mean(govact))
```

```
## # A tibble: 3 x 2
## partyid mean_support_for_governmental_action
## <dbl> <dbl>
## 1 1 5.06
## 2 2 4.61
## 3 3 3.92
```

We can make dummies with the `if_else()`

function. We’ll just go ahead and do that right within the `brm()`

function.

```
.12 <-
model2brm(data = glbwarm %>%
mutate(Democrat = if_else(partyid == 1, 1, 0),
Republican = if_else(partyid == 3, 1, 0)),
family = gaussian,
~ 1 + Democrat + Republican,
govact chains = 4, cores = 4,
file = "fits/model02.12")
```

Check the results.

`fixef(model2.12)`

```
## Estimate Est.Error Q2.5 Q97.5
## Intercept 4.6059516 0.09167832 4.4253983 4.7853478
## Democrat 0.4583058 0.11359349 0.2321713 0.6792878
## Republican -0.6802900 0.12127628 -0.9195148 -0.4383596
```

The intercept is the stand-in for Independents and the other two coefficients are difference scores.

The \(R^2\) is okay.

`bayes_R2(model2.12) %>% round(digits = 3)`

```
## Estimate Est.Error Q2.5 Q97.5
## R2 0.132 0.021 0.092 0.175
```

There’s no need to compute an \(F\)-test on our \(R^2\). The posterior mean and it’s 95% intervals are well away from zero. But you could use your `bayes_R2(model2.12, summary = F)`

plotting skills from above to more fully inspect the posterior if you’d like.

We could also use information criteria. One method would be to compare the WAIC or LOO value of `model2.12`

with an intercept-only model. First, we’ll need to fit that model.

```
.13 <-
model2update(model2.12,
~ 1,
govact chains = 4, cores = 4,
file = "fits/model02.13")
```

Here we’ll compute the WAIC for each, save the results to their fit objects, and them compare them with `loo_compare()`

.

```
.12 <- add_criterion(model2.12, "waic")
model2.13 <- add_criterion(model2.13, "waic")
model2
loo_compare(model2.12, model2.13, criterion = "waic") %>%
print(simplify = F)
```

```
## elpd_diff se_diff elpd_waic se_elpd_waic p_waic se_p_waic waic se_waic
## model2.12 0.0 0.0 -1353.7 20.1 4.2 0.3 2707.5 40.2
## model2.13 -55.1 10.4 -1408.8 21.3 2.0 0.1 2817.6 42.6
```

The WAIC comparison suggests `model2.12`

, the one *with* the `partyid`

dummies, is an improvement over the simple intercept-only model. Another way to compare the information criteria is with AIC-type weighting. The **brms** package offers a variety of weighting methods via the `model_weights()`

function.

```
<- model_weights(model2.12, model2.13, weights = "waic")
mw
mw
```

```
## model2.12 model2.13
## 1.000000e+00 9.659371e-25
```

If you’re not a fan of scientific notation, you can put the results in a tibble and look at them on a plot.

```
%>%
mw as.data.frame() %>%
rownames_to_column() %>%
set_names(c("model", "WAIC weight")) %>%
ggplot(aes(x = `WAIC weight`, y = model)) +
geom_point() +
labs(subtitle = "The weights should sum to 1. In this case virtually all the weight is placed\nin model2.12. Recall, that these are RELATIVE weights. Add another model\nfit into the mix and the weights might well change.",
y = NULL) +
theme_bw() +
theme(axis.ticks.y = element_blank())
```

You could, of course, do all this with the LOO.

## 2.8 Assumptions for interpretation and statistical inference

Regression is a handy tool to have in your statistical toolbox. Its utility as a “general data analytic system” (Cohen, 1968) will be apparent throughout this book. But it is a human invention that isn’t perfect, it can lead you astray if used indiscriminately, and it is founded on some assumptions that aren’t always realistic or likely to be met in the circumstances in which the method is applied. (p. 68)

### 2.8.1 Linearity.

“When using OLS regression to model some consequent variable of interest \(Y\), you must be willing to assume that the relationship between the variables in the model are linear in nature, or at least approximately linear” (p. 69)

### 2.8.2 Normality.

The **brms** package is quite general and allows users to fit models from a variety of likelihoods other than the Gaussian. For example, users can accommodate outliers/extreme values with Student’s t regression. You can do count regression with the Poisson or the negative binomial… For more, see McElreath’s lecture introducing the generalized linear model or Bürkner’s (2021d) vignette, *Parameterization of response distributions in brms*

### 2.8.3 Homoscedasticity.

The **brms** package can also accommodate homoscedasticity with distributional modeling. In short, one simply models \(\sigma\) in addition to the mean, \(\mu\). See Bürkner’s handy (2021a) vignette, *Estimating distributional models with brms* on the topic.

### 2.8.4 Independence.

The issue of independence is where the multilevel model comes on. See any relevant text, such as *Statistical rethinking* or *Data analysis using regression and multilevel/hierarchical models* (Gelman & Hill, 2006). And yes, **brms** is fully capable of handling multilevel models (see Bürkner, 2017, 2018).

## Session info

`sessionInfo()`

```
## R version 4.0.4 (2021-02-15)
## Platform: x86_64-apple-darwin17.0 (64-bit)
## Running under: macOS Catalina 10.15.7
##
## Matrix products: default
## BLAS: /Library/Frameworks/R.framework/Versions/4.0/Resources/lib/libRblas.dylib
## LAPACK: /Library/Frameworks/R.framework/Versions/4.0/Resources/lib/libRlapack.dylib
##
## locale:
## [1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8
##
## attached base packages:
## [1] stats graphics grDevices utils datasets methods base
##
## other attached packages:
## [1] HDInterval_0.2.2 brms_2.15.0 Rcpp_1.0.6 forcats_0.5.1 stringr_1.4.0 dplyr_1.0.6
## [7] purrr_0.3.4 readr_1.4.0 tidyr_1.1.3 tibble_3.1.2 ggplot2_3.3.5 tidyverse_1.3.1
##
## loaded via a namespace (and not attached):
## [1] readxl_1.3.1 backports_1.2.1 plyr_1.8.6 igraph_1.2.6 splines_4.0.4
## [6] crosstalk_1.1.0.1 TH.data_1.0-10 rstantools_2.1.1 inline_0.3.17 digest_0.6.27
## [11] htmltools_0.5.1.1 rsconnect_0.8.16 fansi_0.4.2 magrittr_2.0.1 modelr_0.1.8
## [16] RcppParallel_5.0.2 matrixStats_0.57.0 xts_0.12.1 sandwich_3.0-0 prettyunits_1.1.1
## [21] colorspace_2.0-0 rvest_1.0.1 haven_2.3.1 xfun_0.23 callr_3.7.0
## [26] crayon_1.4.1 jsonlite_1.7.2 lme4_1.1-25 survival_3.2-10 zoo_1.8-8
## [31] glue_1.4.2 gtable_0.3.0 emmeans_1.5.2-1 V8_3.4.0 pkgbuild_1.2.0
## [36] rstan_2.21.2 abind_1.4-5 scales_1.1.1 mvtnorm_1.1-1 DBI_1.1.0
## [41] miniUI_0.1.1.1 xtable_1.8-4 tmvnsim_1.0-2 stats4_4.0.4 StanHeaders_2.21.0-7
## [46] DT_0.16 htmlwidgets_1.5.3 httr_1.4.2 threejs_0.3.3 ellipsis_0.3.2
## [51] pkgconfig_2.0.3 loo_2.4.1 farver_2.1.0 sass_0.3.1 dbplyr_2.1.1
## [56] utf8_1.2.1 tidyselect_1.1.1 labeling_0.4.2 rlang_0.4.11 reshape2_1.4.4
## [61] later_1.2.0 munsell_0.5.0 cellranger_1.1.0 tools_4.0.4 cli_3.0.1
## [66] generics_0.1.0 broom_0.7.6 ggridges_0.5.3 evaluate_0.14 fastmap_1.1.0
## [71] processx_3.5.2 knitr_1.33 fs_1.5.0 nlme_3.1-152 mime_0.10
## [76] projpred_2.0.2 xml2_1.3.2 compiler_4.0.4 bayesplot_1.8.0 shinythemes_1.1.2
## [81] rstudioapi_0.13 gamm4_0.2-6 curl_4.3 reprex_2.0.0 statmod_1.4.35
## [86] bslib_0.2.4 stringi_1.6.2 highr_0.9 ps_1.6.0 Brobdingnag_1.2-6
## [91] lattice_0.20-41 Matrix_1.3-2 psych_2.1.3 nloptr_1.2.2.2 markdown_1.1
## [96] shinyjs_2.0.0 vctrs_0.3.8 pillar_1.6.1 lifecycle_1.0.0 jquerylib_0.1.4
## [101] bridgesampling_1.0-0 estimability_1.3 httpuv_1.6.0 R6_2.5.0 bookdown_0.22
## [106] promises_1.2.0.1 gridExtra_2.3 codetools_0.2-18 boot_1.3-26 colourpicker_1.1.0
## [111] MASS_7.3-53 gtools_3.8.2 assertthat_0.2.1 withr_2.4.2 mnormt_2.0.2
## [116] shinystan_2.5.0 multcomp_1.4-16 mgcv_1.8-33 parallel_4.0.4 hms_1.1.0
## [121] grid_4.0.4 coda_0.19-4 minqa_1.2.4 rmarkdown_2.8 shiny_1.6.0
## [126] lubridate_1.7.10 base64enc_0.1-3 dygraphs_1.1.1.6
```

## Footnote

### References

*Joint longitudinal and time-to-event models via Stan.*https://github.com/stan-dev/stancon_talks/

*Estimating distributional models with brms*. https://CRAN.R-project.org/package=brms/vignettes/brms_distreg.html

*Estimating multivariate models with brms*. https://CRAN.R-project.org/package=brms/vignettes/brms_multivariate.html

*Parameterization of response distributions in brms*. https://CRAN.R-project.org/package=brms/vignettes/brms_families.html

*Journal of Statistical Software*,

*80*(1), 1–28. https://doi.org/10.18637/jss.v080.i01

*The R Journal*,

*10*(1), 395–411. https://doi.org/10.32614/RJ-2018-017

*brms reference manual, Version 2.15.0*. https://CRAN.R-project.org/package=brms/brms.pdf

*Psychological Bulletin*,

*70*(6, Pt.1), 426–443. https://doi.org/10.1037/h0026714

*Graphical posterior predictive checks using the bayesplot package*. https://CRAN.R-project.org/package=bayesplot/vignettes/graphical-ppcs.html

*loo reference manual, Version 2.4.1*. https://CRAN.R-project.org/package=loo/loo.pdf

*rstanarm: Bayesian applied regression modeling via stan*[Manual]. https://CRAN.R-project.org/package=rstanarm

*bayesplot: Plotting for Bayesian models*. https://CRAN.R-project.org/package=bayesplot

*Journal of the Royal Statistical Society: Series A (Statistics in Society)*,

*182*(2), 389–402. https://doi.org/10.1111/rssa.12378

*The American Statistician*,

*73*(3), 307–309. https://doi.org/10.1080/00031305.2018.1549100

*Data analysis using regression and multilevel/hierarchical models*. Cambridge University Press. https://doi.org/10.1017/CBO9780511790942

*Introduction to mediation, moderation, and conditional process analysis: A regression-based approach*(Second edition). The Guilford Press. https://www.guilford.com/books/Introduction-to-Mediation-Moderation-and-Conditional-Process-Analysis/Andrew-Hayes/9781462534654

*Doing Bayesian data analysis: A tutorial with R, JAGS, and Stan*. Academic Press. https://sites.google.com/site/doingbayesiandataanalysis/

*Doing Bayesian data analysis in brms and the tidyverse*(version 0.4.0). https://bookdown.org/content/3686/

*Statistical rethinking: A Bayesian course with examples in R and Stan*(Second Edition). CRC Press. https://xcelab.net/rm/statistical-rethinking/

*HDInterval: Highest (posterior) density intervals*[Manual]. https://CRAN.R-project.org/package=HDInterval

*Journal of Statistical Software*,

*85*(4), 1–30. https://doi.org/10.18637/jss.v085.i04

*blavaan: Bayesian latent variable analysis*. https://CRAN.R-project.org/package=blavaan

*psych: Procedures for psychological, psychometric, and personality research*. https://CRAN.R-project.org/package=psych

*lavaan: Latent variable analysis*[Manual]. https://lavaan.org

*RStan: The R interface to Stan*. https://cran.r-project.org/web/packages/rstan/vignettes/rstan.html

*Using the loo package (version \(>\)= 2.0.0)*. https://CRAN.R-project.org/package=loo/vignettes/loo2-example.html

*loo: Efficient leave-one-out cross-validation and WAIC for bayesian models*. https://CRAN.R-project.org/package=loo/

*Statistics and Computing*,

*27*(5), 1413–1432. https://doi.org/10.1007/s11222-016-9696-4

*Bayesian Analysis*,

*13*(3), 917–1007. https://doi.org/10.1214/17-BA1091

Hayes did not cover count regression with the Poisson likelihood in this textbook. If you’d like to learn more, one place to start is with McElreath’s lecture on binomial and Poisson models.↩︎

We will not be covering structural equation modeling (SEM) in this book If you’re interested, you might check out this video series by Erin M. Buchanan on SEM within

**R**, particularly with the handy**lavaan**package (Merkle & Rosseel, 2018; Rosseel & Jorgensen, 2019).↩︎