5  Moderation

As mentioned earlier, moderation is another basic “causal” mechanism used to explain the relationship between X and Y. Moderation, however, is more complicated and less straightforward than mediation.

5.1 The simplest scenario: when the moderator is continuous

5.1.1 Basic moderation model

Consider a model in which X is assumed to affect (cause) Y. The corresponding model is Y=β1X+ϵ.

Now assume that the relationship between X and Y is moderated by a third variable Z, also known as the moderator. A moderator is a qualitative (e.g., sex, race, class) or quantitative (e.g., level of reward) variable that affects the direction and/or strength of the relation between an independent or predictor variable and a dependent or criterion variable (Baron & Kenny, 1986, p. 1174).

The basic moderation model contains a product term XZ that represents the moderation effect: Y=β0+β1X+β2Z+β3XZ+ϵ

5.1.1.1 Non-zero mean structure

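Note that the basic moderation model is a special case in which an intercept should always be included, i.e., the model has a non-zero mean structure (). To see why, recall that

$$
\begin{aligned}
\operatorname{cov}(X_1, X_2) &= E[(X_1 - E(X_1))(X_2 - E(X_2))] \\
&= E[X_1X_2 - X_1E(X_2) - E(X_1)X_2 + E(X_1)E(X_2)] \\
&= E(X_1X_2) - 2E(X_1)E(X_2) + E(X_1)E(X_2) \\
&= E(X_1X_2) - E(X_1)E(X_2),
\end{aligned}
$$

so that $E(X_1X_2) = \operatorname{cov}(X_1, X_2) + E(X_1)E(X_2)$. Therefore $\beta_0 \neq 0$ in general, even with all variables standardized:

$$
\begin{aligned}
E(Y) = 0 &= \beta_0 + \beta_1E(X) + \beta_2E(Z) + \beta_3E(XZ) + E(\epsilon) \\
0 &= \beta_0 + 0 + 0 + \beta_3\operatorname{cov}(X, Z) + 0,
\end{aligned}
$$

that is, $\beta_0 = -\beta_3\operatorname{cov}(X, Z)$. When $\beta_3 \neq 0$ and X and Z are correlated, this equation holds only if $\beta_0$ is non-zero. $\beta_0$ can be 0 only if X and Z are uncorrelated (for example, when they are independent) or if $\beta_3 = 0$.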
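
The relation between the intercept and the product term can be checked numerically. The following is a minimal sketch on simulated data; the sample size, the generating coefficients, and all object names are assumptions made for illustration and are not taken from the text.

```r
set.seed(1)
n <- 5000
# Hypothetical data: X and Z are correlated, and Y depends on their product
X <- rnorm(n)
Z <- 0.5 * X + rnorm(n)
Y <- 0.3 * X + 0.2 * Z + 0.4 * X * Z + rnorm(n)

# Standardize all three observed variables (sample mean 0, sample SD 1)
d <- data.frame(
  X = as.numeric(scale(X)),
  Z = as.numeric(scale(Z)),
  Y = as.numeric(scale(Y))
)

fit <- lm(Y ~ X * Z, data = d)
b <- coef(fit)

# Fitted intercept versus the value implied by beta0 = -beta3 * E(XZ)
intercept_fit <- unname(b["(Intercept)"])
intercept_implied <- -unname(b["X:Z"]) * mean(d$X * d$Z)
print(c(fitted = intercept_fit, implied = intercept_implied))
```

The two printed numbers should agree, and the intercept should be clearly different from zero because X and Z are correlated in this simulation.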

5.1.1.2 Centering

When doing moderation analysis, it is usually recommended to center the continuous independent variables (including all Xs and Z) beforehand. There are two main reasons listed in past literature:

  1. Centering variables makes interpretation easier.

For example, assume both X and Z are continuous and let CX and CZ denote their centered versions. The basic moderation model becomes

Y=β0+β1CX+β2CZ+β3CX×CZ+ϵ, where β1 can be interpreted as the difference in the predicted value Ŷ for each 1-unit change in CX when CZ=0, which corresponds to the mean of Z. Thus β1 is the simple main effect of CX on Y at Z=Z̄.

  2. Centering variables reduces non-essential multicollinearity. However, Olvera Astivia & Kroc () have demonstrated that centering does not always reduce multicollinearity, making this argument less convincing.

Thus, centering is done mainly for the sake of interpretation.
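
A small simulation illustrates what centering does and does not change. This is a sketch under assumed data (the means, coefficients, and object names are hypothetical): the coefficient of the product term is identical in the raw and centered models, while the centered β1 equals the raw simple slope of X evaluated at the mean of Z.

```r
set.seed(2)
n <- 300
# Hypothetical data with non-zero means for X and Z
X <- rnorm(n, mean = 5, sd = 2)
Z <- rnorm(n, mean = 3, sd = 1)
Y <- 1 + 0.5 * X + 0.3 * Z + 0.2 * X * Z + rnorm(n)
d <- data.frame(X, Z, Y)

# Mean-center the predictors
d$CX <- d$X - mean(d$X)
d$CZ <- d$Z - mean(d$Z)

fit_raw <- lm(Y ~ X * Z, data = d)
fit_cen <- lm(Y ~ CX * CZ, data = d)

b_raw <- unname(coef(fit_raw))  # order: intercept, X, Z, X:Z
b_cen <- unname(coef(fit_cen))  # order: intercept, CX, CZ, CX:CZ

# The product-term coefficient is unchanged by centering
print(c(raw = b_raw[4], centered = b_cen[4]))
# The centered beta1 is the raw simple slope of X at the mean of Z
print(c(centered_b1 = b_cen[2],
        raw_slope_at_mean_Z = b_raw[2] + b_raw[4] * mean(d$Z)))
```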

5.1.1.3 Why product term

Why does the product term represent the moderation effect?

Because

$$
Y = \beta_0 + \beta_1X + \beta_2Z + \beta_3XZ + \epsilon = \beta_0 + (\beta_1 + \beta_3Z)X + \beta_2Z + \epsilon,
$$

adding a product term makes Z part of the slope of X, so the value of Z affects the relationship between X and Y. For example, if Z is a dichotomous variable such as gender, coded 0 (male) or 1 (female), we have

$$
\begin{cases}
Y = \beta_0 + \beta_1X + \epsilon, & Z = 0 \\
Y = (\beta_0 + \beta_2) + (\beta_1 + \beta_3)X + \epsilon, & Z = 1
\end{cases}
$$

Z is said to alter the strength of the relationship between X and Y as long as β3 is significantly different from 0.
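
The dichotomous case can be verified directly: with a 0/1 moderator, the slopes β1 and β1+β3 from the product-term model coincide with the slopes of two separate regressions, one per group. The sketch below uses simulated data with hypothetical coefficients and object names.

```r
set.seed(3)
n <- 200
# Hypothetical data: Z is a 0/1 moderator
X <- rnorm(n)
Z <- rbinom(n, size = 1, prob = 0.5)
Y <- 1 + 0.5 * X + 0.3 * Z + 0.4 * X * Z + rnorm(n)
d <- data.frame(X, Z, Y)

# Moderation model with the product term
b <- coef(lm(Y ~ X * Z, data = d))

# Separate simple regressions within each level of Z
slope_Z0 <- unname(coef(lm(Y ~ X, data = d[d$Z == 0, ]))["X"])
slope_Z1 <- unname(coef(lm(Y ~ X, data = d[d$Z == 1, ]))["X"])

print(c(beta1 = unname(b["X"]), slope_Z0 = slope_Z0))
print(c(beta1_plus_beta3 = unname(b["X"] + b["X:Z"]), slope_Z1 = slope_Z1))
```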

5.1.1.4 Moderation effect vs Interaction effect

What is the difference between the interaction effect and moderation effect?

Normally, these two effects are equivalent.

In the context of two-way ANOVA, an interaction effect is the joint effect of two IVs (factors A and B) on one DV (Y). An interaction effect exists when the relationship between one IV and the DV depends on the level of the other IV. When F_AB is statistically significant, we have evidence supporting the interaction effect.

The interaction effect can be illustrated by testing the simple effects of either A or B. For example, in the right panel of the figure above, the simple effect of B is non-significant at A1 but significant at A2, so the relationship between B and Y depends on A. Equivalently, the simple effect of A is significant (A1>A2) at B1 and also significant, in the opposite direction (A1<A2), at B2, so the relationship between A and Y depends on B.

In practice, however, the choice of which pair of simple effects to report is purely theory-driven. For example, in the figure above, suppose we care more about the simple effect of B, that is, about how A alters the strength of the relationship between B and Y. In this case, A is effectively a moderator, and the interaction effect of A and B on Y is equivalent to the moderation effect of A on the relationship between B and Y.

In the context of basic moderation, a statistically significant β3 is evidence for the existence of a moderation effect. However, the moderation effect can be illustrated by treating either X or Z as the moderator, because Y=β0+β1X+β2Z+β3XZ+ϵ=β0+(β1+β3Z)X+β2Z+ϵ=β0+(β2+β3X)Z+β1X+ϵ. Therefore, just as when interpreting an interaction effect, the choice of moderator is in practice purely theory-driven.

5.1.1.5 Two types of path diagram

There are two path diagrams for a basic moderation model (see the figure below). The left one represents the statistical model, corresponding to Y=β0+β1X+β2Z+β3XZ+ϵ, in which the moderator can be either X or Z. The right one represents the theory-driven conceptual model, in which the moderator is clearly defined as Z according to theory. The statistical model always underlies the conceptual model. The conceptual model is more frequently used in empirical research, whereas the statistical model is the key when writing analysis syntax.

5.1.2 Moderation analysis: simple slope analysis and visualization

5.1.2.1 Continuous X

In moderation analysis, we are most interested in the moderation effect, i.e., β3. Therefore, the null hypothesis in moderation analysis is

H0:β3=0

In factorial two-way ANOVA, a significant interaction effect only tells us that the effect of one factor differs across the levels of the other; the specific pattern remains unknown. We need to conduct simple effect analysis and visualize the interaction effect.

Similarly, in moderation analysis, the significance of β̂3 provides no information about the pattern of the moderation effect. We need to conduct simple slope analysis and visualize the moderation effect.

The essence of the moderation effect is that the slope of X on Y depends on Z. A simple slope is just a simple main effect: the slope of X on Y at a certain level of Z.

The simple slope of a basic moderation model is $\beta_1 + \beta_3 Z$. The common NHST is the t-test

$$
t = \frac{\hat\beta_1 + \hat\beta_3 Z}{\sqrt{\operatorname{var}(\hat\beta_1) + 2Z\operatorname{cov}(\hat\beta_1, \hat\beta_3) + Z^2\operatorname{var}(\hat\beta_3)}}
$$

with $H_0: \beta_1 + \beta_3 Z = 0$ and $df = N - K - 1$, where K is the number of independent variables (including the product term).

The number of simple slopes varies with the nature of the moderator. With a categorical moderator, there is a fixed number of simple slopes; with a continuous moderator, there are infinitely many. For a continuous moderator, a popular method is pick-a-point (). The 3 most frequently used points are $Z_1 = \bar{Z}$, $Z_2 = \bar{Z} + SD_Z$, and $Z_3 = \bar{Z} - SD_Z$, so we need to perform 3 t-tests.

#> [1] 1.429384
#>       lhs op            rhs  label        est        se         z       pvalue   ci.lower
#> 15 SSHigh :=    b1+b3*1.429 SSHigh  1.4374781 0.1691846  8.496507 0.000000e+00  1.1058824
#> 16  SSMod :=        b1+b3*0  SSMod  0.5548651 0.1053937  5.264692 1.404247e-07  0.3482973
#> 17  SSLow := b1+b3*(-1.429)  SSLow -0.3277478 0.1263960 -2.593023 9.513644e-03 -0.5754795

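
The pick-a-point test can also be computed by hand from an ordinary regression fit. The following is a minimal sketch, not the analysis that produced the output above: it assumes a model fitted with `lm()` on simulated data, and the helper name `simple_slope()` is hypothetical. It applies the t formula for the simple slope using the coefficient covariance matrix returned by `vcov()`.

```r
# Simple slope of X at a chosen moderator value z0, for an lm fit with terms X, Z, X:Z
simple_slope <- function(fit, z0, x = "X", xz = "X:Z") {
  b <- coef(fit)
  V <- vcov(fit)
  est <- unname(b[x] + b[xz] * z0)
  se <- sqrt(V[x, x] + 2 * z0 * V[x, xz] + z0^2 * V[xz, xz])
  t <- est / se
  df <- df.residual(fit)  # N - K - 1
  p <- 2 * pt(abs(t), df = df, lower.tail = FALSE)
  data.frame(z0 = z0, est = est, se = se, t = t, df = df, p = p)
}

set.seed(4)
n <- 250
# Hypothetical data with a continuous moderator
X <- rnorm(n)
Z <- rnorm(n, sd = 1.4)
Y <- 0.5 * X + 0.2 * Z + 0.6 * X * Z + rnorm(n)
d <- data.frame(X, Z, Y)
d$Z <- d$Z - mean(d$Z)  # center the moderator so that Z = 0 is its mean

fit <- lm(Y ~ X * Z, data = d)
points <- c(low = -sd(d$Z), mean = 0, high = sd(d$Z))
res_pick <- do.call(rbind, lapply(points, function(z0) simple_slope(fit, z0)))
print(res_pick)
```

An equivalent check is to refit the model with the moderator shifted by the chosen point; the coefficient of X in the refitted model is then the simple slope at that point.
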
  • Johnson-Neyman test

Johnson & Neyman () proposed another test for continuous moderators. Because Z is continuous, there are infinitely many simple slopes to test, each corresponding to a t statistic from which a p-value is calculated. We would therefore like to know the range of Z within which every value of Z yields a significant t.

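From the aforementioned t-test formula, a simple slope is significant when

$$
\left|\frac{\hat\beta_1 + \hat\beta_3 Z}{\sqrt{\operatorname{var}(\hat\beta_1) + 2Z\operatorname{cov}(\hat\beta_1, \hat\beta_3) + Z^2\operatorname{var}(\hat\beta_3)}}\right| > t_c,
$$

which, after squaring both sides, is equivalent to

$$
t_c^2\operatorname{var}(\hat\beta_1) + 2t_c^2Z\operatorname{cov}(\hat\beta_1, \hat\beta_3) + t_c^2Z^2\operatorname{var}(\hat\beta_3) < \hat\beta_1^2 + 2\hat\beta_1\hat\beta_3Z + \hat\beta_3^2Z^2,
$$

where $t_c$ is the right-tail critical value of the t distribution with the given df. The last expression is a quadratic inequality in Z and can be rewritten as

$$
\left[t_c^2\operatorname{var}(\hat\beta_3) - \hat\beta_3^2\right]Z^2 + \left[2t_c^2\operatorname{cov}(\hat\beta_1, \hat\beta_3) - 2\hat\beta_1\hat\beta_3\right]Z + \left[t_c^2\operatorname{var}(\hat\beta_1) - \hat\beta_1^2\right] < 0.
$$

Let $a = t_c^2\operatorname{var}(\hat\beta_3) - \hat\beta_3^2$, $b = 2t_c^2\operatorname{cov}(\hat\beta_1, \hat\beta_3) - 2\hat\beta_1\hat\beta_3$, and $c = t_c^2\operatorname{var}(\hat\beta_1) - \hat\beta_1^2$. The roots of $aZ^2 + bZ + c = 0$ are

$$
Z = \frac{-b \pm \sqrt{b^2 - 4ac}}{2a}.
$$

The values of Z with a significant simple slope are those for which the graph of $f(Z) = aZ^2 + bZ + c$ lies below zero (the shaded area). This is a single interval between the two roots when $a > 0$, or two regions outside the roots when $a < 0$.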

For linear regression with fixed x, one can use the following code:

library(interactions)
# fit the basic moderation model by OLS; Y ~ X * Z expands to X + Z + X:Z
res_lm <- lm(Y ~ X * Z, data = df_cont)
# Johnson-Neyman interval for the slope of X across values of the moderator Z
johnson_neyman(res_lm, pred = "X", modx = "Z", alpha = 0.05)
#> JOHNSON-NEYMAN INTERVAL
#> 
#> When Z is OUTSIDE the interval [-1.29, -0.56], the slope of X is p < .05.
#> 
#> Note: The range of observed values of Z is [-2.85, 3.10]

For the SEM-based approach, one can extract tc and all required parameter estimates from the output of lavaan and manually calculate a, b, c, and the two roots Z1 and Z2.
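
The manual calculation amounts to solving the quadratic in Z. The sketch below illustrates the arithmetic with an `lm()` fit on simulated data rather than a lavaan fit, purely so that it is self-contained; the helper name `jn_bounds()` and the data are hypothetical. With lavaan, the estimates and their (co)variances would presumably be taken from the fitted object in the same way.

```r
# Johnson-Neyman bounds for the slope of X, from the quadratic a*Z^2 + b*Z + c
jn_bounds <- function(fit, alpha = 0.05, x = "X", xz = "X:Z") {
  est <- coef(fit)
  V <- vcov(fit)
  tc <- qt(1 - alpha / 2, df = df.residual(fit))  # right-tail critical value
  a <- tc^2 * V[xz, xz] - est[[xz]]^2
  b <- 2 * tc^2 * V[x, xz] - 2 * est[[x]] * est[[xz]]
  c0 <- tc^2 * V[x, x] - est[[x]]^2
  disc <- b^2 - 4 * a * c0
  if (disc < 0) {
    return(list(a = a, roots = c(NA_real_, NA_real_)))  # no real roots
  }
  roots <- sort((-b + c(-1, 1) * sqrt(disc)) / (2 * a))
  list(a = a, roots = roots)
}

set.seed(5)
n <- 250
# Hypothetical data with a clear moderation effect
X <- rnorm(n)
Z <- rnorm(n)
Y <- 0.5 * X + 0.2 * Z + 0.6 * X * Z + rnorm(n)
fit <- lm(Y ~ X * Z)

jn <- jn_bounds(fit)
print(jn)
```

When `jn$a` is negative, the slope of X is significant for values of Z outside the two roots; when it is positive, the slope is significant between them.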

5.1.2.2 Categorical X

Suppose X is a 3-level categorical variable that is dummy coded, with the 1st category as the reference. The basic moderation model becomes Y=β0+β1d2X+β2d3X+β3Z+β4d2XZ+β5d3XZ+ϵ=β0+β3Z+(β1+β4Z)d2X+(β2+β5Z)d3X+ϵ.

  • When X=1, d2X=d3X=0, and we have Y=β0+β3Z+ϵ.
  • When X=2, d2X=1 and d3X=0, and we have Y=β0+β1+(β3+β4)Z+ϵ.
  • When X=3, d2X=0 and d3X=1, and we have Y=β0+β2+(β3+β5)Z+ϵ.

Assume that all coefficients are significant. If we removed the terms containing Z, the predicted value of Y would depend only on X, implying that X has an effect on Y. After introducing Z, the difference in predicted Y between categories of X becomes a function of Z, i.e., the effect of X on Y is moderated by Z.
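
As an illustration of the dummy-coded model, the sketch below fits it with `lm()` on simulated data (all values and names are hypothetical). `lm()` dummy codes the factor with category 1 as the reference, so the coefficient named `Z` plays the role of β3, `X2` and `X3` the roles of β1 and β2, and `X2:Z` and `X3:Z` the roles of β4 and β5.

```r
set.seed(6)
n <- 300
# Hypothetical data: X has 3 categories, Z is a continuous moderator
X <- factor(sample(1:3, n, replace = TRUE))
Z <- rnorm(n)
slopes_true <- c(0.2, 0.6, 1.0)
Y <- 0.5 * as.integer(X) + slopes_true[as.integer(X)] * Z + rnorm(n)
d <- data.frame(X, Z, Y)

fit <- lm(Y ~ X * Z, data = d)
b <- coef(fit)

# Slope of Z within each category of X
slope_by_X <- c(
  X1 = b[["Z"]],
  X2 = b[["Z"]] + b[["X2:Z"]],
  X3 = b[["Z"]] + b[["X3:Z"]]
)
print(slope_by_X)

# Difference in predicted Y between category 2 and the reference, as a function of Z
effect_X2 <- function(z) b[["X2"]] + b[["X2:Z"]] * z
print(effect_X2(c(-1, 0, 1)))
```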

5.2 When the moderator is categorical

5.2.1 Continuous X

When the moderator is a 3-level categorical variable with the 1st category as the reference, we have Y=β0+β1X+β2d2Z+β3d3Z+β4Xd2Z+β5Xd3Z+ϵ. In this model there are 3 simple slopes to be tested:

  • for Z=1, we have Y=β0+β1X;
  • for Z=2, we have Y=β0+β2+(β1+β4)X;
  • for Z=3, we have Y=β0+β3+(β1+β5)X.

For example,

First rows of the data:

#>             X d_Z_1 d_Z_2 d_Z_3      Xd_Z_1 Xd_Z_2 Xd_Z_3          Y
#> 1 -0.56047565     1     0     0 -0.56047565      0      0 -0.4954800
#> 2 -0.23017749     1     0     0 -0.23017749      0      0 -0.3677777
#> 3  1.55870831     1     0     0  1.55870831      0      0  0.3408155
#> 4  0.07050839     1     0     0  0.07050839      0      0 -0.5172591
#> 5  0.12928774     1     0     0  0.12928774      0      0  0.1274843
#> 6  1.71506499     1     0     0  1.71506499      0      0  1.6887117

Parameter estimates:

#>   lhs op    rhs label       est        se        z       pvalue  ci.lower
#> 1   Y  ~      X    b1 0.4498383 0.1080804 4.162069 3.153765e-05 0.2380045
#> 2   Y  ~  d_Z_2    b2 0.6231661 0.1395988 4.463979 8.045144e-06 0.3495575
#> 3   Y  ~  d_Z_3    b3 0.4872201 0.1397282 3.486913 4.886307e-04 0.2133578
#> 4   Y  ~ Xd_Z_2    b4 0.4165192 0.1486292 2.802406 5.072307e-03 0.1252114
#> 5   Y  ~ Xd_Z_3    b5 0.5681581 0.1498965 3.790336 1.504437e-04 0.2743663

The whole moderation model is Ŷ=0.468+0.450X+0.623dZ2+0.487dZ3+0.417XdZ2+0.568XdZ3, thus,

  • for Z=1, we have Ŷ=0.468+0.450X;
  • for Z=2, we have Ŷ=0.468+0.623+(0.450+0.417)X;
  • for Z=3, we have Ŷ=0.468+0.487+(0.450+0.568)X.

k <- 5              # number of predictors, including the product terms
n <- nrow(df_cate)  # sample size (assumed here to be the number of rows of df_cate)
pars <- parameterestimates(res)

# estimates, variances, and covariances needed for the simple slopes
b1 <- pars$est[[1]]
b4 <- pars$est[[4]]
b5 <- pars$est[[5]]
var_b1 <- pars$se[[1]]^2
var_b4 <- pars$se[[4]]^2
var_b5 <- pars$se[[5]]^2
cov_b1b4 <- vcov(res)["b1", "b4"]
cov_b1b5 <- vcov(res)["b1", "b5"]

# two-sided p-value of a t statistic with n - k - 1 degrees of freedom
p_two_sided <- function(t) 2 * pt(abs(t), df = n - k - 1, lower.tail = FALSE)

# simple slope when Z = 1: b1
t_Z1 <- b1 / sqrt(var_b1)
# simple slope when Z = 2: b1 + b4
t_Z2 <- (b1 + b4) / sqrt(var_b1 + 2 * cov_b1b4 + var_b4)
# simple slope when Z = 3: b1 + b5
t_Z3 <- (b1 + b5) / sqrt(var_b1 + 2 * cov_b1b5 + var_b5)

test_simple_slope <- data.frame(
  t = c(t_Z1, t_Z2, t_Z3),
  p = p_two_sided(c(t_Z1, t_Z2, t_Z3))
)
print(test_simple_slope)
#>          t            p
#> 1 4.162069 4.146295e-05
#> 2 8.491557 1.026076e-15
#> 3 9.801312 8.262378e-20

To avoid calculating t manually for the remaining 2 simple slopes, we can simply switch the reference group.

# category 2 as reference
model_Z2 <- "
  Y ~ b1*X + b2*d_Z_1 + b3*d_Z_3 + b4*Xd_Z_1 + b5*Xd_Z_3
"
res_model_Z2 <- sem(
  model = model_Z2, 
  data = df_cate,
  meanstructure = TRUE
)
parameterestimates(res_model_Z2)[1:5, ]
#>   lhs op    rhs label        est        se          z       pvalue   ci.lower
#> 1   Y  ~      X    b1  0.8663575 0.1020258  8.4915571 0.000000e+00  0.6663907
#> 2   Y  ~  d_Z_1    b2 -0.6231661 0.1395988 -4.4639790 8.045144e-06 -0.8967747
#> 3   Y  ~  d_Z_3    b3 -0.1359460 0.1398174 -0.9723112 3.308958e-01 -0.4099830
#> 4   Y  ~ Xd_Z_1    b4 -0.4165192 0.1486292 -2.8024055 5.072307e-03 -0.7078271
#> 5   Y  ~ Xd_Z_3    b5  0.1516388 0.1455913  1.0415374 2.976262e-01 -0.1337149

# category 3 as reference
model_Z3 <- "
  Y ~ b1*X + b2*d_Z_1 + b3*d_Z_2 + b4*Xd_Z_1 + b5*Xd_Z_2
"
res_model_Z3 <- sem(
  model = model_Z3, 
  data = df_cate,
  meanstructure = TRUE
)
parameterestimates(res_model_Z3)[1:5, ]
#>   lhs op    rhs label        est        se          z       pvalue   ci.lower
#> 1   Y  ~      X    b1  1.0179963 0.1038633  9.8013120 0.0000000000  0.8144280
#> 2   Y  ~  d_Z_1    b2 -0.4872201 0.1397282 -3.4869128 0.0004886307 -0.7610824
#> 3   Y  ~  d_Z_2    b3  0.1359460 0.1398174  0.9723112 0.3308957772 -0.1380910
#> 4   Y  ~ Xd_Z_1    b4 -0.5681581 0.1498965 -3.7903359 0.0001504437 -0.8619498
#> 5   Y  ~ Xd_Z_2    b5 -0.1516388 0.1455913 -1.0415374 0.2976261849 -0.4369926
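
The same reference-switching idea carries over to ordinary regression. The sketch below is an illustration with `lm()` and `relevel()` on simulated data, not the lavaan models used in this section; all data and object names are hypothetical. It shows that the coefficient of X after releveling reproduces the manually computed simple slope and its standard error.

```r
set.seed(7)
n <- 300
# Hypothetical data: Z is a 3-level categorical moderator
Z <- factor(sample(1:3, n, replace = TRUE))
X <- rnorm(n)
Y <- 0.5 + c(0.4, 0.9, 1.0)[as.integer(Z)] * X + 0.3 * as.integer(Z) + rnorm(n)
d <- data.frame(X, Z, Y)

# Reference = category 1: the slope for category 2 is b1 + b4, with a combined SE
fit1 <- lm(Y ~ X * Z, data = d)
b <- coef(fit1)
V <- vcov(fit1)
est_manual <- b[["X"]] + b[["X:Z2"]]
se_manual <- sqrt(V["X", "X"] + 2 * V["X", "X:Z2"] + V["X:Z2", "X:Z2"])

# Reference = category 2: the coefficient of X is that simple slope directly
d$Z_ref2 <- relevel(d$Z, ref = "2")
fit2 <- lm(Y ~ X * Z_ref2, data = d)
row_X <- summary(fit2)$coefficients["X", ]

print(rbind(
  manual    = c(est = est_manual, se = se_manual),
  releveled = c(est = row_X[["Estimate"]], se = row_X[["Std. Error"]])
))
```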

Manually visualize the moderation effect.

library(ggplot2)

# endpoints of the observed range of X, repeated for the three categories of Z
x <- rep(range(df_cate$X), times = 3)
y <- numeric(6)
# predicted values from the estimated model, one line per category of Z
y[1:2] <- 0.468 + 0.450 * x[1:2]
y[3:4] <- 0.468 + 0.623 + (0.450 + 0.417) * x[3:4]
y[5:6] <- 0.468 + 0.487 + (0.450 + 0.568) * x[5:6]
df_plot <- data.frame(x, y, z = factor(rep(1:3, each = 2)))
ggplot(df_plot, aes(x = x, y = y, color = z)) +
  geom_line()

If linear regression with fixed x is used to fit a basic moderation model, the interactions package can be used to visualize the interaction effect.

5.2.2 Categorical X

When both X and Z are categorical, we simply use ANOVA to conduct the moderation analysis.

5.3 Typical procedures of moderation analysis

In summary, the typical procedures of moderation analysis are:

  1. center or standardize continuous independent variables, if any;
  2. dummy code categorical variables, if any;
  3. construct product term;
  4. test significance of β3;
  5. simple slope analysis and visualization.

5.4 Real data example

The following example is from Hayes () (p245). In this study (), 211 participants read a news story about a famine in Africa that was reportedly caused by severe droughts affecting the region. For half of the participants, the story attributed the droughts to the effects of climate change, whereas for the other half, the story provided no information suggesting that climate change was responsible for the droughts. I refer to these as the “climate change” and “natural causes” conditions, respectively. They are coded in a variable named FRAME in the data, which is set to 0 for those in the natural causes condition and 1 for those in the climate change condition.

After reading this story, the participants were asked a set of questions assessing how much they agreed or disagreed with various justifications for not providing aid to the victims, for example, that they did not deserve help, that the victims had themselves to blame for their situation, that the donations would not be helpful or effective, and so forth. Responses to these questions were aggregated and are held in a variable named JUSTIFY that quantifies the strength of a participant’s justifications for withholding aid. So higher scores on JUSTIFY reflect a stronger sense that helping out the victims was not justified. The participants also responded to a set of questions about their beliefs about whether climate change is a real phenomenon. This measure of climate change skepticism is named SKEPTIC in the data, and the higher a participant’s score, the more skeptical he or she is about the reality of climate change.

The purpose of this analysis is to examine whether framing the disaster as caused by climate change rather than leaving the cause unspecified influences people’s justifications for not helping, and also whether this effect of framing is dependent on a person’s skepticism about climate change.

  • X independent variable: FRAME, categorical, 0 natural causes condition, 1 climate change condition
  • Y dependent variable: JUSTIFY, continuous
  • Z moderator: SKEPTIC, continuous

$$
\hat{Y} = 2.452 - 0.562X + 0.105Z + 0.201XZ
$$

For participants in the natural causes condition (X=0),

$$
\begin{cases}
\hat{Y} = 2.452 + 0.105 \times 2 = 2.662 & \text{if } Z = 2 \\
\hat{Y} = 2.452 + 0.105 \times 3.5 = 2.8195 & \text{if } Z = 3.5 \\
\hat{Y} = 2.452 + 0.105 \times 5 = 2.977 & \text{if } Z = 5,
\end{cases}
$$

and for participants in the climate change condition (X=1),

$$
\begin{cases}
\hat{Y} = 2.452 - 0.562 + 0.105 \times 2 + 0.201 \times 2 = 2.502 & \text{if } Z = 2 \\
\hat{Y} = 2.452 - 0.562 + 0.105 \times 3.5 + 0.201 \times 3.5 = 2.961 & \text{if } Z = 3.5 \\
\hat{Y} = 2.452 - 0.562 + 0.105 \times 5 + 0.201 \times 5 = 3.420 & \text{if } Z = 5.
\end{cases}
$$

From these calculations, it appears that participants lower in climate change skepticism reported weaker justifications for withholding aid when told the drought was caused by climate change compared to when not so told. However, among those at the higher end of the continuum of climate change skepticism, the opposite is observed. Participants high in skepticism about climate change who read the story attributing the drought to climate change reported stronger justifications for withholding aid than those who read the story that did not attribute the drought to climate change.
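
The predicted values can be reproduced from the reported coefficients alone. The sketch below uses only the four coefficients of the fitted equation; the function and object names are hypothetical. It also computes the conditional effect of FRAME at each level of SKEPTIC (β1+β3Z) and the value of SKEPTIC at which that effect changes sign.

```r
# Coefficients as reported in the fitted equation
b0 <- 2.452
b1 <- -0.562
b2 <- 0.105
b3 <- 0.201

predict_justify <- function(frame, skeptic) {
  b0 + b1 * frame + b2 * skeptic + b3 * frame * skeptic
}

# All combinations of condition and the three skepticism values used in the text
grid <- expand.grid(FRAME = c(0, 1), SKEPTIC = c(2, 3.5, 5))
grid$JUSTIFY_hat <- predict_justify(grid$FRAME, grid$SKEPTIC)
# Conditional effect of FRAME at each level of SKEPTIC: b1 + b3 * Z
grid$frame_effect <- b1 + b3 * grid$SKEPTIC
print(grid)

# Skepticism value at which the framing effect changes sign
crossover <- -b1 / b3
print(crossover)
```

The crossover is at a skepticism score of about 2.80, which is why the framing effect is negative at Z=2 and positive at Z=3.5 and Z=5.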

5.5 How to report moderation analysis