is outdated because of step 1, but we could still see the original idea.

3 regressions

• Step 1: $$X \to Y$$

• Step 2: $$X \to M$$

• Step 3: $$X + M \to Y$$

where

• $$X$$ = independent variable

• $$Y$$ = dependent variable

• $$M$$ = mediating variable

1. Originally, the first path $$X \to Y$$ suggested by Baron and Kenny (1986) needed to be significant. However, $$X$$ can have an indirect effect on $$Y$$ without a significant effect of $$X$$ on $$Y$$ in this step (e.g., when the effect is absorbed into $$M$$, or when two counteracting mediators $$M_1, M_2$$ cancel each other out).

Mathematically,

$Y = b_0 + b_1 X + \epsilon$

$$b_1$$ does not need to be significant.

2. We examine the effect of $$X$$ on $$M$$. This step requires a significant effect of $$X$$ on $$M$$ for the analysis to continue.

Mathematically,

$M = b_0 + b_2 X + \epsilon$

where $$b_2$$ needs to be significant.

3. In this step, we want the effect of $$M$$ on $$Y$$ to “absorb” most of the direct effect of $$X$$ on $$Y$$ (or at least make that effect smaller).

Mathematically,

$Y = b_0 + b_4 X + b_3 M + \epsilon$

$$b_4$$ needs to be either smaller or insignificant.

| The effect of $$X$$ on $$Y$$ | then $$M$$ … mediates between $$X$$ and $$Y$$ |
|---|---|
| completely disappears ($$b_4$$ insignificant) | Fully (i.e., full mediation) |
| partially disappears ($$b_4$$ smaller than in step 1) | Partially (i.e., partial mediation) |

4. Examine the mediation effect (i.e., whether it is significant)
• First approach: Sobel’s test

• Second approach: bootstrapping (preferable)

More details can be found here
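The two approaches in step 4 can be sketched by hand on simulated data. This is a minimal illustration, not the procedure of any particular package: the data-generating coefficients (0.5 and 0.6), the sample size, and all object names (`sim_data`, `fit_m`, `fit_y`, `indirect_once`) are assumptions made up for the example. Sobel's test uses the first-order delta-method standard error of the product of the two path coefficients, while the bootstrap resamples rows and takes percentile limits of the re-estimated product.

```r
set.seed(123)

# Hypothetical data: X affects Y only through M
n <- 500
X <- rnorm(n)
M <- 0.5 * X + rnorm(n)
Y <- 0.6 * M + rnorm(n)
sim_data <- data.frame(X, M, Y)

# Path a (X -> M) and path b (M -> Y, controlling for X)
fit_m <- lm(M ~ X, data = sim_data)
fit_y <- lm(Y ~ X + M, data = sim_data)

a    <- coef(fit_m)[["X"]]
b    <- coef(fit_y)[["M"]]
se_a <- coef(summary(fit_m))["X", "Std. Error"]
se_b <- coef(summary(fit_y))["M", "Std. Error"]

# Sobel's test: first-order delta-method standard error of a * b
sobel_se <- sqrt(b^2 * se_a^2 + a^2 * se_b^2)
sobel_z  <- (a * b) / sobel_se
sobel_p  <- 2 * pnorm(-abs(sobel_z))

# Percentile bootstrap: resample rows, re-estimate a * b each time
indirect_once <- function(d) {
  coef(lm(M ~ X, data = d))[["X"]] * coef(lm(Y ~ X + M, data = d))[["M"]]
}
boot_ab <- replicate(1000, {
  idx <- sample(nrow(sim_data), replace = TRUE)
  indirect_once(sim_data[idx, ])
})
boot_ci <- quantile(boot_ab, c(0.025, 0.975))

c(indirect = a * b, sobel_z = sobel_z, sobel_p = sobel_p)
boot_ci
```

Sobel's test relies on the product $$a \times b$$ being approximately normal, which need not hold in small samples; the percentile bootstrap does not impose that shape, which is one common argument for preferring it.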

### 34.1.1 Example 1

myData <-

# Step 1 (no longer necessary)
model.0 <- lm(Y ~ X, myData)

# Step 2
model.M <- lm(M ~ X, myData)

# Step 3
model.Y <- lm(Y ~ X + M, myData)

# Step 4 (bootstrapping)
library(mediation)
results <- mediate(
  model.M,
  model.Y,
  treat = 'X',
  mediator = 'M',
  boot = TRUE,
  sims = 500
)
summary(results)
#>
#> Causal Mediation Analysis
#>
#> Nonparametric Bootstrap Confidence Intervals with the Percentile Method
#>
#>                Estimate 95% CI Lower 95% CI Upper p-value
#> ACME             0.3565       0.2153         0.53  <2e-16 ***
#> ADE              0.0396      -0.1993         0.32    0.71
#> Total Effect     0.3961       0.1838         0.68  <2e-16 ***
#> Prop. Mediated   0.9000       0.4781         1.86  <2e-16 ***
#> ---
#> Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
#>
#> Sample Size Used: 100
#>
#>
#> Simulations: 500
• Total Effect = 0.3961 = $$b_1$$ (step 1) = total effect of $$X$$ on $$Y$$ without $$M$$

• Direct Effect = ADE = 0.0396 = $$b_4$$ (step 3) = direct effect of $$X$$ on $$Y$$ accounting for the indirect effect of $$M$$

• ACME = Average Causal Mediation Effects = $$b_1 - b_4$$ = 0.3961 - 0.0396 = 0.3565 = $$b_2 \times b_3$$ = 0.56102 * 0.6355 = 0.3565
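The equality $$b_1 - b_4 = b_2 \times b_3$$ is not a coincidence of this data set: when all three regressions are fit by OLS on the same sample, substituting the mediator equation into the outcome equation gives $$b_1 = b_4 + b_2 b_3$$ exactly. The following sketch checks this on simulated data; the coefficients and the name `sim_data` are assumptions chosen only for illustration.

```r
set.seed(42)

# Hypothetical data with partial mediation
n <- 100
X <- rnorm(n)
M <- 0.5 * X + rnorm(n)
Y <- 0.7 * M + 0.1 * X + rnorm(n)
sim_data <- data.frame(X, M, Y)

b1 <- coef(lm(Y ~ X, data = sim_data))[["X"]]  # total effect (step 1)
b2 <- coef(lm(M ~ X, data = sim_data))[["X"]]  # X -> M (step 2)

fit_y <- lm(Y ~ X + M, data = sim_data)        # step 3
b4 <- coef(fit_y)[["X"]]                       # direct effect
b3 <- coef(fit_y)[["M"]]                       # M -> Y given X

difference <- b1 - b4   # "difference in coefficients" form of the indirect effect
product    <- b2 * b3   # "product of coefficients" form of the indirect effect

all.equal(difference, product)

# Proportion mediated, as reported by summary(results)
product / b1
```

The identity is specific to linear models without an $$X \times M$$ interaction; with nonlinear mediator or outcome models the two forms generally differ.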

The example above uses the `mediation` package. More details on the package can be found here

The package supports 2 types of inference:

1. Model-based inference:

• Assumptions:

• Treatment is randomized (could use matching methods to achieve this).

• Sequential Ignorability: conditional on covariates, there are no other confounders that affect the relationship between (1) treatment-mediator, (2) treatment-outcome, (3) mediator-outcome. This is typically hard to argue in observational data. This assumption is needed for the identification of ACME (i.e., average causal mediation effects).

2. Design-based inference

Notation: we stay consistent with the package documentation

• $$M_i(t)$$ = mediator

• $$T_i$$ = treatment status $$(0,1)$$

• $$Y_i(t,m)$$ = outcome where $$t$$ = treatment, and $$m$$ = mediating variables.

• $$X_i$$ = vector of observed pre-treatment confounders

• Treatment effect (per unit $$i$$) = $$\tau_i = Y_i(1,M_i(1)) - Y_i (0,M_i(0))$$ which has 2 effects

• Causal mediation effects: $$\delta_i (t) \equiv Y_i (t,M_i(1)) - Y_i(t,M_i(0))$$

• Direct effects: $$\zeta_i (t) \equiv Y_i (1, M_i(t)) - Y_i(0, M_i(t))$$

• summing up to the treatment effect: $$\tau_i = \delta_i (t) + \zeta_i (1-t)$$
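Because both potential outcomes are never observed for the same unit, the decomposition is easiest to see in a simulation where they are known by construction. The sketch below assumes a hypothetical structural model (every coefficient, and the helper names `M_pot` and `Y_pot`, are invented for illustration) and takes the direct effect as $$\zeta_i(t) = Y_i(1, M_i(t)) - Y_i(0, M_i(t))$$, i.e., the mediator is held at its value under $$t$$ while the treatment changes.

```r
set.seed(2024)
n <- 1000

# Unit-level noise shared across potential outcomes
e_m <- rnorm(n)
e_y <- rnorm(n)

# Hypothetical potential-outcome functions, with a treatment-mediator interaction
M_pot <- function(t) 0.5 * t + e_m                              # M_i(t)
Y_pot <- function(t, m) 0.2 * t + 0.6 * m + 0.3 * t * m + e_y   # Y_i(t, m)

# Unit-level effects
tau   <- Y_pot(1, M_pot(1)) - Y_pot(0, M_pot(0))                # total effect
delta <- function(t) Y_pot(t, M_pot(1)) - Y_pot(t, M_pot(0))    # mediation effect
zeta  <- function(t) Y_pot(1, M_pot(t)) - Y_pot(0, M_pot(t))    # direct effect

# tau_i = delta_i(t) + zeta_i(1 - t) holds for both t = 0 and t = 1
stopifnot(isTRUE(all.equal(tau, delta(1) + zeta(0))))
stopifnot(isTRUE(all.equal(tau, delta(0) + zeta(1))))

# Averages over units: ACME(t) and ADE(t) differ by t because of the interaction
c(ATE    = mean(tau),
  ACME_1 = mean(delta(1)), ACME_0 = mean(delta(0)),
  ADE_1  = mean(zeta(1)),  ADE_0  = mean(zeta(0)))
```

In this setup the mediation effect under treatment is $$(0.6 + 0.3) \times 0.5 = 0.45$$ and under control is $$0.6 \times 0.5 = 0.30$$, which shows why $$\delta_i(t)$$ is indexed by $$t$$.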

More on sequential ignorability

$\{ Y_i (t', m) , M_i (t) \} \perp T_i |X_i = x$

$Y_i(t',m) \perp M_i(t) | T_i = t, X_i = x$

where

• $$0 < P(T_i = t | X_i = x)$$

• $$0 < P(M_i = m | T_i = t , X_i =x)$$

The first condition is the standard strong ignorability condition, where treatment assignment is random conditional on pre-treatment confounders.

The second condition is stronger: the mediator must also be as good as random given the observed treatment and pre-treatment confounders. This condition is satisfied only when there are no unobserved pre-treatment confounders, no post-treatment confounders, and no multiple mediators that are correlated.
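To see why the second condition matters even when treatment is randomized, consider a simulation in which an unobserved variable affects both the mediator and the outcome. This is a sketch under assumed values: the confounder `U`, the coefficients, and the sample size are all hypothetical. The "oracle" fit that adjusts for `U` is only possible here because the data are simulated.

```r
set.seed(7)
n <- 5000

U  <- rnorm(n)               # confounder of M and Y (unobserved in practice)
Tr <- rbinom(n, 1, 0.5)      # randomized treatment, so the first condition holds
M  <- 0.5 * Tr + 0.8 * U + rnorm(n)
Y  <- 0.2 * Tr + 0.5 * M + 0.8 * U + rnorm(n)

true_acme <- 0.5 * 0.5       # (Tr -> M) times (M -> Y) in the simulated model

# Naive estimate: U is omitted, so the M -> Y coefficient absorbs part of U's effect
naive_acme <- coef(lm(M ~ Tr))[["Tr"]] * coef(lm(Y ~ Tr + M))[["M"]]

# Oracle estimate: U is adjusted for in both models
oracle_acme <- coef(lm(M ~ Tr + U))[["Tr"]] * coef(lm(Y ~ Tr + M + U))[["M"]]

c(true = true_acme, naive = naive_acme, oracle = oracle_acme)
```

Randomizing the treatment protects the $$T \to M$$ path, but not the $$M \to Y$$ path, so the naive product of coefficients overstates the mediation effect in this setup.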

To my knowledge at the time of writing, there is no way to test the sequential ignorability assumption directly. Hence, researchers can only use sensitivity analysis to argue for their results.
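A sensitivity analysis of this kind can be sketched with `medsens()` from the `mediation` package, which varies the correlation $$\rho$$ between the error terms of the mediator and outcome models and reports how the estimated ACME changes. The simulated data, the variable names (`treat`, `sim_data`, `model_m`, `model_y`), and the small `sims` values below are assumptions chosen to keep the example quick, not recommended settings.

```r
library(mediation)

set.seed(1)

# Hypothetical data with a binary randomized treatment
n <- 200
treat <- rbinom(n, 1, 0.5)
M <- 0.5 * treat + rnorm(n)
Y <- 0.6 * M + 0.1 * treat + rnorm(n)
sim_data <- data.frame(treat, M, Y)

# Mediator and outcome models
model_m <- lm(M ~ treat, data = sim_data)
model_y <- lm(Y ~ treat + M, data = sim_data)

# Mediation analysis (default quasi-Bayesian approximation)
med_out <- mediate(model_m, model_y,
                   treat = "treat", mediator = "M", sims = 200)

# Sensitivity of the ACME to the error correlation rho, on a grid of step 0.1
sens_out <- medsens(med_out, rho.by = 0.1,
                    effect.type = "indirect", sims = 200)
summary(sens_out)
```

The summary lists the values of $$\rho$$ at which the confidence interval for the ACME includes zero; the larger $$|\rho|$$ has to be before that happens, the more robust the finding is to a violation of sequential ignorability. Calling `plot(sens_out)` shows the same information graphically.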

### References

Baron, Reuben M, and David A Kenny. 1986. “The Moderator–Mediator Variable Distinction in Social Psychological Research: Conceptual, Strategic, and Statistical Considerations.” Journal of Personality and Social Psychology 51 (6): 1173.
Imai, Kosuke, Luke Keele, and Dustin Tingley. 2010. “A General Approach to Causal Mediation Analysis.” Psychological Methods 15 (4): 309.
Imai, Kosuke, Luke Keele, and Teppei Yamamoto. 2010. “Identification, Inference and Sensitivity Analysis for Causal Mediation Effects.”
Preacher, Kristopher J, and Andrew F Hayes. 2004. “SPSS and SAS Procedures for Estimating Indirect Effects in Simple Mediation Models.” Behavior Research Methods, Instruments, & Computers 36: 717–31.
Sobel, Michael E. 1982. “Asymptotic Confidence Intervals for Indirect Effects in Structural Equation Models.” Sociological Methodology 13: 290–312.