32.13 Using Multiple Outcomes to Improve the Synthetic Control Method
Typically, SCM constructs a weighted combination of untreated control units to approximate the counterfactual outcome of the treated unit. However, standard SCM is limited to a single outcome variable, which can lead to biased estimates when multiple correlated outcomes are available.
In their work, L. Sun, Ben-Michael, and Feller (2023) propose a novel extension of SCM that leverages multiple outcome variables to improve causal inference by:
- Using a common set of synthetic control weights across all outcomes rather than estimating separate weights for each outcome.
- Reducing bias using a low-rank factor model, which exploits shared latent structures across outcomes.
32.13.1 Standard Synthetic Control Method
Let $Y_{itk}$ denote the observed outcome for unit $i$ at time $t$ for outcome $k$, where $i = 1, \dots, N$, $t = 1, \dots, T$, and $k = 1, \dots, K$. The potential outcomes framework assumes:

$$
Y_{itk}(d) = \mu_{itk} + \delta_{itk} d + \varepsilon_{itk}, \quad d \in \{0, 1\}
$$

where:
- $\mu_{itk}$ represents the latent structure of untreated outcomes.
- $\delta_{itk}$ is the treatment effect.
- $\varepsilon_{itk} \sim N(0, \sigma^2)$ is random noise.
For unit $i = 1$ (the treated unit), the observed outcome follows:

$$
Y_{1tk} = Y_{1tk}(0) + D_{1t} \delta_{1tk}
$$

where $D_{1t}$ is an indicator for treatment at time $t$. The challenge is to estimate the counterfactual outcome $Y_{1tk}(0)$, which is unobserved post-treatment.
SCM estimates $Y_{1tk}(0)$ as a weighted combination of control units:

$$
\hat{Y}_{1tk}(0) = \sum_{i=2}^{N} w_i Y_{itk}
$$
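where the weights $w_i$ are chosen to minimize the pre-treatment imbalance between the treated unit and the weighted controls, typically subject to $w_i \ge 0$ and $\sum_{i=2}^{N} w_i = 1$.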
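The weight problem above can be sketched in a few lines of Python. This is a minimal illustration rather than the estimator from any particular package: it assumes the usual SCM constraints (non-negative weights that sum to one), matches only pre-treatment values of a single outcome, and uses plain projected gradient descent. The function names `project_to_simplex` and `scm_weights` and the simulated data are hypothetical.

```python
import numpy as np


def project_to_simplex(v):
    """Euclidean projection of v onto {w : w >= 0, sum(w) = 1}."""
    u = np.sort(v)[::-1]                       # sort descending
    css = np.cumsum(u)
    ks = np.arange(1, len(v) + 1)
    rho = np.nonzero(u * ks > css - 1.0)[0][-1]
    theta = (css[rho] - 1.0) / (rho + 1.0)     # shift that makes the result sum to one
    return np.maximum(v - theta, 0.0)


def scm_weights(y_treated_pre, Y_controls_pre, n_iter=2000):
    """Minimize ||y_treated_pre - Y_controls_pre @ w||^2 over the simplex.

    y_treated_pre has shape (T0,); Y_controls_pre has shape (T0, N - 1).
    """
    n_controls = Y_controls_pre.shape[1]
    w = np.full(n_controls, 1.0 / n_controls)  # start from uniform weights
    # Step size 1/L, with L the Lipschitz constant of the gradient.
    step = 1.0 / (2.0 * np.linalg.norm(Y_controls_pre, 2) ** 2)
    for _ in range(n_iter):
        grad = 2.0 * Y_controls_pre.T @ (Y_controls_pre @ w - y_treated_pre)
        w = project_to_simplex(w - step * grad)
    return w


# Toy data (assumed): 3 control units, 20 pre-treatment periods, and a treated
# unit that is an exact convex combination of the controls.
rng = np.random.default_rng(0)
Y_controls_pre = rng.normal(size=(20, 3))
true_w = np.array([0.5, 0.3, 0.2])
y_treated_pre = Y_controls_pre @ true_w

w_hat = scm_weights(y_treated_pre, Y_controls_pre)
print(np.round(w_hat, 3))
```

Because the toy treated unit lies exactly in the convex hull of the controls, the fitted weights should sit very close to `true_w`. With real data the fit is rarely exact, and the remaining pre-treatment imbalance is what the multiple-outcome approach tries to handle more carefully.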
32.13.2 Using Multiple Outcomes for Bias Reduction
Instead of estimating separate weights $w_k$ for each outcome $k$, L. Sun, Ben-Michael, and Feller (2023) propose a single set of weights $w$ shared across all outcomes. This approach is justified under a low-rank factor model, which assumes that multiple outcomes share common latent factors.
32.13.2.1 Low-Rank Factor Model
Assume the untreated potential outcome follows a linear factor structure:

$$
Y_{itk}(0) = X_{it} \beta_k + \lambda_i' f_{tk} + \varepsilon_{itk}
$$

where:
- $X_{it}$ are observed covariates.
- $\beta_k$ are outcome-specific coefficients.
- $\lambda_i$ are unit-specific factor loadings.
- $f_{tk}$ are time-and-outcome-specific latent factors.
If all outcomes share the same latent factor structure, then pooling them lets the synthetic control average out outcome-specific noise, and the resulting bias can shrink by a factor on the order of $1/\sqrt{K}$ as the number of outcomes $K$ increases.
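To make the role of $K$ concrete, the following simulation sketch draws untreated outcomes from a factor model of the form above (omitting the covariate term) and compares the idiosyncratic noise in a single outcome with the noise left after averaging across outcomes. The dimensions, the seed, and the independence of the noise across outcomes are all assumptions made for illustration. The sketch shows only the noise-averaging intuition behind the $1/\sqrt{K}$ rate, not the formal bias result.

```python
import numpy as np

rng = np.random.default_rng(0)

# Assumed dimensions: units, periods, outcomes, number of latent factors.
N, T, K, r = 50, 40, 16, 2
sigma = 1.0

lam = rng.normal(size=(N, r))                    # unit-specific loadings lambda_i
f = rng.normal(size=(T, K, r))                   # time-and-outcome factors f_tk
signal = np.einsum("ir,tkr->itk", lam, f)        # lambda_i' f_tk
eps = rng.normal(scale=sigma, size=(N, T, K))    # idiosyncratic noise
Y0 = signal + eps                                # untreated potential outcomes

# Noise in one outcome versus noise after averaging over the K outcomes.
noise_sd_single = eps[:, :, 0].std()
noise_sd_avg = eps.mean(axis=2).std()
ratio = noise_sd_avg / noise_sd_single

print(f"empirical ratio: {ratio:.3f}, 1/sqrt(K): {1 / np.sqrt(K):.3f}")
```

With independent noise the empirical ratio should land near $1/\sqrt{K}$. Noise that is correlated across outcomes would weaken this, and averaging helps only if the shared signal does not cancel out across outcomes as well.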
32.13.3 Estimation Methods
L. Sun, Ben-Michael, and Feller (2023) propose two methods for constructing a common synthetic control:
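Concatenated Outcome Weights: Estimate weights by minimizing imbalance across all outcomes simultaneously:

$$
\hat{w} = \arg\min_{w} \sum_{k=1}^{K} \left\lVert Y_{1,\text{pre},k} - \sum_{i=2}^{N} w_i Y_{i,\text{pre},k} \right\rVert^2
$$

Averaged Outcome Weights: Estimate weights based on a linear combination (e.g., average) of outcomes:

$$
\hat{w} = \arg\min_{w} \left\lVert \frac{1}{K} \sum_{k=1}^{K} Y_{1,\text{pre},k} - \sum_{i=2}^{N} w_i \frac{1}{K} \sum_{k=1}^{K} Y_{i,\text{pre},k} \right\rVert^2
$$

Here $Y_{i,\text{pre},k}$ denotes the vector of pre-treatment values of outcome $k$ for unit $i$.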
These methods improve SCM performance by reducing variance and overfitting to noise.
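The two approaches differ only in how the pre-treatment data are arranged before the weights are fit, which the sketch below tries to make explicit. It is a hedged illustration, not the authors' implementation: it assumes simplex-constrained weights, outcomes that are already on comparable scales (in practice they would usually be standardized first), and a simple projected gradient solver. The function names and the simulated data are hypothetical.

```python
import numpy as np


def project_to_simplex(v):
    """Euclidean projection of v onto {w : w >= 0, sum(w) = 1}."""
    u = np.sort(v)[::-1]
    css = np.cumsum(u)
    ks = np.arange(1, len(v) + 1)
    rho = np.nonzero(u * ks > css - 1.0)[0][-1]
    return np.maximum(v - (css[rho] - 1.0) / (rho + 1.0), 0.0)


def simplex_least_squares(target, donors, n_iter=2000):
    """Minimize ||target - donors @ w||^2 over the simplex by projected gradient."""
    w = np.full(donors.shape[1], 1.0 / donors.shape[1])
    step = 1.0 / (2.0 * np.linalg.norm(donors, 2) ** 2)
    for _ in range(n_iter):
        grad = 2.0 * donors.T @ (donors @ w - target)
        w = project_to_simplex(w - step * grad)
    return w


def concatenated_weights(Y1_pre, Y0_pre):
    """Stack all K outcome series into one long vector per unit, then fit.

    Y1_pre: (T0, K) treated unit; Y0_pre: (N - 1, T0, K) control units.
    """
    target = Y1_pre.reshape(-1)                        # length T0 * K
    donors = Y0_pre.reshape(Y0_pre.shape[0], -1).T     # (T0 * K, N - 1)
    return simplex_least_squares(target, donors)


def averaged_weights(Y1_pre, Y0_pre):
    """Average the K outcomes at each period, then fit on the averaged series."""
    target = Y1_pre.mean(axis=1)                       # length T0
    donors = Y0_pre.mean(axis=2).T                     # (T0, N - 1)
    return simplex_least_squares(target, donors)


# Toy data (assumed): 4 controls, 30 pre-periods, 3 outcomes, no noise.
rng = np.random.default_rng(1)
Y0_pre = rng.normal(size=(4, 30, 3))
true_w = np.array([0.4, 0.3, 0.2, 0.1])
Y1_pre = np.tensordot(true_w, Y0_pre, axes=1)          # (30, 3)

w_concat = concatenated_weights(Y1_pre, Y0_pre)
w_avg = averaged_weights(Y1_pre, Y0_pre)
print(np.round(w_concat, 3), np.round(w_avg, 3))
```

In this noise-free toy example both approaches should recover roughly the same weights. They diverge once noise is present: concatenation matches every outcome series period by period, while averaging matches only the averaged series, so it has fewer quantities to fit.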
32.13.4 Empirical Application: Flint Water Crisis
To illustrate the benefits of multiple-outcome SCM, L. Sun, Ben-Michael, and Feller (2023) re-analyze the Flint water crisis, in which the city's drinking water was contaminated with lead, potentially affecting student performance.
Four key educational outcomes were studied:
- Math Achievement
- Reading Achievement
- Special Needs Status
- Daily Attendance
By applying common weights across these outcomes, their SCM results showed:
- Reduced bias and improved robustness compared to separate SCM fits.
- Better pre-treatment fit for educational outcomes.
- Stronger evidence of educational impacts following the crisis.