32.13 Using Multiple Outcomes to Improve the Synthetic Control Method
The synthetic control method (SCM) constructs a weighted combination of untreated control units to approximate the counterfactual outcome of the treated unit. However, standard SCM fits its weights to a single outcome variable, which can lead to biased estimates and leaves information unused when multiple correlated outcomes are available.
In their work, L. Sun, Ben-Michael, and Feller (2023) propose a novel extension of SCM that leverages multiple outcome variables to improve causal inference by:
- Using a common set of synthetic control weights across all outcomes rather than estimating separate weights for each outcome.
- Reducing bias using a low-rank factor model, which exploits shared latent structures across outcomes.
32.13.1 Standard Synthetic Control Method
Let \(Y_{itk}\) denote the observed outcome for unit \(i\) at time \(t\) for outcome \(k\), where \(i = 1, \dots, N\), \(t = 1, \dots, T\), and \(k = 1, \dots, K\). The potential outcomes framework assumes:
\[ Y_{itk}(d) = \mu_{itk} + \delta_{itk} d + \varepsilon_{itk}, \quad d \in \{0,1\} \]
where:
- \(\mu_{itk}\) represents the latent structure of untreated outcomes.
- \(\delta_{itk}\) is the treatment effect.
- \(\varepsilon_{itk} \sim \mathcal{N}(0, \sigma^2)\) is random noise.
For unit \(i=1\) (the treated unit), the observed outcome follows:
\[ Y_{1tk} = Y_{1tk}(0) + D_{1t} \delta_{1tk} \]
where \(D_{1t}\) is an indicator for treatment at time \(t\). The challenge is to estimate the counterfactual outcome \(Y_{1tk}(0)\), which is unobserved post-treatment.
SCM estimates \(Y_{1tk}(0)\) as a weighted combination of control units:
\[ \hat{Y}_{1tk}(0) = \sum_{i=2}^{N} w_i Y_{itk} \]
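where the weights \(w_i\) are chosen to minimize pre-treatment imbalance between the treated unit and the weighted controls, typically subject to \(w_i \ge 0\) and \(\sum_{i=2}^{N} w_i = 1\).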
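The weight-fitting step can be sketched in a few lines of Python. This is a minimal illustration, not the implementation of any particular SCM package: it assumes the usual simplex constraint (non-negative weights that sum to one), solves the least-squares problem by projected gradient descent, and uses simulated data. The function names `project_to_simplex` and `scm_weights` and all data below are hypothetical.

```python
import numpy as np

def project_to_simplex(v):
    """Euclidean projection of v onto {w : w >= 0, sum(w) = 1}."""
    u = np.sort(v)[::-1]                      # sort descending
    css = np.cumsum(u) - 1.0
    ks = np.arange(1, len(v) + 1)
    rho = np.nonzero(u - css / ks > 0)[0][-1]  # last index satisfying the condition
    theta = css[rho] / (rho + 1.0)
    return np.maximum(v - theta, 0.0)

def scm_weights(y1_pre, Y0_pre, n_iter=5000):
    """Minimize ||y1_pre - Y0_pre @ w||^2 over the simplex.

    y1_pre: (T0,) pre-treatment outcomes of the treated unit.
    Y0_pre: (T0, N-1) pre-treatment outcomes of the control units.
    """
    n_controls = Y0_pre.shape[1]
    w = np.full(n_controls, 1.0 / n_controls)   # start from uniform weights
    # Step size 1/L, where L is the Lipschitz constant of the gradient.
    L = 2.0 * np.linalg.norm(Y0_pre, 2) ** 2
    for _ in range(n_iter):
        grad = 2.0 * Y0_pre.T @ (Y0_pre @ w - y1_pre)
        w = project_to_simplex(w - grad / L)
    return w

# Simulated example: the treated unit is an exact convex combination of controls.
rng = np.random.default_rng(0)
Y0_pre = rng.normal(size=(20, 4))               # 20 pre-periods, 4 controls
true_w = np.array([0.5, 0.3, 0.2, 0.0])
y1_pre = Y0_pre @ true_w

w_hat = scm_weights(y1_pre, Y0_pre)
print(np.round(w_hat, 3))
```

In this noiseless toy setting the fitted weights should match the weights used to build the treated series. With real data the pre-treatment fit is only approximate.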
32.13.2 Using Multiple Outcomes for Bias Reduction
Instead of estimating separate weights \(w_k\) for each outcome \(k\), L. Sun, Ben-Michael, and Feller (2023) propose a single set of weights \(w\) across all outcomes. This approach is justified under a low-rank factor model, which assumes that multiple outcomes share common latent factors.
32.13.2.1 Low-Rank Factor Model
Assume the untreated potential outcome follows a linear factor structure:
\[ Y_{itk}(0) = X_{it} \beta_k + \lambda_i' f_{tk} + \varepsilon_{itk} \]
where:
- \(X_{it}\) are observed covariates.
- \(\beta_k\) are outcome-specific coefficients.
- \(\lambda_i\) are unit-specific factor loadings.
- \(f_{tk}\) are time-and-outcome-specific latent factors.
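If all outcomes share the same latent factor structure, pooling them averages out idiosyncratic noise while preserving the shared signal, so the bias in synthetic control estimation that comes from overfitting to noise can be reduced by a factor on the order of \(1 / \sqrt{K}\) as the number of outcomes \(K\) increases.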
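The \(1 / \sqrt{K}\) rate has a simple intuition: if the noise terms \(\varepsilon_{itk}\) are independent across outcomes with common variance \(\sigma^2\), their average over \(K\) outcomes has variance \(\sigma^2 / K\), that is, standard deviation \(\sigma / \sqrt{K}\). The short simulation below checks only this averaging step under those assumptions (independent, equal-variance Gaussian noise). It is not a reproduction of the formal bias bound, and the values of `sigma`, `K`, and `n_draws` are arbitrary.

```python
import numpy as np

rng = np.random.default_rng(0)
sigma, K, n_draws = 1.0, 4, 200_000

# Independent noise draws for K outcomes (one row per draw).
eps = rng.normal(0.0, sigma, size=(n_draws, K))

sd_single = eps[:, 0].std()          # noise level of a single outcome
sd_avg = eps.mean(axis=1).std()      # noise level after averaging K outcomes
ratio = sd_avg / sd_single

print(f"empirical ratio: {ratio:.3f}, 1/sqrt(K): {1 / np.sqrt(K):.3f}")
```

If the noise is positively correlated across outcomes, the reduction is smaller than \(1 / \sqrt{K}\), which is one reason the shared-structure assumption matters.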
32.13.3 Estimation Methods
L. Sun, Ben-Michael, and Feller (2023) propose two methods for constructing a common synthetic control:
- Concatenated Outcome Weights: Estimate weights by minimizing imbalance across all outcomes simultaneously:
\[ \hat{w} = \arg\min_w \sum_{k=1}^{K} \left\| Y_{1,\text{pre},k} - \sum_{i=2}^{N} w_i Y_{i,\text{pre},k} \right\|_2^2 \]
- Averaged Outcome Weights: Estimate weights based on a linear combination (e.g., average) of outcomes:
\[ \hat{w} = \arg\min_w \left\| \frac{1}{K} \sum_{k=1}^{K} Y_{1,\text{pre},k} - \sum_{i=2}^{N} w_i \frac{1}{K} \sum_{k=1}^{K} Y_{i,\text{pre},k} \right\|_2^2 \]
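The two objectives differ only in how the pre-treatment data are arranged before the weights are fitted, which the following sketch makes concrete. It is an illustration under stated assumptions rather than the authors' implementation: weights are constrained to the simplex, the solver is a simple projected gradient descent, and the data are simulated from a factor model without covariates. All function names, dimensions, and data are hypothetical. The simulated outcomes share a common scale; outcomes measured in different units would normally need to be put on a comparable scale first.

```python
import numpy as np

def project_to_simplex(v):
    """Euclidean projection of v onto {w : w >= 0, sum(w) = 1}."""
    u = np.sort(v)[::-1]
    css = np.cumsum(u) - 1.0
    ks = np.arange(1, len(v) + 1)
    rho = np.nonzero(u - css / ks > 0)[0][-1]
    return np.maximum(v - css[rho] / (rho + 1.0), 0.0)

def simplex_least_squares(target, donors, n_iter=3000):
    """Minimize ||target - donors @ w||^2 over the simplex (projected gradient)."""
    w = np.full(donors.shape[1], 1.0 / donors.shape[1])
    L = 2.0 * np.linalg.norm(donors, 2) ** 2    # gradient Lipschitz constant
    for _ in range(n_iter):
        grad = 2.0 * donors.T @ (donors @ w - target)
        w = project_to_simplex(w - grad / L)
    return w

def sse(target, donors, w):
    """Sum of squared pre-treatment imbalances for weights w."""
    return float(np.sum((target - donors @ w) ** 2))

# Simulate Y[i, t, k] = lambda_i' f_tk + noise (unit 0 is the treated unit).
rng = np.random.default_rng(1)
N, T0, K, r = 11, 8, 4, 2
loadings = rng.normal(size=(N, r))
loadings[0] = loadings[1:4].mean(axis=0)        # treated unit lies in the controls' hull
factors = rng.normal(size=(T0, K, r))
Y_pre = np.einsum("ir,tkr->itk", loadings, factors) + 0.5 * rng.normal(size=(N, T0, K))
treated, controls = Y_pre[0], Y_pre[1:]         # shapes (T0, K) and (N-1, T0, K)

# Concatenated: stack all K outcome series into one long pre-treatment vector.
target_cat = treated.reshape(-1)                # (T0*K,)
donors_cat = controls.reshape(N - 1, -1).T      # (T0*K, N-1)
w_concat = simplex_least_squares(target_cat, donors_cat)

# Averaged: average the K outcomes at each time period, then fit.
target_avg = treated.mean(axis=1)               # (T0,)
donors_avg = controls.mean(axis=2).T            # (T0, N-1)
w_avg = simplex_least_squares(target_avg, donors_avg)

uniform = np.full(N - 1, 1.0 / (N - 1))
print("concatenated weights:", np.round(w_concat, 3))
print("averaged weights:    ", np.round(w_avg, 3))
```

Either weight vector is then applied to every outcome's post-treatment control series, so all \(K\) counterfactuals share one synthetic control.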
These methods improve SCM performance by reducing variance and overfitting to noise.
32.13.4 Empirical Application: Flint Water Crisis
To illustrate the benefits of multiple-outcome SCM, L. Sun, Ben-Michael, and Feller (2023) re-analyze the Flint water crisis, in which the city's drinking water was contaminated with lead, potentially affecting student performance.
Four key educational outcomes were studied:
- Math Achievement
- Reading Achievement
- Special Needs Status
- Daily Attendance
By applying common weights across these outcomes, their SCM results showed:
- Reduced bias and improved robustness compared to separate SCM fits.
- Better pre-treatment fit for educational outcomes.
- Stronger evidence of educational impacts following the crisis.