32.12 Bayesian Synthetic Control
The Bayesian Synthetic Control (BSC) approach introduces a probabilistic alternative to traditional synthetic control methods. Unlike the standard SCM, which produces a single point estimate of the treatment effect from a convex combination of control units, BSC models the counterfactual with posterior predictive distributions, allowing proper uncertainty quantification and probabilistic inference.
Bayesian methods offer several advantages over frequentist SCM:
- Probabilistic Treatment Effects: Instead of a single deterministic estimate, Bayesian SCM provides a distribution over possible treatment effects.
- Regularization via Priors: Bayesian approaches allow for the incorporation of shrinkage priors to improve estimation stability in high-dimensional settings.
- Flexibility: The Bayesian framework can accommodate dynamic latent factor models, addressing issues like time-varying heterogeneity.
Two major Bayesian approaches to SCM:
- Dynamic Multilevel Factor Models (Pang, Liu, and Xu 2022)
- Bayesian Sparse Synthetic Control (S. Kim, Lee, and Gupta 2020)
32.12.1 Bayesian Causal Inference Framework
In the traditional SCM, we estimate the counterfactual outcome \(Y_{it}(0)\) for treated unit \(i\) at time \(t\) using a weighted sum of control units:
\[ \hat{Y}_{it}(0) = \sum_{j \neq i} w_j Y_{jt}. \]
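In code, this point estimate is just a weighted sum of the control units' outcome paths. A minimal R sketch with made-up weights and outcomes (not from any real dataset):

```r
# Synthetic control point estimate: weighted sum of control outcomes.
# Y_controls and w are illustrative made-up values.
Y_controls <- matrix(c(10, 12, 14,
                       20, 21, 23,
                       15, 16, 18), nrow = 3, byrow = TRUE)  # 3 controls x 3 periods
w <- c(0.5, 0.3, 0.2)  # nonnegative weights summing to 1 (standard SCM)

Y0_hat <- as.vector(t(Y_controls) %*% w)  # counterfactual path for the treated unit
Y0_hat  # one fitted value per period
```

Each element of `Y0_hat` is \(\sum_j w_j Y_{jt}\) for one period \(t\).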
However, this deterministic approach does not quantify uncertainty in the estimation. The Bayesian SCM instead models the counterfactual outcome as a posterior predictive distribution:
\[ P(Y_{it}(0) | Y_{\text{obs}}, \theta), \]
where \(\theta\) represents the parameters of the model (e.g., factor loadings, regression coefficients, latent variables). The Bayesian approach estimates full posterior distributions, allowing us to compute credible intervals instead of relying solely on point estimates.
32.12.2 Bayesian Dynamic Multilevel Factor Model
The Dynamic Multilevel Factor Model (DM-LFM), proposed by Pang, Liu, and Xu (2022), extends SCM by incorporating latent factor models to correct for unit-specific time trends.
32.12.2.1 Model Specification
Let \(Y_{it}\) be the observed outcome for unit \(i\) at time \(t\). The potential untreated outcome follows:
\[ Y_{it}(0) = X_{it} \beta + \lambda_i' f_t + \varepsilon_{it}, \]
where:
- \(X_{it}\) are observed covariates,
- \(\beta\) are regression coefficients,
- \(\lambda_i\) are unit-specific factor loadings,
- \(f_t\) are common latent time factors,
- \(\varepsilon_{it} \sim \mathcal{N}(0, \sigma^2)\) is the noise term.
The treatment effect is defined as:
\[ \tau_{it} = Y_{it}(1) - Y_{it}(0). \]
Under latent ignorability, we assume that treatment assignment is independent of the untreated potential outcomes once we condition on both the observed covariates and the latent variables:
\[ T_i \perp Y_{it}(0) \mid X_i, U_i, \]
where \(U_i\) are latent variables (here, the factor loadings) extracted from the outcome data.
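To make the data-generating process concrete, the factor model above can be simulated directly. The dimensions, the single covariate, and all parameter values below are illustrative assumptions:

```r
# Simulate untreated outcomes from the factor model
# Y_it(0) = X_it * beta + lambda_i' f_t + eps_it  (illustrative dimensions)
set.seed(42)
N <- 30; Tt <- 40; K <- 2             # units, periods, latent factors (assumed)
beta <- 1.5                           # single covariate for simplicity
X <- matrix(rnorm(N * Tt), N, Tt)     # observed covariate
lambda <- matrix(rnorm(N * K), N, K)  # unit-specific factor loadings
f <- matrix(rnorm(K * Tt), K, Tt)     # common latent time factors
eps <- matrix(rnorm(N * Tt, sd = 0.5), N, Tt)

Y0 <- X * beta + lambda %*% f + eps   # N x Tt matrix of untreated outcomes
dim(Y0)
```

The `lambda %*% f` term is what standard SCM cannot capture when loadings vary across units.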
32.12.2.2 Bayesian Inference Procedure
To estimate treatment effects, we follow these steps:
- Estimate \(\lambda_i\) and \(f_t\) using control units.
- Predict counterfactuals for treated units: \[ \hat{Y}_{it}(0) = X_{it} \hat{\beta} + \hat{\lambda}_i' \hat{f}_t. \]
- Obtain the posterior predictive distribution of treatment effects.
This Bayesian approach enables proper credible intervals for causal effects.
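Once posterior draws of the counterfactual are available, the treatment effect is itself a distribution. A sketch of the last step, using simulated stand-in draws rather than output from a real fit:

```r
# Given posterior draws of the counterfactual Y_it(0), the treatment
# effect is a distribution, not a point. Draws below are simulated
# stand-ins, not output from an actual model fit.
set.seed(1)
y_obs <- 5.0                                   # observed treated outcome at time t
y0_draws <- rnorm(4000, mean = 3.8, sd = 0.6)  # posterior predictive draws of Y_it(0)

tau_draws <- y_obs - y0_draws                  # posterior distribution of tau_it
cred_int <- quantile(tau_draws, c(0.025, 0.975))  # 95% credible interval
mean(tau_draws); cred_int
```

The credible interval comes directly from the quantiles of the effect draws, with no asymptotic approximation.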
32.12.3 Bayesian Sparse Synthetic Control
S. Kim, Lee, and Gupta (2020) propose an alternative Bayesian framework that removes restrictive constraints imposed by standard SCM.
32.12.3.1 Relaxing SCM Constraints
Traditional SCM imposes:
- Nonnegative weights: \(w_j \geq 0\).
- Convex combination: \(\sum_j w_j = 1\).
BSCM relaxes both constraints, allowing negative weights and replacing the hard restrictions with Bayesian shrinkage priors on the control unit weights:
\[ w_j \sim \mathcal{N}(0, \tau^2), \]
where \(\tau^2\) is a regularization parameter. This allows flexible weight selection while preventing overfitting.
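Under this normal prior with Gaussian noise, the posterior mode of the weights coincides with a ridge regression of the treated unit's pre-treatment outcomes on the controls. A sketch on simulated data (all values illustrative):

```r
# With w_j ~ N(0, tau^2) and Gaussian noise, the MAP weights are the
# ridge solution on the pre-treatment fit. Data below are simulated.
set.seed(7)
T0 <- 30; J <- 10                     # pre-treatment periods, control units
Yc <- matrix(rnorm(T0 * J), T0, J)    # control outcomes (T0 x J)
w_true <- c(0.6, -0.4, rep(0, J - 2)) # note the negative weight
y1 <- Yc %*% w_true + rnorm(T0, sd = 0.1)  # treated unit's pre-treatment path

tau2 <- 1; sigma2 <- 0.01
penalty <- (sigma2 / tau2) * diag(J)
w_map <- solve(crossprod(Yc) + penalty, crossprod(Yc, y1))  # ridge / MAP weights
round(w_map, 2)  # weights may be negative; no sum-to-one constraint
```

The recovered weights are unconstrained in sign and scale, unlike standard SCM weights.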
32.12.3.2 Bayesian Shrinkage Priors
BSCM incorporates horseshoe priors and spike-and-slab priors to select relevant control units:
Horseshoe Prior: \[ w_j \sim \mathcal{N}(0, \lambda_j^2 \tau^2), \quad \lambda_j \sim C^+(0,1), \quad \tau \sim C^+(0,1), \] where \(\lambda_j\) is a local shrinkage scale for unit \(j\) and \(\tau\) is a global shrinkage scale.
Spike-and-Slab Prior: \[ w_j \sim \gamma_j \mathcal{N}(0, \sigma_1^2) + (1-\gamma_j) \mathcal{N}(0, \sigma_0^2), \] where \(\gamma_j \sim \text{Bernoulli}(\pi)\) determines whether a control unit is included, and \(\sigma_0^2 \ll \sigma_1^2\) so that excluded units are shrunk toward zero.
These priors ensure robust weight selection while controlling for overfitting.
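Drawing weights from each prior makes the shrinkage behavior visible: both concentrate mass near zero while still permitting occasional large weights. The scale choices below are illustrative:

```r
# Draw weights from the two shrinkage priors to see their behavior.
set.seed(11)
J <- 10000

# Horseshoe: half-Cauchy local scales make most draws tiny, a few large
lambda_loc <- abs(rcauchy(J))    # half-Cauchy(0, 1) local scales
w_hs <- rnorm(J, 0, lambda_loc)

# Spike-and-slab: mixture of a narrow "spike" and a wide "slab"
pi_incl <- 0.2
gamma <- rbinom(J, 1, pi_incl)   # inclusion indicators
w_ss <- rnorm(J, 0, ifelse(gamma == 1, 1, 0.01))

median(abs(w_hs)); median(abs(w_ss))  # both medians are small
```

The heavy Cauchy tails are what let the horseshoe leave truly relevant weights nearly unshrunk.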
32.12.4 Bayesian Inference and MCMC Estimation
Both DM-LFM and BSCM are estimated using Markov Chain Monte Carlo (MCMC). Given observed data \(Y_{\text{obs}}\), we sample from the posterior:
\[ P(\theta | Y_{\text{obs}}) \propto P(Y_{\text{obs}} | \theta) P(\theta), \]
where \(P(\theta)\) encodes prior beliefs about the parameters.
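For conjugate pieces of these models, this updating rule has a closed form, which is what Gibbs sampling exploits. A minimal Normal-Normal example with illustrative numbers:

```r
# Posterior ∝ likelihood × prior, in closed form for a conjugate
# Normal-Normal model: theta ~ N(mu0, tau0^2), y_i | theta ~ N(theta, s^2).
mu0 <- 0; tau0_sq <- 4        # prior mean and variance (illustrative)
s_sq <- 1                     # known observation variance
y <- c(1.2, 0.8, 1.5, 1.1)    # illustrative data
n <- length(y)

# Precision-weighted combination of prior and data
post_var  <- 1 / (1 / tau0_sq + n / s_sq)
post_mean <- post_var * (mu0 / tau0_sq + sum(y) / s_sq)
c(post_mean, post_var)
```

The posterior mean is a precision-weighted average of the prior mean and the data, pulled toward the data as \(n\) grows.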
Common MCMC techniques used:
- Gibbs Sampling for latent factors and regression coefficients.
- Hamiltonian Monte Carlo (HMC) for high-dimensional posteriors.
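Gibbs sampling alternates between full conditional draws of each parameter block. A toy R sketch for a normal mean and variance (flat priors, simulated data), illustrating the mechanics used for the factor-model blocks:

```r
# Toy Gibbs sampler: alternate conditional draws of a normal mean
# (given variance) and variance (given mean). Illustrative only.
set.seed(99)
y <- rnorm(100, mean = 2, sd = 1.5)
n <- length(y); n_iter <- 2000
mu <- 0; sig2 <- 1
draws <- matrix(NA_real_, n_iter, 2)

for (s in 1:n_iter) {
  # mu | sig2, y ~ Normal (under a flat prior on mu)
  mu <- rnorm(1, mean(y), sqrt(sig2 / n))
  # sig2 | mu, y ~ Inverse-Gamma(n/2, SS/2) (Jeffreys prior on sig2)
  sig2 <- 1 / rgamma(1, shape = n / 2, rate = sum((y - mu)^2) / 2)
  draws[s, ] <- c(mu, sig2)
}
post_means <- colMeans(draws[-(1:500), ])  # posterior means after burn-in
post_means
```

In DM-LFM, the same alternation cycles through \(\beta\), the loadings \(\lambda_i\), and the factors \(f_t\), each conditionally Gaussian.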
The following R/Stan example fits a one-factor version of the latent factor model to simulated data:
# Load necessary libraries
library(rstan)
library(bayesplot)
# Define a one-factor Bayesian SCM model in Stan
scm_model <- "
data {
  int<lower=0> N;       // Number of units
  int<lower=0> T;       // Number of time periods
  matrix[N, T] Y;       // Outcome matrix
}
parameters {
  vector[T] f;          // Latent time factors
  vector[N] lambda;     // Unit-specific factor loadings
  real<lower=0> sigma;  // Noise standard deviation
}
model {
  // Priors
  f ~ normal(0, 1);
  lambda ~ normal(0, 1);
  sigma ~ cauchy(0, 1);
  // Likelihood: each unit's outcome path loads on the common factor
  for (i in 1:N)
    Y[i] ~ normal(lambda[i] * f, sigma);
}
"
# Compile and fit the model on simulated data
set.seed(123)
fit <- stan(
  model_code = scm_model,
  data = list(N = 50, T = 20, Y = matrix(rnorm(1000), 50, 20)),
  chains = 2, iter = 1000
)
# Summarize posterior results
print(fit)