29.1 Understanding

To formally define the SDID estimator, we begin by considering a balanced panel dataset with $N$ units observed over $T$ time periods. Our goal is to estimate the causal effect of a treatment intervention while accounting for both unit-specific and time-specific confounders.

Let:

$Y_{it}$ be the outcome variable for unit $i$ at time $t$.
$W_{it} \in \{0,1\}$ be a binary indicator of treatment, where $W_{it} = 1$ if unit $i$ is treated at time $t$ and $0$ otherwise.
The panel consists of:
- $N_c$ control units (never treated), indexed $i = 1, \dots, N_c$.
- $N_t$ treated units, indexed $i = N_c + 1, \dots, N$ and exposed to treatment after period $T_{pre}$, leaving $T_{post} = T - T_{pre}$ post-treatment periods.

29.1.1 Steps in SDID Estimation

SDID combines ideas from both SC and DID by introducing unit weights and time weights:

1. Find unit weights $\hat{w}_i^{sdid}$ so that pre-treatment outcomes of the weighted control group match the pre-treatment outcomes of the treated units:
$\sum_{i = 1}^{N_c} \hat{w}_i^{sdid} Y_{it} \approx \frac{1}{N_t} \sum_{i = N_c + 1}^{N} Y_{it}, \quad \forall t = 1, \dots, T_{pre}$
This aligns the pre-treatment trends of treated and control units, much as SC does.
2. Find time weights $\hat{\lambda}_t^{sdid}$ so that, for control units, the weighted average of pre-treatment outcomes tracks the average post-treatment outcome up to a constant. This down-weights pre-treatment periods that look unlike the post-treatment period and stabilizes inference.
3. Estimate the treatment effect $\hat{\tau}^{sdid}$ by solving the following minimization problem:
$(\hat{\tau}^{sdid}, \hat{\mu}, \hat{\alpha}, \hat{\beta}) = \arg \min_{\tau, \mu, \alpha, \beta} \sum_{i=1}^{N} \sum_{t=1}^{T} (Y_{it} - \mu - \alpha_i - \beta_t - W_{it} \tau)^2 \hat{w}_i^{sdid} \hat{\lambda}_t^{sdid}$
where:
- $\mu$ is the global intercept.
- $\alpha_i$ captures unit-specific fixed effects.
- $\beta_t$ captures time-specific fixed effects.
- $\tau$ represents the treatment effect.
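
The weighted two-way fixed effects problem above can be sketched in a few lines of NumPy. This is a minimal illustration under stated assumptions, not a reference implementation: the function name `weighted_twfe_tau` and the toy panel are hypothetical, the intercept $\mu$ is absorbed into the fixed-effect dummies, and the weights are taken as given. The weights assigned to treated units ($1/N_t$) and post-treatment periods ($1/T_{post}$) follow the convention we understand Arkhangelsky et al. (2021) to use.

```python
import numpy as np

def weighted_twfe_tau(Y, W, unit_w, time_w):
    """Solve the weighted two-way fixed effects problem and return tau.

    Y, W   : (N, T) arrays of outcomes and treatment indicators.
    unit_w : length-N unit weights; time_w : length-T time weights.
    """
    N, T = Y.shape
    # Row i*T + t of the stacked design corresponds to unit i, period t.
    unit_dummies = np.kron(np.eye(N), np.ones((T, 1)))   # alpha_i
    time_dummies = np.kron(np.ones((N, 1)), np.eye(T))   # beta_t
    X = np.hstack([unit_dummies, time_dummies, W.reshape(-1, 1)])
    # Observation weight is the product w_i * lambda_t; weighted least
    # squares is ordinary least squares on sqrt-weighted rows.
    sw = np.sqrt(np.outer(unit_w, time_w).reshape(-1))
    coef, *_ = np.linalg.lstsq(X * sw[:, None], Y.reshape(-1) * sw, rcond=None)
    # The dummies are collinear (mu is absorbed), but tau is still identified.
    return coef[-1]

# Toy panel: 3 controls, 1 treated unit, 3 pre- and 2 post-periods,
# additive unit and time effects, true effect of 2.
alpha = np.array([1.0, 2.0, 3.0, 5.0])
beta = np.array([0.0, 0.5, 1.0, 1.5, 2.0])
W = np.zeros((4, 5))
W[3, 3:] = 1.0
Y = alpha[:, None] + beta[None, :] + 2.0 * W

unit_w = np.array([0.5, 0.3, 0.2, 1.0])        # hypothetical; treated unit gets 1/N_t
time_w = np.array([0.1, 0.3, 0.6, 0.5, 0.5])   # hypothetical; post periods get 1/T_post
tau = weighted_twfe_tau(Y, W, unit_w, time_w)  # approximately 2 on this additive panel
```

Passing uniform weights (`np.ones(4)` and `np.ones(5)`) to the same function gives the unweighted DID regression.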

Unlike the standard DID and SC estimators, SDID incorporates both unit and time weights, making it less sensitive to violations of parallel trends.

DID solves:
$(\hat{\tau}^{did}, \hat{\mu}, \hat{\alpha}, \hat{\beta}) = \arg \min_{\tau, \mu, \alpha, \beta} \sum_{i=1}^{N} \sum_{t=1}^{T} (Y_{it} - \mu - \alpha_i - \beta_t - W_{it} \tau)^2$
However, DID does not use unit or time weights, making it unreliable if treatment assignment correlates with unobserved factors.

SC minimizes:
$(\hat{\tau}^{sc}, \hat{\mu}, \hat{\beta}) = \arg \min_{\tau, \mu, \beta} \sum_{i=1}^{N} \sum_{t=1}^{T} (Y_{it} - \mu - \beta_t - W_{it} \tau)^2 \hat{w}_i^{sc}$
SC uses its own unit weights $\hat{w}_i^{sc}$ but includes neither unit fixed effects ($\alpha_i$) nor time weights ($\hat{\lambda}_t$), which can introduce bias when unmeasured confounders vary over time.

29.1.2 Comparison of Methods

The table below summarizes key differences between DID, SC, and SDID:

| Aspect | DID | SC | SDID |
|---|---|---|---|
| Primary Assumption | Absence of intervention leads to parallel evolution across states. | Reweights unexposed states to match pre-intervention outcomes of treated state. | Reweights control units to ensure a parallel time trend with the treated pre-intervention trend. |
| Reliability Concern | Can be unreliable when pre-intervention trends aren’t parallel. | Accounts for non-parallel pre-intervention trends by reweighting. | Uses reweighting to adjust for non-parallel pre-intervention trends. |
| Treatment of Time Periods | All pre-treatment periods are given equal weight. | Doesn’t specifically emphasize equal weight for pre-treatment periods. | Focuses only on a subset of pre-intervention time periods, selected based on historical outcomes. |
| Goal with Reweighting | N/A (doesn’t use reweighting). | To match treated state as closely as possible before the intervention. | Make trends of control units parallel (not necessarily identical) to the treated pre-intervention. |

An alternative formulation of the SDID treatment effect is:
$\hat{\tau} = \hat{\delta}_{tr} - \sum_{i = 1}^{N_c} \hat{w}_i^{sdid} \hat{\delta}_i$
where:

$\hat{\delta}_{tr} = \frac{1}{N_t} \sum_{i = N_c + 1}^{N} \hat{\delta}_i$ is the average adjusted outcome across treated units (the subscript $tr$ denotes the treated group, not a time period).
$\sum_{i = 1}^{N_c} \hat{w}_i^{sdid} \hat{\delta}_i$ is the weighted average of adjusted outcomes across control units.

The same template covers SC and DID once the unit weights and the adjusted outcomes $\hat{\delta}_i$ are chosen as follows:

| Method | Sample Weight | Adjusted outcomes ($\hat{\delta}_i$) | Interpretation |
|---|---|---|---|
| SC | $\hat{w}^{sc} = \arg\min_{w \in R} l_{unit}(w)$ | $\frac{1}{T_{post}} \sum_{t = T_{pre} + 1}^T Y_{it}$ | Unweighted treatment period averages |
| DID | $\hat{w}_i^{did} = N_c^{-1}$ | $\frac{1}{T_{post}} \sum_{t = T_{pre}+ 1}^T Y_{it} - \frac{1}{T_{pre}} \sum_{t = 1}^{T_{pre}}Y_{it}$ | Unweighted differences between average treatment period and pretreatment outcome |
| SDID | $(\hat{w}_0, \hat{w}^{sdid}) = \arg\min l_{unit}(w_0, w)$ | $\frac{1}{T_{post}} \sum_{t = T_{pre} + 1}^T Y_{it} - \sum_{t = 1}^{T_{pre}} \hat{\lambda}_t^{sdid} Y_{it}$ | Weighted differences between average treatment period and pretreatment outcome |
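
The three rows of this table can be computed with one small helper that takes the unit weights and, optionally, the pre-period weights. The sketch below is illustrative only: the function names, the toy panel, and the weight vectors are hypothetical, and the SC and SDID weights are supplied by hand rather than estimated.

```python
import numpy as np

def adjusted_outcomes(Y, t_pre, pre_weights=None):
    """delta_i for each row of Y (units x periods).

    pre_weights=None gives the SC version (post-period mean only);
    otherwise the weighted pre-period average is subtracted.
    """
    post_mean = Y[:, t_pre:].mean(axis=1)
    if pre_weights is None:
        return post_mean
    return post_mean - Y[:, :t_pre] @ pre_weights

def tau_hat(Y_control, Y_treated, unit_w, t_pre, pre_weights=None):
    """tau = mean treated delta minus weighted control delta."""
    delta_tr = adjusted_outcomes(Y_treated, t_pre, pre_weights).mean()
    delta_co = unit_w @ adjusted_outcomes(Y_control, t_pre, pre_weights)
    return delta_tr - delta_co

# Toy panel with additive unit and time effects and a true effect of 2.
beta = np.array([0.0, 0.5, 1.0, 1.5, 2.0])
Y_control = np.array([1.0, 2.0, 3.0])[:, None] + beta[None, :]
Y_treated = 5.0 + beta[None, :] + np.array([[0.0, 0.0, 0.0, 2.0, 2.0]])
t_pre = 3

w = np.array([0.5, 0.3, 0.2])      # hypothetical unit weights (sum to 1)
lam = np.array([0.1, 0.3, 0.6])    # hypothetical time weights (sum to 1)

tau_did = tau_hat(Y_control, Y_treated, np.full(3, 1 / 3), t_pre,
                  np.full(t_pre, 1 / t_pre))
tau_sc = tau_hat(Y_control, Y_treated, w, t_pre)
tau_sdid = tau_hat(Y_control, Y_treated, w, t_pre, lam)
```

On this additive panel the DID and SDID versions both return the true effect of 2, because subtracting a pre-period average removes the unit-level differences. The SC version with these hand-picked weights does not, since nothing forces the weighted controls to match the treated unit's level; properly estimated SC weights would aim to close that gap.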

A key innovation in Synthetic Difference-in-Differences is its use of unit and time weights to refine causal estimates. By applying weights, SDID essentially localizes a standard two-way fixed effects regression, making it more robust and precise.

29.1.3 Why Use Weights?

Unit weights ($\hat{w}_i^{sdid}$) emphasize control units that are most similar to the treated units based on pre-treatment trends.
Time weights ($\hat{\lambda}_t^{sdid}$) prioritize pre-treatment periods most comparable to the post-treatment period.
This approach makes the parallel trends assumption more plausible without requiring a perfect match in raw data.

29.1.4 Benefits of Localization in SDID

Robustness
- By focusing on comparable units and time periods, SDID reduces bias from dissimilar observations.
Improved Precision
- SDID removes predictable variation in outcomes, so its standard errors (SEs) are typically smaller than those of DID and SC.
- Caveat: If outcome heterogeneity is minimal, unequal weighting might slightly reduce precision relative to standard DID.

29.1.5 Designing SDID Weights

29.1.5.1 Unit Weights: Balancing Pre-Treatment Trends

Unit weights $\hat{w}_i^{sdid}$, together with an intercept $\hat{w}_0$, are chosen so that the weighted control group tracks the treated group’s pre-treatment trend, similar to SC but with greater flexibility:
$\hat{w}_0 + \sum_{i = 1}^{N_c} \hat{w}_i^{sdid} Y_{it} \approx \frac{1}{N_t} \sum_{i = N_c + 1}^{N} Y_{it}, \quad \forall t = 1, \dots, T_{pre}$
Because the intercept $\hat{w}_0$ absorbs any constant gap, this achieves parallel pre-treatment trends rather than requiring an exact match in levels.

29.1.5.2 Time Weights: Stabilizing Post-Treatment Inference

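Time weights $\hat{\lambda}_t^{sdid}$ are chosen so that, for control units, the weighted average of pre-treatment outcomes differs from the average post-treatment outcome by a constant. This reduces bias by down-weighting pre-treatment periods that are very different from the post-treatment period.

Unlike unit weights, time weights are not regularized. In Arkhangelsky et al. (2021), this reflects a framework that allows outcomes to be correlated across periods within the same unit, but not across units within a period beyond what the systematic component of the outcome captures.
Time weights can also improve the precision of SDID estimates by removing the influence of pre-treatment periods that are poor predictors of post-treatment outcomes.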
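
Both weight problems are least-squares fits over the simplex (non-negative weights summing to one) with a free intercept, so one routine can serve for unit and time weights. The sketch below uses projected gradient descent, a generic solver chosen here for brevity and not necessarily what any SDID software uses. The names `project_simplex` and `simplex_weights`, the `ridge` argument, and the toy data are all hypothetical.

```python
import numpy as np

def project_simplex(v):
    """Euclidean projection of v onto {w : w >= 0, sum(w) = 1}."""
    u = np.sort(v)[::-1]
    css = np.cumsum(u) - 1.0
    k = np.arange(1, len(v) + 1)
    rho = np.nonzero(u - css / k > 0)[0][-1]
    return np.maximum(v - css[rho] / (rho + 1), 0.0)

def simplex_weights(A, b, ridge=0.0, n_iter=5000):
    """Minimize ||w0 + A w - b||^2 + ridge * ||w||^2 with w on the simplex.

    The free intercept w0 is handled by centering A's columns and b.
    Returns (w0, w).
    """
    A_c = A - A.mean(axis=0)
    b_c = b - b.mean()
    w = np.full(A.shape[1], 1.0 / A.shape[1])
    # Step size 1/L, where L is the Lipschitz constant of the gradient.
    step = 1.0 / (2.0 * (np.linalg.norm(A_c, 2) ** 2 + ridge) + 1e-12)
    for _ in range(n_iter):
        grad = 2.0 * (A_c.T @ (A_c @ w - b_c) + ridge * w)
        w = project_simplex(w - step * grad)
    w0 = b.mean() - A.mean(axis=0) @ w
    return w0, w

# Toy data: 3 control units, 6 pre-periods, 2 post-periods.
Y_co_pre = np.array([[1, 2, 4, 3, 5, 7],
                     [2, 1, 3, 5, 4, 6],
                     [5, 3, 6, 2, 7, 1]], dtype=float)
Y_co_post = np.array([[8.0, 9.0], [7.0, 8.0], [4.0, 2.0]])
# One treated unit built as 0.7 * control 1 + 0.3 * control 2, shifted up by 4.
Y_tr_pre = (0.7 * Y_co_pre[0] + 0.3 * Y_co_pre[1] + 4.0)[None, :]

# Unit weights: rows are pre-periods, columns are control units.
# ridge=0 here so the known weights are recoverable; in SDID the unit-weight
# problem carries a positive penalty.
w0, unit_w = simplex_weights(Y_co_pre.T, Y_tr_pre.mean(axis=0), ridge=0.0)

# Time weights: rows are control units, columns are pre-periods; no penalty.
lam0, time_w = simplex_weights(Y_co_pre, Y_co_post.mean(axis=1))
```

In the toy data the treated series is a shifted mix of the first two controls, so the unit weights come out close to (0.7, 0.3, 0) and the intercept close to 4, showing how the intercept absorbs a level gap. With fewer control units than pre-periods the unregularized time-weight problem need not have a unique solution, so `time_w` is one valid minimizer.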

29.1.6 How SDID Enhances DID’s Plausibility

DID assumes parallel trends, but raw data often violates this assumption. SDID corrects for non-parallel trends by weighting both units and time periods.

Similar techniques have been used before to adjust DID assumptions, such as controlling for covariates or selecting specific time periods (Callaway and Sant’Anna 2021).
SDID automates this process, applying a systematic weighting approach for both units and time periods.

Including unit fixed effects ($\alpha_i$) in SDID has two main advantages:

Flexibility:
- Allows for systematic differences across units while preserving parallel trends after reweighting.
Enhanced Precision:
- Explains a large fraction of the variation in outcomes, reducing noise and improving estimation accuracy.

Unit Fixed Effects and SC Weighting

Under ideal conditions, SC reweighting alone could account for unit fixed effects—if the weighted average of control unit outcomes perfectly matched the treated unit’s pre-treatment trajectory.
However, this rarely happens in reality, making unit fixed effects necessary for robust inference.
The use of fixed effects in synthetic control regressions (SC with intercept) was first proposed in Doudchenko and Imbens (2016) and Ferman and Pinto (2021), where it was referred to as DIFP (Difference-in-Fixed-Effects Prediction).

29.1.7 Choosing SDID Weights

Choosing Unit Weights

Regularization Parameter:

The penalty term is calibrated based on the typical one-period change in control unit outcomes during the pre-treatment period.
This value is then multiplied by a scaling factor (Arkhangelsky et al. 2021, 4092).
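
As we read Arkhangelsky et al. (2021), the calibration is $\zeta = (N_t T_{post})^{1/4} \hat{\sigma}$, where $\hat{\sigma}$ is the standard deviation of first differences of control outcomes over the pre-treatment period. The sketch below implements that reading; the function name and toy numbers are hypothetical, and the exact formula should be checked against the paper before use.

```python
import numpy as np

def regularization_parameter(Y_control_pre, n_treated, t_post):
    """zeta = (N_t * T_post)^(1/4) * sigma_hat.

    sigma_hat is the standard deviation of one-period changes in
    control outcomes during the pre-treatment period.
    """
    diffs = np.diff(Y_control_pre, axis=1)            # (N_c, T_pre - 1)
    sigma_hat = np.sqrt(np.mean((diffs - diffs.mean()) ** 2))
    return (n_treated * t_post) ** 0.25 * sigma_hat

# Toy example: two control units, three pre-periods.
Y_control_pre = np.array([[0.0, 1.0, 3.0],
                          [1.0, 1.0, 2.0]])
zeta = regularization_parameter(Y_control_pre, n_treated=1, t_post=16)
```

Under this reading, the penalty entering the unit-weight objective would be $\zeta^2 T_{pre} \lVert w \rVert_2^2$.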

Relation to Synthetic Control Weights:

SDID weights resemble those used in Abadie, Diamond, and Hainmueller (2010) but have two key modifications:

Inclusion of an Intercept Term
- Unlike SC, SDID does not force the control pre-trends to exactly match the treated pre-trends—it only ensures they are parallel.
- This flexibility arises from unit fixed effects, which absorb any systematic level differences.
Regularization Penalty
- Borrowed from Doudchenko and Imbens (2016).
- Ensures dispersion of weights, preventing over-reliance on a few control units.
- This guarantees a unique solution for unit weights.

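How SDID Weights Compare to SC and DID Weights:

The SC weights of Abadie, Diamond, and Hainmueller (2010) solve the same problem as the SDID unit weights, but without the intercept or the regularization penalty, and with a single treated unit.
DID, by contrast, assigns every control unit the same weight, $\hat{w}_i^{did} = N_c^{-1}$.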

Choosing Time Weights

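Like unit weights, time weights include an intercept term, so the weighted pre-treatment average only needs to match the post-treatment average up to a constant.
No regularization is applied to time weights, since the framework allows correlation across periods within a unit but not across units within a period (beyond the systematic component of outcomes).
This design allows SDID to reduce bias while stabilizing inference.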

29.1.8 Accounting for Time-Varying Covariates in Weight Estimation

To further refine the estimation process, SDID can incorporate time-varying covariates by adjusting the outcome variable:
$Y_{it}^{res} = Y_{it} - X_{it} \hat{\beta}$
where $\hat{\beta}$ comes from the regression:
$Y_{it} = X_{it} \beta + \varepsilon_{it}$
Computing the weights and the estimate on the residualized outcome ($Y_{it}^{res}$) lets the procedure account for time-varying confounders captured by $X_{it}$, which can improve the validity of causal estimates.
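
The residualization step can be sketched as a pooled least-squares fit followed by a subtraction. This follows the regression exactly as written above, with no intercept or fixed effects; the function name `residualize`, the array layout, and the simulated data are assumptions made for illustration.

```python
import numpy as np

def residualize(Y, X):
    """Remove the part of Y explained by covariates X.

    Y : (N, T) outcomes; X : (N, T, K) time-varying covariates.
    Returns (Y_res, beta_hat) from the pooled regression Y = X beta + e.
    """
    N, T, K = X.shape
    beta_hat, *_ = np.linalg.lstsq(X.reshape(N * T, K), Y.reshape(-1), rcond=None)
    return Y - X @ beta_hat, beta_hat

# Simulated panel: 4 units, 5 periods, 2 covariates (all values hypothetical).
rng = np.random.default_rng(0)
X = rng.normal(size=(4, 5, 2))
Y = X @ np.array([3.0, -1.0]) + rng.normal(scale=0.1, size=(4, 5))

Y_res, beta_hat = residualize(Y, X)
```

`Y_res` then replaces `Y` when computing the unit weights, the time weights, and the final estimate.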

References

Abadie, Alberto, Alexis Diamond, and Jens Hainmueller. 2010. “Synthetic Control Methods for Comparative Case Studies: Estimating the Effect of California’s Tobacco Control Program.” Journal of the American Statistical Association 105 (490): 493–505.

Arkhangelsky, Dmitry, Susan Athey, David A Hirshberg, Guido W Imbens, and Stefan Wager. 2021. “Synthetic Difference-in-Differences.” American Economic Review 111 (12): 4088–118.

Callaway, Brantly, and Pedro HC Sant’Anna. 2021. “Difference-in-Differences with Multiple Time Periods.” Journal of Econometrics 225 (2): 200–230.

Doudchenko, Nikolay, and Guido W Imbens. 2016. “Balancing, Regression, Difference-in-Differences and Synthetic Control Methods: A Synthesis.” National Bureau of Economic Research.

Ferman, Bruno, and Cristine Pinto. 2021. “Synthetic Controls with Imperfect Pretreatment Fit.” Quantitative Economics 12 (4): 1197–1221.