25.1 Understanding
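Consider a traditional time-series cross-sectional (panel) data set:
- Let \(Y_{it}\) denote the outcome for unit \(i\) in period \(t\).
- The data form a balanced panel of \(N\) units and \(T\) time periods.
- \(W_{it} \in \{0, 1\}\) is the binary treatment indicator.
- \(N_c\) units are never treated (control); they are indexed \(i = 1, \dots, N_c\).
- \(N_t\) units are treated after time \(T_{pre}\), leaving \(T_{post} = T - T_{pre}\) posttreatment periods.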
Steps:
- Find unit weights \(\hat{w}^{sdid}\) such that \(\sum_{i = 1}^{N_c} \hat{w}_i^{sdid} Y_{it} \approx N_t^{-1} \sum_{i = N_c + 1}^N Y_{it}\) for all \(t = 1, \dots, T_{pre}\) (i.e., the pre-treatment outcome trend of the weighted control units is similar to that of the treated units, as in SC).
- Find time weights \(\hat{\lambda}_t^{sdid}\) such that we have a balanced window (i.e., posttreatment outcomes of the control units differ only by a constant from the weighted average of their pretreatment outcomes).
- Estimate the average causal effect of treatment with a weighted two-way fixed effects regression:
\[ (\hat{\tau}^{sdid}, \hat{\mu}, \hat{\alpha}, \hat{\beta}) = \arg \min_{\tau, \mu, \alpha, \beta} \{ \sum_{i = 1}^N \sum_{t = 1}^T (Y_{it} - \mu - \alpha_i - \beta_ t - W_{it} \tau)^2 \hat{w}_i^{sdid} \hat{\lambda}_t^{sdid} \} \]
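The weighted regression above can be sketched in a few lines of NumPy. This is a minimal illustration, not the reference implementation: the function name `weighted_twfe` and the toy weights are hypothetical, and it assumes the weight vectors already cover all units and periods (in Arkhangelsky et al. 2021, treated units receive weight \(1/N_t\) and posttreatment periods receive weight \(1/T_{post}\)).

```python
import numpy as np

def weighted_twfe(Y, W, unit_w, time_w):
    """Weighted two-way fixed effects estimate of tau (illustrative sketch).

    Y, W    : (N, T) arrays of outcomes and 0/1 treatment indicators.
    unit_w  : (N,) unit weights; time_w : (T,) time weights.
    """
    N, T = Y.shape
    unit_idx = np.repeat(np.arange(N), T)  # unit of each stacked observation
    time_idx = np.tile(np.arange(T), N)    # period of each stacked observation
    # Unit dummies, time dummies, treatment indicator. The dummies are
    # collinear (they absorb mu), so we take the minimum-norm least-squares
    # solution; tau is still uniquely determined.
    X = np.column_stack([np.eye(N)[unit_idx], np.eye(T)[time_idx], W.reshape(-1)])
    sw = np.sqrt(np.outer(unit_w, time_w).reshape(-1))  # sqrt of observation weights
    coef, *_ = np.linalg.lstsq(X * sw[:, None], Y.reshape(-1) * sw, rcond=None)
    return coef[-1]

# Toy panel with additive unit and time effects and a true effect of 2.
rng = np.random.default_rng(0)
N, T, N_c, T_pre = 6, 5, 4, 3
W = np.zeros((N, T))
W[N_c:, T_pre:] = 1.0
Y = rng.normal(size=(N, 1)) + rng.normal(size=(1, T)) + 2.0 * W

# Uniform weights reproduce plain DID.
tau_did = weighted_twfe(Y, W, np.ones(N), np.ones(T))

# Hypothetical SDID-style weights (made up for illustration only).
unit_w = np.array([0.5, 0.3, 0.2, 0.0, 0.5, 0.5])
time_w = np.array([0.2, 0.3, 0.5, 0.5, 0.5])
tau_weighted = weighted_twfe(Y, W, unit_w, time_w)
```

Because the toy outcomes are exactly additive, both calls recover the same effect; with real data the choice of weights changes the estimate.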
SDID improves on the DID estimator because \(\hat{\tau}^{did}\) uses neither unit weights nor time weights:
\[ (\hat{\tau}^{did}, \hat{\mu}, \hat{\alpha}, \hat{\beta}) = \arg \min_{\tau, \mu, \alpha, \beta} \{ \sum_{i = 1}^N \sum_{t = 1}^T (Y_{it} - \mu - \alpha_i - \beta_ t - W_{it} \tau)^2 \} \]
SDID improves on the SC estimator because \(\hat{\tau}^{sc}\) lacks unit fixed effects and time weights:
\[ (\hat{\tau}^{sc}, \hat{\mu}, \hat{\beta}) = \arg \min_{\tau, \mu, \beta} \{ \sum_{i = 1}^N \sum_{t = 1}^T (Y_{it} - \mu - \beta_ t - W_{it} \tau)^2 \hat{w}_i^{sc} \} \]
Aspect | DID | SC | SDID |
---|---|---|---|
Primary Assumption | Absence of intervention leads to parallel evolution across states. | Reweights unexposed states to match pre-intervention outcomes of treated state. | Reweights control units to ensure a parallel time trend with the treated pre-intervention trend. |
Reliability Concern | Can be unreliable when pre-intervention trends aren’t parallel. | Accounts for non-parallel pre-intervention trends by reweighting. | Uses reweighting to adjust for non-parallel pre-intervention trends. |
Treatment of Time Periods | All pre-treatment periods are given equal weight. | Doesn’t specifically emphasize equal weight for pre-treatment periods. | Focuses only on a subset of pre-intervention time periods, selected based on historical outcomes. |
Goal with Reweighting | N/A (doesn’t use reweighting). | To match treated state as closely as possible before the intervention. | Make trends of control units parallel (not necessarily identical) to the treated pre-intervention. |
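Alternatively, think of the estimator as a weighted difference of adjusted outcomes \(\hat{\delta}_i\):
\[ \hat{\tau} = \hat{\delta}_t - \sum_{i = 1}^{N_c} \hat{w}_i \hat{\delta}_i \]
where \(\hat{\delta}_t = \frac{1}{N_t} \sum_{i = N_c + 1}^N \hat{\delta}_i\) is the average adjusted outcome of the treated units (the subscript \(t\) here stands for “treated,” not for a time period).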
Method | Sample Weight | Adjusted outcomes (\(\hat{\delta}_i\)) | Interpretation |
---|---|---|---|
SC | \(\hat{w}^{sc} = \arg\min_{w \in R}l_{unit}(w)\) | \(\frac{1}{T_{post}} \sum_{t = T_{pre} + 1}^T Y_{it}\) | Unweighted treatment period averages |
DID | \(\hat{w}_i^{did} = N_c^{-1}\) | \(\frac{1}{T_{post}} \sum_{t = T_{pre}+ 1}^T Y_{it} - \frac{1}{T_{pre}} \sum_{t = 1}^{T_{pre}}Y_{it}\) | Unweighted differences between average treatment period and pretreatment outcome |
SDID | \((\hat{w}_0, \hat{w}^{sdid}) = \arg\min l_{unit}(w_0, w)\) | \(\frac{1}{T_{post}} \sum_{t = T_{pre} + 1}^T Y_{it} - \sum_{t = 1}^{T_{pre}} \hat{\lambda}_t^{sdid} Y_{it}\) | Weighted differences between average treatment period and pretreatment outcome |
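The table's "adjusted outcome" view translates directly into code. The sketch below is illustrative only; the function name `tau_from_adjusted_outcomes` is hypothetical. Passing zero time weights gives the SC adjusted outcome, uniform time weights give DID, and estimated \(\hat{\lambda}^{sdid}\) would give SDID.

```python
import numpy as np

def tau_from_adjusted_outcomes(Y, N_c, T_pre, unit_w, time_w):
    """tau = (mean adjusted outcome of treated) - sum_i w_i * (adjusted outcome of control i).

    Y      : (N, T) outcomes, control units in the first N_c rows.
    unit_w : (N_c,) weights on control units.
    time_w : (T_pre,) weights on pretreatment periods (all zeros for SC).
    """
    # delta_i = posttreatment average minus weighted pretreatment average
    delta = Y[:, T_pre:].mean(axis=1) - Y[:, :T_pre] @ time_w
    return delta[N_c:].mean() - unit_w @ delta[:N_c]

# Toy panel with additive unit and time effects and a true effect of 2.
rng = np.random.default_rng(0)
N, T, N_c, T_pre = 6, 5, 4, 3
W = np.zeros((N, T))
W[N_c:, T_pre:] = 1.0
Y = rng.normal(size=(N, 1)) + rng.normal(size=(1, T)) + 2.0 * W

# DID row of the table: uniform unit weights and uniform time weights.
tau_did = tau_from_adjusted_outcomes(
    Y, N_c, T_pre,
    unit_w=np.full(N_c, 1.0 / N_c),
    time_w=np.full(T_pre, 1.0 / T_pre),
)
```

With exactly additive toy data the unit and time effects cancel in the double difference, so `tau_did` equals the true effect.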
The SDID estimator uses weights to:
- Make the two-way fixed effects regression “local.”
- Emphasize units whose past is similar to that of the treated units.
- Prioritize periods that resemble the treated periods.
Benefits of this localization:
- Robustness: Using only similar units and periods makes the estimator more robust.
- Improved precision: The weights can remove predictable components of the outcome.
- The SEs of SDID tend to be smaller than those of SC and DID (Arkhangelsky et al. 2021).
- Caveat: If there is little systematic heterogeneity in outcomes, unequal weighting might reduce precision relative to standard DID.
Weight Design:
- Unit weights: Make the average outcome of the treated units roughly parallel to the weighted average outcome of the control units.
- Time weights: Ensure that posttreatment outcomes of the control units differ only by a constant from the weighted average of their pretreatment outcomes.
Weights enhance DID’s plausibility:
- Raw data often lack parallel time trends for treated and control units.
- Similar techniques (e.g., adjusting for covariates or selecting specific time periods) were used before (Callaway and Sant’Anna 2021).
- SDID automates this process, applying a similar logic to weight both units and time periods.
Time Weights in SDID:
- Remove bias and boost precision by down-weighting time periods that differ greatly from the posttreatment periods.
Argument for Unit Fixed Effects:
- Flexibility: Unit fixed effects make the model more flexible and thereby more robust.
- Enhanced precision: Unit fixed effects explain a significant portion of the outcome variation.
SC Weighting & Unit Fixed Effects:
- Under certain conditions, SC weighting can inherently account for unit fixed effects, for example when the weighted average pretreatment outcome of the control units equals that of the treated units (unlikely in practice).
- The use of unit fixed effects in synthetic control regression (i.e., synthetic control with an intercept) was proposed earlier by Doudchenko and Imbens (2016) and Ferman and Pinto (2021), where it is called DIFP.
More details on application
- Choose unit weights
Regularization Parameter:
- Equal to the size of a typical one-period outcome change for control units in the pre-period, then multiplied by a scaling factor (Arkhangelsky et al. 2021, 4092).
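A minimal sketch of this regularization parameter follows. It assumes the scaling factor is \((N_t T_{post})^{1/4}\) and that the "typical one-period change" is the standard deviation of first differences of control outcomes in the pre-period, which is how I read Arkhangelsky et al. (2021); the function name `regularization_zeta` is hypothetical.

```python
import numpy as np

def regularization_zeta(Y, N_c, T_pre):
    """zeta = (N_t * T_post)^(1/4) * sd of one-period changes among controls, pre-period.

    Y : (N, T) outcomes with control units in the first N_c rows.
    """
    N, T = Y.shape
    N_t, T_post = N - N_c, T - T_pre
    # One-period outcome changes for control units before treatment.
    diffs = np.diff(Y[:N_c, :T_pre], axis=1)   # shape (N_c, T_pre - 1)
    sigma = diffs.std()                        # divides by N_c * (T_pre - 1)
    return (N_t * T_post) ** 0.25 * sigma

# Tiny hand-checkable panel: 2 controls, 1 treated unit, 3 pre-periods, 1 post-period.
Y = np.array([
    [0.0,  1.0,  3.0, 0.0],
    [0.0, -1.0, -3.0, 0.0],
    [0.0,  0.0,  0.0, 0.0],
])
zeta = regularization_zeta(Y, N_c=2, T_pre=3)
```

In this toy panel the control first differences are 1, 2, -1, and -2, so their variance is 2.5 and, with one treated unit and one post-period, `zeta` is the square root of 2.5.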
Relation to SC Weights:
SDID weights are similar to those used in Abadie, Diamond, and Hainmueller (2010), with two differences:
- Inclusion of an intercept term: The SDID weights need not make the control pre-trends match the treated pre-trends exactly, only make them parallel. This flexibility comes from the unit fixed effects, which absorb any constant differences between units.
- Regularization penalty: Adopted from Doudchenko and Imbens (2016), it increases the dispersion of the weights and ensures their uniqueness.
Without the intercept and the regularization penalty, and with a single treated unit, the SDID weights coincide with those used in Abadie, Diamond, and Hainmueller (2010).
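To make the unit-weight step concrete, here is a small numerical sketch. It assumes the objective is the squared pre-period gap between the weighted controls (plus an intercept) and the treated average, with a ridge penalty \(\zeta^2 T_{pre} \lVert w \rVert^2\) and weights constrained to be nonnegative and sum to one. The Frank-Wolfe solver is my own choice for illustration, and the function name `sdid_unit_weights` is hypothetical.

```python
import numpy as np

def sdid_unit_weights(Y, N_c, T_pre, zeta=0.0, n_iter=1000):
    """Simplex-constrained, ridge-penalized unit weights via Frank-Wolfe (sketch)."""
    A = Y[:N_c, :T_pre].T              # (T_pre, N_c) control pre-period outcomes
    b = Y[N_c:, :T_pre].mean(axis=0)   # (T_pre,) average treated pre-period outcome
    # Optimizing out the intercept w_0 is equivalent to demeaning over time.
    A = A - A.mean(axis=0)
    b = b - b.mean()
    eta = zeta ** 2 * T_pre            # ridge penalty on the weights
    w = np.full(N_c, 1.0 / N_c)        # start from uniform (DID) weights
    for _ in range(n_iter):
        grad = 2 * A.T @ (A @ w - b) + 2 * eta * w
        d = -w.copy()
        d[np.argmin(grad)] += 1.0      # move toward the best simplex vertex
        curvature = 2 * (np.sum((A @ d) ** 2) + eta * (d @ d))
        if curvature <= 1e-15 or grad @ d >= 0:
            break                      # no descent direction left
        step = min(1.0, -(grad @ d) / curvature)  # exact line search, clipped
        w = w + step * d
    return w

# Toy data: the treated unit runs parallel to control unit 0 (offset by 5).
rng = np.random.default_rng(1)
N_c, T_pre = 3, 6
Y_co = rng.normal(size=(N_c, T_pre + 2))
Y = np.vstack([Y_co, Y_co[[0]] + 5.0])
w_hat = sdid_unit_weights(Y, N_c, T_pre, zeta=0.0)
```

Because of the intercept, the constant offset of 5 does not prevent a perfect pre-period fit: the weights only need to make the trends parallel. Setting `zeta` above zero spreads the weights across more control units.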
- Choose time weights
- Time weights also include an intercept term but no regularization (because correlation across time periods within the same unit is plausible, whereas correlation across units within the same period is not).
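In this section's notation, the time-weight problem can be sketched as follows. This is my reading of Arkhangelsky et al. (2021); the intercept symbol \(\lambda_0\) and the constraint that the pretreatment weights are nonnegative and sum to one are stated here as assumptions.

\[ (\hat{\lambda}_0, \hat{\lambda}^{sdid}) = \arg \min_{\lambda_0, \lambda} \sum_{i = 1}^{N_c} \left( \lambda_0 + \sum_{t = 1}^{T_{pre}} \lambda_t Y_{it} - \frac{1}{T_{post}} \sum_{t = T_{pre} + 1}^T Y_{it} \right)^2 \quad \text{s.t. } \lambda_t \ge 0, \ \sum_{t = 1}^{T_{pre}} \lambda_t = 1 \]

The sum runs over control units only, and the intercept \(\lambda_0\) is what allows posttreatment outcomes to differ from the weighted pretreatment average by a constant.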
Note: To account for time-varying covariates in the weights, one can replace the observed outcomes with the residuals from a regression of the outcome on these covariates: \(Y_{it}^{res} = Y_{it} - X_{it} \hat{\beta}\), where \(\hat{\beta}\) comes from the regression \(Y_{it} = X_{it} \beta + \varepsilon_{it}\).
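The residualization in this note can be sketched as a pooled least-squares fit. The example below mirrors the formula as written (no intercept or fixed effects in the auxiliary regression, which is an assumption); the function name `residualize` and the array layout are hypothetical.

```python
import numpy as np

def residualize(Y, X):
    """Return Y_res = Y - X @ beta_hat from a pooled OLS of Y on X.

    Y : (N, T) outcomes.
    X : (N, T, K) time-varying covariates.
    """
    N, T, K = X.shape
    X_stacked = X.reshape(N * T, K)   # one row per unit-period observation
    beta_hat, *_ = np.linalg.lstsq(X_stacked, Y.reshape(-1), rcond=None)
    Y_res = Y - X @ beta_hat          # (N, T) residualized outcomes
    return Y_res, beta_hat

# Toy check: outcomes generated exactly from two covariates.
rng = np.random.default_rng(2)
X = rng.normal(size=(4, 5, 2))
beta_true = np.array([3.0, -1.0])
Y = X @ beta_true
Y_res, beta_hat = residualize(Y, X)
```

The residualized panel `Y_res` would then take the place of `Y` when computing the unit and time weights.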
The SDID method can account for systematic effects, often referred to as unit effects or unit heterogeneity, which influence treatment assignment (i.e., when treatment assignment is correlated with these systematic effects). Consequently, it provides unbiased estimates, especially valuable when there’s a suspicion that the treatment might be influenced by persistent, unit-specific attributes.
Even under completely random assignment, where SDID, DID, and SC are all unbiased, SDID has the smallest SE.