## 26.12 Assumptions

**Parallel Trends**: The difference between the treatment and control groups would have remained constant in the absence of treatment. DiD should be used in cases where

- you observe outcomes before and after an event, and
- you have treatment and control groups,

but not in cases where

- treatment assignment is not random, or
- there are confounders.

To support this assumption, we use the Prior Parallel Trends Test (Section 26.12.1).

**Linear additive effects** (of group/unit-specific and time-specific components): If the two sets of effects interact rather than enter additively, we have to use the weighted 2FE estimator (Imai and Kim 2021). This issue typically arises in staggered DiD.

**No anticipation**: There is no causal effect of the treatment before its implementation.
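To make the setup concrete, here is a minimal simulation sketch (in Python, with made-up numbers) of the canonical 2×2 design: when parallel trends holds, the difference-in-differences of group means recovers the treatment effect even though the groups differ in levels.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical 2x2 example: under parallel trends, both groups share the
# same time trend; treatment adds a constant effect for the treated group
# in the post period. Groups are allowed to differ in levels.
n = 5000
effect, trend = 2.0, 1.5
base = {"treat": 4.0, "control": 1.0}

means = {}
for g in ("treat", "control"):
    for period, t in (("pre", 0), ("post", 1)):
        tau = effect if (g == "treat" and period == "post") else 0.0
        y = base[g] + trend * t + tau + rng.normal(0, 1, n)
        means[(g, period)] = y.mean()

# Difference-in-differences of the four cell means
did = ((means[("treat", "post")] - means[("treat", "pre")])
       - (means[("control", "post")] - means[("control", "pre")]))
print(round(did, 2))
```

The level gap (4.0 vs. 1.0) and the shared trend both difference out; only the treatment effect remains.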

**Possible issues**

Estimate depends on functional form:

- When the size of the response depends (nonlinearly) on the size of the intervention, we might want to compare the group with high treatment intensity against the group with low intensity.

Selection on (time-varying) unobservables

- Assess the overall sensitivity of coefficient estimates to hidden bias using Rosenbaum Bounds.

Long-term effects

- Parallel trends are more likely to hold over shorter periods (windows of observation).

Heterogeneous effects

- Different intensity (e.g., doses) for different groups.

Ashenfelter dip (Ashenfelter and Card 1985): job training program participants are more likely to experience an earnings drop prior to enrolling in these programs.

- Participants are systematically different from nonparticipants before the treatment, raising the question of whether changes are permanent or transitory.
- A fix for this transient endogeneity is to calculate long-run differences (exclude a number of periods symmetrically around the adoption/implementation date). If we see a sustained impact, then we have strong evidence for the causal impact of a policy (Proserpio and Zervas 2017b; James J. Heckman and Smith 1999; Jepsen, Troske, and Coomes 2014; X. Li, Gan, and Hu 2011).

Response to the event might not be immediate (i.e., it cannot be observed right away in the dependent variable)

- Using a lagged dependent variable \(Y_{it-1}\) might be more appropriate (Blundell and Bond 1998)
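A quick numerical illustration of why: when the outcome adjusts gradually, say \(Y_t = \rho Y_{t-1} + \beta D_t\), a permanent treatment's impact accumulates toward \(\beta/(1-\rho)\), so the effect measured right after adoption understates the long-run effect. The Python sketch below uses hypothetical \(\rho\) and \(\beta\); note that Blundell and Bond's actual contribution is a GMM estimator for such dynamic panels, which is not shown here.

```python
# Hypothetical dynamics: Y_t = rho * Y_{t-1} + beta * D_t, with the
# treatment D switched on permanently at t = 0 and no noise.
rho, beta = 0.6, 1.0
y, path = 0.0, []
for t in range(20):
    y = rho * y + beta            # outcome adjusts gradually each period
    path.append(y)

# First-period effect vs. the long-run effect beta / (1 - rho)
print(round(path[0], 3), round(path[-1], 3), round(beta / (1 - rho), 3))
```

Here the first post-treatment period shows an effect of 1.0, while the effect converges to 2.5; a static before/after contrast taken too early would miss most of the response.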

Other factors that affect the difference in trends between the two groups (i.e., treatment and control) will bias your estimation.

Correlated observations within a group or over time

Incidental parameters problem (Lancaster 2000): it is still better to use individual and time fixed effects.

When examining the effects of variation in treatment timing, we have to be careful because group-specific weights can be negative if there is heterogeneity in the treatment effects over time (Athey and Imbens 2022; Borusyak, Jaravel, and Spiess 2021; Goodman-Bacon 2021). In this case, you should use the new estimands proposed by Callaway and Sant'Anna (2021) and Clément De Chaisemartin and d'Haultfoeuille (2020), implemented in the `did` package. If you expect lags and leads, see L. Sun and Abraham (2021). Gibbons, Suárez Serrato, and Urbancic (2018) caution against the standard estimator when we suspect that the treatment effect and treatment variance vary across groups.
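The problem is easy to reproduce. In the hypothetical noise-free simulation below (a Python sketch; the estimators cited above are implemented in the `did` package and related R packages), two cohorts adopt at different times and the effect grows with time since adoption; the static two-way fixed effects estimate falls far below the true average effect because early adopters serve as controls for late adopters.

```python
import numpy as np

# Hypothetical staggered-adoption panel, no noise: two cohorts adopt at
# t = 2 and t = 4, and the treatment effect grows with time since adoption.
T = 6
adopt = [2, 2, 4, 4]                           # adoption period per unit
units, times, D, Y, tau = [], [], [], [], []
for i, g in enumerate(adopt):
    for t in range(T):
        d = t >= g
        effect = float(t - g + 1) if d else 0.0   # dynamic effect
        units.append(i); times.append(t)
        D.append(float(d)); tau.append(effect)
        Y.append(i + t + effect)               # unit FE + time FE + effect
units, times = np.array(units), np.array(times)
D, Y, tau = np.array(D), np.array(Y), np.array(tau)

def two_way_demean(x, u, t):
    """x minus unit means minus time means plus grand mean (balanced panel)."""
    xd = x.astype(float).copy()
    for v in np.unique(u):
        xd[u == v] -= x[u == v].mean()
    for v in np.unique(t):
        xd[t == v] -= x[t == v].mean()
    return xd + x.mean()

# Static two-way FE estimate via Frisch-Waugh on the demeaned variables
Dd = two_way_demean(D, units, times)
Yd = two_way_demean(Y, units, times)
twfe = (Dd * Yd).sum() / (Dd ** 2).sum()
true_att = tau[D == 1].mean()
print(round(twfe, 2), round(true_att, 2))
```

With these numbers the static TWFE estimate is 0.5 while the average effect among treated observations is about 2.17, purely because of heterogeneous timing and dynamic effects, not noise.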

### 26.12.1 Prior Parallel Trends Test

- Plot the average outcomes over time for both treatment and control groups, before and after the treatment.
- Run a statistical test for a difference in trends (**using data from before the treatment period**):

\[ Y = \alpha_g + \beta_1 T + \beta_2 T\times G + \epsilon \]

where

- \(Y\) = the outcome variable
- \(\alpha_g\) = group fixed effects
- \(T\) = time (e.g., a specific year or month)
- \(G\) = the group (treatment) indicator
- \(\beta_2\) = the difference in time trends between the groups

Hence, \(\beta_2 = 0\) provides evidence that there is no difference in trends between the two groups prior to the treatment.

You can also use different functional forms (e.g., polynomial or nonlinear).

If \(\beta_2 \neq 0\) statistically, possible reasons include:

- Statistical significance can be driven by a large sample.
- The trends are otherwise so consistent that a single-period deviation can throw them off, producing statistical significance.

Technically, we can still salvage the research by including time fixed effects instead of just the before-and-after indicator (most researchers do this mechanically anyway nowadays). However, a side effect is that the time fixed effects can also absorb part of your treatment effect, especially in cases where the treatment effects vary over time, i.e., become stronger or weaker (Wolfers 2003).

Debate:

Kahn-Lang and Lang (2020) argue that DiD is more plausible when the treatment and control groups are similar not only in **trends** but also in **levels**: if we observe dissimilar levels prior to the treatment, why should we believe this will not affect future trends? Show a plot of the dependent variable's time series for the treated and control groups, along with a similar plot for a matched sample. Ryan et al. (2019) show evidence that matched DiD performs well in settings with non-parallel trends (at least in the health care setting).

In the case that we don’t have similar levels ex ante between treatment and control groups, functional form assumptions matter and we need justification for our choice.

Pre-trend statistical tests: Roth (2022) provides evidence that these tests are usually underpowered.

- See the `PretrendsPower` and `pretrends` packages for correcting this.

```
library(tidyverse)
library(fixest)

od <- causaldata::organ_donations %>%
    # Use only pre-treatment data
    filter(Quarter_Num <= 3) %>%
    # Treatment variable
    dplyr::mutate(California = State == 'California')

# use my package
causalverse::plot_par_trends(
    data = od,
    metrics_and_names = list("Rate" = "Rate"),
    treatment_status_var = "California",
    time_var = list(Quarter_Num = "Time"),
    display_CI = F
)
#> [[1]]
```

```
# do it manually
# it is always good to plot the dependent variable first
od |>
    # group by treatment status and time
    dplyr::group_by(California, Quarter) |>
    dplyr::summarize_all(mean) |>
    dplyr::ungroup() |>
    ggplot2::ggplot(aes(x = Quarter_Num, y = Rate, color = California)) +
    ggplot2::geom_line() +
    causalverse::ama_theme()
```

```
# but it's also important to use a statistical test
prior_trend <- fixest::feols(Rate ~ i(Quarter_Num, California) | State + Quarter,
                             data = od)
fixest::coefplot(prior_trend, grid = F)
```

This is alarming since one of the periods is significantly different from 0, which means that our parallel trends assumption is not plausible.

In cases where you might have violations of the parallel trends assumption, check Rambachan and Roth (2023), who:

- impose restrictions on how different the post-treatment violations of parallel trends can be from the pre-trends;
- provide partial identification of the causal parameter;
- offer a type of sensitivity analysis.

```
# https://github.com/asheshrambachan/HonestDiD
# remotes::install_github("asheshrambachan/HonestDiD")
# library(HonestDiD)
```

Alternatively, Ban and Kedagni (2022) propose a method that, given an information set (i.e., pre-treatment covariates) and an assumption on the selection bias in the post-treatment period (i.e., that it lies within the convex hull of all pre-treatment selection biases), can still identify a set for the ATT; with a stricter assumption on the selection bias from the policymaker's perspective, it can also deliver a point estimate.

Alternatively, we can use the `pretrends` package to examine our assumptions (Roth 2022).

### 26.12.2 Placebo Test

Procedure:

- Use only data from the period before the treatment.
- Consider different fake cutoffs in time; either
  - try the whole sequence of possible cutoffs, or
  - generate random treatment periods and use **randomization inference** to account for the sampling distribution of the fake effect.
- Estimate the DiD model, setting post-time = 1 according to the fake cutoff.
- A significant DiD coefficient means that you violate the parallel trends assumption; you have a big problem.
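A Python sketch of the randomization-inference variant (all numbers hypothetical; the R example below tests fake cutoffs instead): on pre-treatment data with no real treatment anywhere, we pick a fake cutoff, compute the placebo DiD, and compare it against the distribution obtained by re-assigning the "treated" label at random.

```python
import numpy as np

rng = np.random.default_rng(1)

# Hypothetical pre-treatment panel: 40 units (half labeled "treated"),
# 8 periods, a common trend, and no actual treatment anywhere.
n_units, n_periods = 40, 8
treated = np.arange(n_units) < n_units // 2
y = (rng.normal(0, 1, (n_units, 1))              # unit fixed effects
     + 0.5 * np.arange(n_periods)                # common time trend
     + rng.normal(0, 1, (n_units, n_periods)))   # noise

def did_estimate(y, treated, cutoff):
    """Simple 2x2 DiD with a fake post indicator starting at `cutoff`."""
    post = np.arange(y.shape[1]) >= cutoff
    return ((y[treated][:, post].mean() - y[treated][:, ~post].mean())
            - (y[~treated][:, post].mean() - y[~treated][:, ~post].mean()))

# Placebo estimate at a fake cutoff, and its randomization distribution
obs = did_estimate(y, treated, cutoff=4)
perm = [did_estimate(y, rng.permutation(treated), cutoff=4)
        for _ in range(999)]
p_value = (1 + sum(abs(b) >= abs(obs) for b in perm)) / 1000
print(round(p_value, 3))
```

Because there is no treatment in these data, a small randomization p-value here would indicate a parallel-trends problem rather than a real effect.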

Alternatively,

- When the data have multiple control groups, drop the treated group and assign one of the control groups as a "fake" treated group. Even if this fails (i.e., you find a significant DiD effect among the control groups), it can still be fine. However, this method is more commonly used under Synthetic Control.

```
library(tidyverse)
library(fixest)

od <- causaldata::organ_donations %>%
    # Use only pre-treatment data
    dplyr::filter(Quarter_Num <= 3) %>%
    # Create fake treatment variables
    dplyr::mutate(
        FakeTreat1 = State == 'California' &
            Quarter %in% c('Q12011', 'Q22011'),
        FakeTreat2 = State == 'California' &
            Quarter == 'Q22011'
    )

clfe1 <- fixest::feols(Rate ~ FakeTreat1 | State + Quarter,
                       data = od)
clfe2 <- fixest::feols(Rate ~ FakeTreat2 | State + Quarter,
                       data = od)
fixest::etable(clfe1, clfe2)
#> clfe1 clfe2
#> Dependent Var.: Rate Rate
#>
#> FakeTreat1TRUE 0.0061 (0.0051)
#> FakeTreat2TRUE -0.0017 (0.0028)
#> Fixed-Effects: --------------- ----------------
#> State Yes Yes
#> Quarter Yes Yes
#> _______________ _______________ ________________
#> S.E.: Clustered by: State by: State
#> Observations 81 81
#> R2 0.99377 0.99376
#> Within R2 0.00192 0.00015
#> ---
#> Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
```

We would like the “supposed” DiD to be insignificant.

**Robustness Check**

Placebo DiD (if the DiD estimate \(\neq 0\), the parallel trends assumption is violated and the original DiD is biased):

- Group: use a fake treatment group, i.e., a population that was **not** affected by the treatment.
- Time: redo the DiD analysis for a period before the treatment, where the expected treatment effect is 0 (e.g., the previous year or period).

Additional checks:

- Possible alternative control group: expected results should be similar.
- Try different windows (further away from the treatment point, other factors can creep in and nullify your effect).
- Treatment reversal (what if we do not observe the treatment event?).
- Higher-order polynomial time trend (to relax the linearity assumption).
- Test whether other dependent variables that should not be affected by the event are indeed unaffected, using the same control and treatment periods (if DiD \(\neq 0\), there is a problem).

### 26.12.3 Rosenbaum Bounds

Rosenbaum Bounds assess the overall sensitivity of coefficient estimates to hidden bias (Rosenbaum 2002) without requiring knowledge (e.g., of the direction) of the bias. This method is also known as **worst-case analysis** (DiPrete and Gangl 2004).

Consider a treatment assignment in which the odds of treatment for a unit and its matched control differ by a multiplier \(\Gamma\) (where \(\Gamma = 1\) means the odds of assignment are identical, i.e., random treatment assignment).

- This bias arises from an unobservable that influences both treatment selection and the outcome by a factor of \(\Gamma\) (omitted variable bias).

Using this technique, we may estimate the upper limit of the p-value for the treatment effect while assuming selection on unobservables of magnitude \(\Gamma\).

Usually, we would create a table of different levels of \(\Gamma\) to assess how the magnitude of biases can affect our evidence of the treatment effect (estimate).
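The mechanics can be sketched for the simplest case, the sign test on matched pairs (a Python toy with hypothetical counts; the R packages listed below handle the general case): under hidden bias \(\Gamma\), the worst-case probability that a pair favors the treated unit is \(\Gamma/(1+\Gamma)\), which yields an upper bound on the p-value at each \(\Gamma\).

```python
from math import comb

def sign_test_upper_p(n_pos, n_pairs, gamma):
    """Upper bound on the one-sided sign-test p-value when the odds of
    treatment within a matched pair may differ by a factor of gamma."""
    p = gamma / (1.0 + gamma)      # worst-case P(pair favors treated unit)
    return sum(comb(n_pairs, k) * p ** k * (1 - p) ** (n_pairs - k)
               for k in range(n_pos, n_pairs + 1))

# Hypothetical example: the treated unit has the higher outcome in 23 of
# 30 matched pairs; how large must gamma be before significance is lost?
for gamma in (1.0, 1.5, 2.0, 3.0):
    print(gamma, round(sign_test_upper_p(23, 30, gamma), 4))
```

The resulting table shows the p-value bound rising with \(\Gamma\); the \(\Gamma\) at which it crosses 0.05 is the sensitivity value reported in the text.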

If treatment assignment is clustered (e.g., within school, within state), we need to adjust the bounds for clustered treatment assignment (Hansen, Rosenbaum, and Small 2014), similar to the logic of clustered standard errors.

Then we can report the minimum value of \(\Gamma\) at which the treatment effect is nullified (i.e., becomes insignificant). The literature's rule of thumb is that if \(\Gamma > 2\), we have strong evidence that the treatment effect is robust to large biases (Proserpio and Zervas 2017a).

Packages

- `rbounds` (Keele 2010)
- `sensitivitymv` (Rosenbaum 2015)
- `sensitivitymw` (Rosenbaum 2015)

### References

*The Review of Economics and Statistics*67 (4): 648. https://doi.org/10.2307/1924810.

*Journal of Econometrics*226 (1): 62–79.

*arXiv Preprint arXiv:2211.06710*.

*Journal of Econometrics*87 (1): 115–43.

*arXiv Preprint arXiv:2108.12419*.

*American Economic Review*110 (9): 2964–96.

*Sociological Methodology*34 (1): 271–310.

*Journal of Econometric Methods*8 (1): 20170002.

*Journal of Econometrics*225 (2): 254–77.

*Journal of the American Statistical Association*109 (505): 133–44.

*The Economic Journal*109 (457): 313–48. https://doi.org/10.1111/1468-0297.00451.

*Political Analysis*29 (3): 405–15.

*Journal of Labor Economics*32 (1): 95–121. https://doi.org/10.1086/671809.

*Journal of Business & Economic Statistics*38 (3): 613–20.

*White Paper. Columbus, OH*1: 15.

*Journal of Econometrics*95 (2): 391–413.

*The Journal of Socio-Economics*40 (4): 404–11. https://doi.org/10.1016/j.socec.2011.04.012.

*Marketing Science*36 (5): 645–65.

*Marketing Science*36 (5): 645–65. https://doi.org/10.1287/mksc.2017.1043.

*Review of Economic Studies*, rdad018.

*Observational Studies*1 (2): 1–17.

*Overt Bias in Observational Studies*. Springer.

*American Economic Review*4 (3): 305–22.

*Statistical Methods in Medical Research*28 (12): 3697–3711.

*Journal of Econometrics*225 (2): 175–99.

*International Finance*6 (1): 1–26.