30.3 Simple Difference-in-Differences

Difference-in-Differences originated as a tool to analyze natural experiments, but its applications extend far beyond that. DID is built on the Fixed Effects Estimator, making it a fundamental approach for policy evaluation and causal inference in observational studies.

DID combines two comparisons, one across groups and one over time:

  • Cross-sectional comparison (treated vs. control in the same period): helps avoid omitted variable bias due to trends common to both groups.
  • Time-series comparison (the same group before vs. after): helps mitigate omitted variable bias due to time-invariant cross-sectional heterogeneity.
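As a minimal sketch of why each single comparison falls short, consider noiseless cell means under an assumed data-generating process (baseline 5, group gap 3, common trend 2, true effect 4). The function `cell_mean` and these numbers are illustrative assumptions, not estimates from real data.

```r
# Hypothetical noiseless cell means:
# baseline 5, group gap 3, common trend 2, true treatment effect 4
cell_mean <- function(treated, post) {
  5 + 3 * treated + 2 * post + 4 * treated * post
}

y_t_post <- cell_mean(1, 1)  # treated, post
y_t_pre  <- cell_mean(1, 0)  # treated, pre
y_c_post <- cell_mean(0, 1)  # control, post
y_c_pre  <- cell_mean(0, 0)  # control, pre

# Cross-sectional comparison in the post period:
# the common trend cancels, but the group gap (3) remains
cross_section <- y_t_post - y_c_post

# Before-after comparison for the treated group:
# the group gap cancels, but the common trend (2) remains
before_after <- y_t_post - y_t_pre

# DID removes both the group gap and the common trend
did <- (y_t_post - y_t_pre) - (y_c_post - y_c_pre)

print(c(cross_section = cross_section,
        before_after = before_after,
        did = did))
```

Under these assumptions the cross-sectional comparison gives 7, the before-after comparison gives 6, and only DID recovers the effect of 4.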

30.3.1 Basic Setup of DID

Consider a simple setting with:

  • Treatment Group (\(D_i = 1\))
  • Control Group (\(D_i = 0\))
  • Pre-Treatment Period (\(T = 0\))
  • Post-Treatment Period (\(T = 1\))

| | After Treatment (\(T = 1\)) | Before Treatment (\(T = 0\)) |
|---|---|---|
| Treated (\(D_i = 1\)) | \(E[Y_{1i}(1) \mid D_i = 1]\) | \(E[Y_{0i}(0) \mid D_i = 1]\) |
| Control (\(D_i = 0\)) | \(E[Y_{0i}(1) \mid D_i = 0]\) | \(E[Y_{0i}(0) \mid D_i = 0]\) |

The fundamental challenge is that we cannot observe \(E[Y_{0i}(1)|D_i = 1]\), the counterfactual outcome for the treated group in the post-treatment period had they not received treatment.


DID estimates the Average Treatment Effect on the Treated using the following formula:

\[ \begin{aligned} E[Y_1(1) - Y_0(1) | D = 1] &= \{E[Y(1)|D = 1] - E[Y(1)|D = 0] \} \\ &- \{E[Y(0)|D = 1] - E[Y(0)|D = 0] \} \end{aligned} \]

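This formulation differences out time-invariant unobserved differences between the groups as well as shocks common to both groups. It identifies the ATT only if the parallel trends assumption holds: absent treatment, the treated group's average outcome would have changed by the same amount as the control group's.

  • For the treated group, we isolate the difference between being treated and not being treated.
  • If, absent treatment, the treated group would have followed a different trajectory from the control group, the DID estimate is biased.
  • Because the control group is never treated, we cannot infer what the treatment effect would be for that group; the estimand is the effect on the treated, not on the whole population.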
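
To see where the DID formula comes from, it may help to write out the parallel trends step explicitly. The following is a sketch in the notation above, where \(Y_0(t)\) and \(Y_1(t)\) are the untreated and treated potential outcomes in period \(t\) and \(Y(t)\) is the observed outcome; it additionally assumes no anticipation, so that \(Y(0) = Y_0(0)\) for both groups. Parallel trends states that, absent treatment, both groups would have changed by the same amount on average:

\[ E[Y_0(1) - Y_0(0) | D = 1] = E[Y_0(1) - Y_0(0) | D = 0] \]

Solving for the unobserved counterfactual, and using \(Y(1) = Y_0(1)\) for the control group:

\[ E[Y_0(1) | D = 1] = E[Y(0)|D = 1] + E[Y(1)|D = 0] - E[Y(0)|D = 0] \]

Substituting into the ATT, with \(Y(1) = Y_1(1)\) for the treated group:

\[ \begin{aligned} E[Y_1(1) - Y_0(1) | D = 1] &= E[Y(1)|D = 1] - E[Y(0)|D = 1] - E[Y(1)|D = 0] + E[Y(0)|D = 0] \\ &= \{E[Y(1)|D = 1] - E[Y(1)|D = 0] \} - \{E[Y(0)|D = 1] - E[Y(0)|D = 0] \} \end{aligned} \]
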
# Load required libraries
library(dplyr)
library(ggplot2)
set.seed(1)

# Simulated dataset for illustration
data <- data.frame(
  time = rep(c(0, 1), each = 50),  # Pre (0) and Post (1)
  treated = rep(c(0, 1), times = 50), # Control (0) and Treated (1)
  error = rnorm(100)
)

# Generate outcome variable
data$outcome <-
    5 + 3 * data$treated + 2 * data$time + 
    4 * data$treated * data$time + data$error

# Compute averages for 2x2 table
table_means <- data %>%
  group_by(treated, time) %>%
  summarize(mean_outcome = mean(outcome), .groups = "drop") %>%
  mutate(
    group = paste0(ifelse(treated == 1, "Treated", "Control"), ", ", 
                   ifelse(time == 1, "Post", "Pre"))
  )

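# Display the 2x2 table (one column per group-period cell)
# pivot_wider() supersedes spread(); names_sort keeps alphabetical column order
table_2x2 <- table_means %>%
  select(group, mean_outcome) %>%
  tidyr::pivot_wider(names_from = group,
                     values_from = mean_outcome,
                     names_sort = TRUE)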

print("2x2 Table of Mean Outcomes:")
#> [1] "2x2 Table of Mean Outcomes:"
print(table_2x2)
#> # A tibble: 1 × 4
#>   `Control, Post` `Control, Pre` `Treated, Post` `Treated, Pre`
#>             <dbl>          <dbl>           <dbl>          <dbl>
#> 1            7.19           5.20            14.0           8.00

# Calculate Diff-in-Diff manually

# Treated, Post
Y11 <- table_means$mean_outcome[table_means$group == "Treated, Post"]  

# Treated, Pre
Y10 <- table_means$mean_outcome[table_means$group == "Treated, Pre"]   

# Control, Post
Y01 <- table_means$mean_outcome[table_means$group == "Control, Post"]  

# Control, Pre
Y00 <- table_means$mean_outcome[table_means$group == "Control, Pre"]   

diff_in_diff_formula <- (Y11 - Y10) - (Y01 - Y00)

# Estimate DID using OLS
model <- lm(outcome ~ treated * time, data = data)
ols_estimate <- coef(model)["treated:time"]

# Print results
results <- data.frame(
  Method = c("Diff-in-Diff Formula", "OLS Estimate"),
  Estimate = c(diff_in_diff_formula, ols_estimate)
)

print("Comparison of DID Estimates:")
#> [1] "Comparison of DID Estimates:"
print(results)
#>                            Method Estimate
#>              Diff-in-Diff Formula 4.035895
#> treated:time         OLS Estimate 4.035895

# Visualization
ggplot(data,
       aes(
           x = as.factor(time),
           y = outcome,
           color = as.factor(treated),
           group = treated
       )) +
    stat_summary(fun = mean, geom = "point", size = 3) +
    stat_summary(fun = mean,
                 geom = "line",
                 linetype = "dashed") +
    labs(
        title = "Difference-in-Differences Visualization",
        x = "Time (0 = Pre, 1 = Post)",
        y = "Outcome",
        color = "Group"
    ) +
    scale_color_manual(labels = c("Control", "Treated"),
                       values = c("blue", "red")) +
    causalverse::ama_theme()

| | Control (0) | Treated (1) |
|---|---|---|
| Pre (0) | \(\bar{Y}_{00} = 5\) | \(\bar{Y}_{10} = 8\) |
| Post (1) | \(\bar{Y}_{01} = 7\) | \(\bar{Y}_{11} = 14\) |

The table organizes the mean outcomes into four cells:

  1. Control Group, Pre-period (\(\bar{Y}_{00}\)): Mean outcome for the control group before the intervention.

  2. Control Group, Post-period (\(\bar{Y}_{01}\)): Mean outcome for the control group after the intervention.

  3. Treated Group, Pre-period (\(\bar{Y}_{10}\)): Mean outcome for the treated group before the intervention.

  4. Treated Group, Post-period (\(\bar{Y}_{11}\)): Mean outcome for the treated group after the intervention.

The DID treatment effect calculated from the simple formula of averages is identical to the estimate from an OLS regression with an interaction term.

The treatment effect can be obtained in two ways.

Compute it manually from the four cell means:

\(\text{DID} = (\bar{Y}_{11} - \bar{Y}_{10}) - (\bar{Y}_{01} - \bar{Y}_{00})\)

Or estimate it by OLS regression, where \(\beta_3\), the coefficient on the interaction term, is the DID estimate:

\(Y_{it} = \beta_0 + \beta_1 \text{treated}_i + \beta_2 \text{time}_t + \beta_3 (\text{treated}_i \cdot \text{time}_t) + \epsilon_{it}\)

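Using the rounded cell means from the table:

\(\text{DID} = (14 - 8) - (7 - 5) = 6 - 2 = 4\)

This equals the true interaction coefficient used to simulate the data (\(\beta_3 = 4\)). With the unrounded means, the formula gives 4.036, which is exactly the OLS estimate of the interaction term.

Both methods give the same result.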
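
The equivalence holds because the interaction model is saturated: each of its four coefficients is a function of the four cell means. The sketch below checks this on a freshly simulated 2x2 design that reuses the coefficients of the simulation above; the object names (`sim`, `ybar`, `comparison`) and the 25 units per cell are assumptions for illustration.

```r
set.seed(1)

# Hypothetical 2x2 design with 25 observations per cell
sim <- expand.grid(unit = 1:25, treated = c(0, 1), time = c(0, 1))
sim$outcome <- 5 + 3 * sim$treated + 2 * sim$time +
  4 * sim$treated * sim$time + rnorm(nrow(sim))

# Cell means, indexed as ybar[group, period]
ybar <- tapply(sim$outcome, list(sim$treated, sim$time), mean)

# Saturated regression with an interaction term
fit <- lm(outcome ~ treated * time, data = sim)
b <- coef(fit)

# Each coefficient expressed through the four cell means
comparison <- data.frame(
  term = names(b),
  ols = unname(b),
  from_means = c(
    ybar["0", "0"],                           # beta_0: control, pre
    ybar["1", "0"] - ybar["0", "0"],          # beta_1: pre-period group gap
    ybar["0", "1"] - ybar["0", "0"],          # beta_2: change in the control group
    (ybar["1", "1"] - ybar["1", "0"]) -
      (ybar["0", "1"] - ybar["0", "0"])       # beta_3: DID
  )
)
print(comparison)
```

The two numeric columns agree up to floating-point error, so the regression adds standard errors and a convenient way to include covariates rather than a different point estimate.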


30.3.2 Extensions of DID

30.3.2.1 DID with More Than Two Groups or Time Periods

DID can be extended to multiple treatments, multiple controls, and more than two periods:

\[ Y_{igt} = \alpha_g + \gamma_t + \beta I_{gt} + \delta X_{igt} + \epsilon_{igt} \]

where:

  • \(\alpha_g\) = Group-Specific Fixed Effects (e.g., firm, region).

  • \(\gamma_t\) = Time-Specific Fixed Effects (e.g., year, quarter).

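  • \(\beta\) = DID effect.

  • \(I_{gt}\) = treatment indicator (Treatment × Post-Treatment), equal to 1 when group \(g\) is treated in period \(t\) and 0 otherwise.

  • \(\delta X_{igt}\) = additional covariates \(X_{igt}\) with coefficients \(\delta\).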

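This is known as the Two-Way Fixed Effects (TWFE) DID model. However, TWFE can give biased estimates under staggered treatment adoption, where different groups receive treatment at different times: when treatment effects vary across groups or over time, already-treated groups end up serving as controls for later-treated groups.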
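
A minimal sketch of this specification on simulated data, using base R `lm()` with group and period dummies. The panel dimensions, the single common adoption date, and the parameter values (\(\beta = 2\), \(\delta = 1\)) are assumptions chosen for illustration. The same model could be written with `fixest::feols()` by placing the fixed effects after `|`.

```r
set.seed(42)

# Hypothetical panel: 6 groups, 8 periods, 50 units per group-period cell
panel <- expand.grid(unit = 1:50, group = 1:6, time = 1:8)

# Groups 1-3 are treated from period 5 onward (one common adoption date)
panel$I_gt <- as.integer(panel$group <= 3 & panel$time >= 5)

# One additional covariate
panel$x <- rnorm(nrow(panel))

panel$y <- 0.5 * panel$group +   # alpha_g: group fixed effects
  0.3 * panel$time +             # gamma_t: time fixed effects
  2.0 * panel$I_gt +             # beta: assumed true DID effect
  1.0 * panel$x +                # delta: assumed covariate effect
  rnorm(nrow(panel))

# TWFE regression: group and period dummies absorb alpha_g and gamma_t
twfe <- lm(y ~ I_gt + x + factor(group) + factor(time), data = panel)

beta_hat <- coef(twfe)[["I_gt"]]
print(round(coef(twfe)[c("I_gt", "x")], 3))
```

The default `lm()` standard errors ignore within-group correlation; in applied work they are usually clustered at the group level.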


30.3.2.2 Examining Long-Term Effects (Dynamic DID)

To examine dynamic treatment effects (in designs without rollout/staggered adoption), we can create a centered time variable.

| Centered Time Variable | Interpretation |
|---|---|
| \(t = -2\) | Two periods before treatment |
| \(t = -1\) | One period before treatment |
| \(t = 0\) | Last pre-treatment period, right before the treatment period (baseline/reference group) |
| \(t = 1\) | Treatment period |
| \(t = 2\) | One period after treatment |

Dynamic Treatment Model Specification

By interacting this factor variable with the treatment indicator, we can examine the dynamic effect of treatment (i.e., whether it is fading or intensifying):

\[ \begin{aligned} Y &= \alpha_0 + \alpha_1 Group + \alpha_2 Time \\ &+ \beta_{-T_1} Treatment + \beta_{-(T_1 -1)} Treatment + \dots + \beta_{-1} Treatment \\ &+ \beta_1 Treatment + \dots + \beta_{T_2} Treatment \end{aligned} \]

where:

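  • \(\beta_0\) (baseline period) is the reference category and is dropped from the model.

  • \(T_1\) = number of pre-treatment periods (excluding the baseline).

  • \(T_2\) = number of post-treatment periods.

  • Each \(Treatment\) term is the treatment-group indicator interacted with the dummy for the corresponding period, so the coefficients \(\beta_t\) trace the effect over time relative to the baseline.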

Key Observations:

  • Pre-treatment coefficients should be close to zero (\(\beta_{-T_1}, \dots, \beta_{-1} \approx 0\)), which is consistent with parallel pre-trends.

  • Post-treatment coefficients (\(\beta_1, \dots, \beta_{T_2}\)) measure the treatment effect over time; if the treatment has an effect, they differ significantly from zero.

  • Higher standard errors with more interactions: including too many leads and lags can reduce precision.
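
The specification above can be sketched on simulated data with base R, using `relevel()` to make \(t = 0\) the omitted period. The panel size and the assumed true effects (zero before treatment, then 1.0 and 1.5) are illustrative assumptions only.

```r
set.seed(7)

# Hypothetical panel: 100 units per group, centered time from -2 to 2
es <- expand.grid(unit = 1:100, treated = c(0, 1), t = -2:2)

# Assumed true dynamic effects: none before treatment, then 1.0 and 1.5
true_effect <- c("-2" = 0, "-1" = 0, "0" = 0, "1" = 1.0, "2" = 1.5)

es$y <- 2 + 1 * es$treated + 0.5 * es$t +
  es$treated * unname(true_effect[as.character(es$t)]) +
  rnorm(nrow(es), sd = 0.5)

# Period dummies with t = 0 as the omitted reference period
es$period <- relevel(factor(es$t), ref = "0")

# Group effect, period effects, and one treatment interaction per period
fit_es <- lm(y ~ treated * period, data = es)

# Keep only the dynamic treatment coefficients (beta_t)
dyn <- coef(fit_es)[grep("^treated:period", names(coef(fit_es)))]
print(round(dyn, 2))
```

In this setup the two pre-treatment coefficients should be near zero and the two post-treatment coefficients near their assumed values, up to sampling noise.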


30.3.2.3 DID on Relationships, Not Just Levels

DID can also be applied to relationships between variables rather than just outcome levels.

For example, DID can estimate how a policy change alters a regression coefficient, such as the slope of the outcome on a covariate, by comparing that relationship before and after the change in the treated group relative to the control group.
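
One way to sketch this idea is a triple interaction: the coefficient on `x:treated:post` is the difference-in-differences of the slope of `y` on `x`. The variable names and the assumed slopes (1.0 at baseline, 0.5 extra for treated units after the change) are hypothetical.

```r
set.seed(3)
n <- 2000

# Hypothetical repeated cross-section with a covariate x
rel <- data.frame(
  treated = rbinom(n, 1, 0.5),
  post = rbinom(n, 1, 0.5),
  x = rnorm(n)
)

# Assumed slope of y on x: 1.0 at baseline, +0.2 for treated, +0.1 post,
# and +0.5 extra for treated units in the post period (the DID in slopes)
slope <- 1.0 + 0.2 * rel$treated + 0.1 * rel$post +
  0.5 * rel$treated * rel$post

rel$y <- 1 + slope * rel$x + 0.3 * rel$treated + 0.2 * rel$post + rnorm(n)

# The full interaction lets both the level and the slope differ by cell
fit_rel <- lm(y ~ x * treated * post, data = rel)

# DID estimate for the relationship between x and y
slope_did <- coef(fit_rel)[["x:treated:post"]]
print(round(slope_did, 3))
```

The coefficient on `treated:post` in the same regression remains the usual DID in levels, evaluated at `x = 0`.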


30.3.3 Goals of DID

  1. Pre-Treatment Coefficients Should Be Insignificant
    • Check that \(\beta_{-T_1}, \dots, \beta_{-1}\) are statistically indistinguishable from zero (similar to a placebo test).
  2. Post-Treatment Coefficients Should Be Significant
    • Check whether \(\beta_1, \dots, \beta_{T_2}\) differ from zero.
    • Examine whether the post-treatment coefficients are increasing or decreasing over time.

library(tidyverse)
library(fixest)

od <- causaldata::organ_donations %>%
    # Treatment variable
    dplyr::mutate(California = State == 'California') %>%
    # Centered time variable: Quarter_Num 3 is the reference period,
    # i.e., the last period before treatment
    dplyr::mutate(center_time = as.factor(Quarter_Num - 3))

class(od$California)
#> [1] "logical"
class(od$State)
#> [1] "character"

cali <- feols(Rate ~ i(center_time, California, ref = 0) |
                  State + center_time,
              data = od)

etable(cali)
#>                                              cali
#> Dependent Var.:                              Rate
#>                                                  
#> California x center_time = -2    -0.0029 (0.0051)
#> California x center_time = -1   0.0063** (0.0023)
#> California x center_time = 1  -0.0216*** (0.0050)
#> California x center_time = 2  -0.0203*** (0.0045)
#> California x center_time = 3    -0.0222* (0.0100)
#> Fixed-Effects:                -------------------
#> State                                         Yes
#> center_time                                   Yes
#> _____________________________ ___________________
#> S.E.: Clustered                         by: State
#> Observations                                  162
#> R2                                        0.97934
#> Within R2                                 0.00979
#> ---
#> Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

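# Plot the interaction coefficients (the dynamic effects) with confidence intervals
iplot(cali, pt.join = TRUE)

# Coefficient plot for the same model
coefplot(cali)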
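
The first goal, that the pre-treatment coefficients are zero, can also be tested jointly rather than one coefficient at a time. The sketch below does this on simulated data with a base R F-test of nested models. The data-generating process (no pre-trends, constant effect of 1.2) and the variable names are assumptions, and the test uses unclustered standard errors, unlike the `feols` model above.

```r
set.seed(11)

# Hypothetical panel without pre-trends: 100 units per group, time -2..2
pl <- expand.grid(unit = 1:100, treated = c(0, 1), t = -2:2)
pl$post <- as.integer(pl$t > 0)
pl$y <- 2 + pl$treated + 0.5 * pl$t + 1.2 * pl$treated * pl$post +
  rnorm(nrow(pl), sd = 0.5)

# Period dummies with t = 0 as the reference period
pl$period <- relevel(factor(pl$t), ref = "0")

# Treatment-by-period interactions: leads (pre) and lags (post)
pl$lead2 <- pl$treated * (pl$t == -2)
pl$lead1 <- pl$treated * (pl$t == -1)
pl$lag1  <- pl$treated * (pl$t == 1)
pl$lag2  <- pl$treated * (pl$t == 2)

# Full model includes the leads; the restricted model sets them to zero
full <- lm(y ~ treated + period + lead2 + lead1 + lag1 + lag2, data = pl)
restricted <- lm(y ~ treated + period + lag1 + lag2, data = pl)

# H0: both pre-treatment coefficients equal zero
pre_test <- anova(restricted, full)
print(pre_test)

p_value <- pre_test[["Pr(>F)"]][2]
```

A large `p_value` means the data do not reject zero pre-treatment coefficients; it does not prove that parallel trends holds in the post-treatment period.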