21.6 Experimental vs. Quasi-Experimental Designs
Experimental and quasi-experimental designs differ in their approach to causal inference. The table below summarizes key distinctions:
| Experimental Design | Quasi-Experimental Design |
|---|---|
| Conducted by an experimentalist | Conducted by an observationalist |
| Uses experimental data | Uses observational data |
| Random assignment reduces treatment imbalance | Random sampling reduces sample selection error |
21.6.1 Criticisms of Quasi-Experimental Designs
Quasi-experimental methods do not always approximate experimental results accurately. For instance, LaLonde (1986) demonstrates that commonly used methods such as:
- Matching Methods
- Difference-in-differences
- Tobit-2 (Heckman-type) selection models
often fail to replicate experimental estimates reliably. This finding cast serious doubt on the credibility of observational studies for estimating causal effects, igniting an ongoing debate in econometrics and statistics about the reliability of nonexperimental evaluations.
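To ground what one of these estimators does, here is a minimal difference-in-differences sketch on simulated two-period data; the data-generating process, variable names, and the true effect of 2.0 are illustrative assumptions, not LaLonde's data or analysis.

```python
# Minimal difference-in-differences sketch on simulated panel data.
# All names and the data-generating process are illustrative assumptions.
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

rng = np.random.default_rng(0)
n = 500  # units per group

# Two groups observed in two periods; the treated group is exposed in the post period.
df = pd.DataFrame({
    "treated": np.repeat([0, 1], 2 * n),
    "post": np.tile([0, 1], 2 * n),
})
# Outcome: group effect + common time trend + true treatment effect of 2.0 + noise.
df["y"] = (
    1.0 * df["treated"]
    + 0.5 * df["post"]
    + 2.0 * df["treated"] * df["post"]
    + rng.normal(0, 1, len(df))
)

# The coefficient on treated:post is the DiD estimate of the treatment effect.
model = smf.ols("y ~ treated + post + treated:post", data=df).fit()
print(model.params["treated:post"])
```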
LaLonde’s critical assessment served as a catalyst for significant methodological and practical advancements in causal inference. In the decades since this publication, the field has evolved considerably, introducing both theoretical innovations and empirical practices aimed at addressing the limitations that were exposed (G. Imbens and Xu 2024). Among these advances are:
Emphasis on estimators based on unconfoundedness (selection on observables):
Modern causal inference frameworks frequently adopt the unconfoundedness or conditional independence assumption. Under this premise, treatment assignment is assumed to be independent of potential outcomes, conditional on observed covariates. This theoretical foundation underpins many widely used estimation techniques, such as matching methods, inverse probability weighting, and regression adjustment.
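As a concrete illustration, the following is a minimal sketch of one-to-one nearest-neighbor matching under unconfoundedness on simulated data; the covariates, the assignment model, and the true effect of 1.5 are illustrative assumptions.

```python
# Minimal one-to-one nearest-neighbor matching sketch under unconfoundedness.
# Simulated data and all names are illustrative assumptions.
import numpy as np
from sklearn.neighbors import NearestNeighbors

rng = np.random.default_rng(1)
n = 2000
x = rng.normal(0, 1, (n, 2))                           # observed covariates
p = 1 / (1 + np.exp(-(x[:, 0] + x[:, 1])))             # true assignment probability
d = rng.binomial(1, p)                                 # treatment indicator
y = x[:, 0] + x[:, 1] + 1.5 * d + rng.normal(0, 1, n)  # true effect = 1.5

# For each treated unit, find the nearest control unit in covariate space.
nn = NearestNeighbors(n_neighbors=1).fit(x[d == 0])
_, idx = nn.kneighbors(x[d == 1])

# ATT estimate: mean outcome difference between treated units and their matches.
att = np.mean(y[d == 1] - y[d == 0][idx.ravel()])
print(att)
```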
Focus on covariate overlap (common support):
Researchers now recognize the critical importance of overlap, also referred to as common support, in the distributions of covariates across treatment and control groups. Without sufficient overlap, comparisons between treated and untreated units rely on extrapolation, which weakens causal claims. Modern methods explicitly assess overlap, and often impose restrictions to ensure it, before proceeding with estimation.
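A minimal sketch of an overlap check under the same kind of simulated setup: estimate propensity scores, compare their ranges across groups, and trim extreme scores. The 0.1/0.9 trimming thresholds are an assumed rule of thumb, not a universal prescription.

```python
# Minimal overlap (common support) check: estimate propensity scores and
# inspect their distributions by treatment group. Names are illustrative.
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(2)
n = 2000
x = rng.normal(0, 1, (n, 2))
d = rng.binomial(1, 1 / (1 + np.exp(-(x[:, 0] + x[:, 1]))))

# Estimated propensity scores.
ps = LogisticRegression().fit(x, d).predict_proba(x)[:, 1]

# Compare the score ranges; disjoint ranges signal an overlap failure.
print("treated :", ps[d == 1].min(), ps[d == 1].max())
print("control :", ps[d == 0].min(), ps[d == 0].max())

# One common (assumed) remedy: trim units with extreme scores before estimation.
keep = (ps > 0.1) & (ps < 0.9)
print("kept", keep.sum(), "of", n, "units after trimming")
```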
Introduction of propensity score-based methods and doubly robust estimators:
The introduction of propensity score methods (Rosenbaum and Rubin 1983) was a breakthrough, offering a way to reduce the dimensionality of the covariate space while balancing observed characteristics across groups. More recently, doubly robust estimators have emerged, combining propensity score weighting with outcome regression. These estimators provide consistent treatment effect estimates if either the propensity score model or the outcome model is correctly specified, offering greater robustness in practice.
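The following sketch contrasts a plain inverse probability weighting estimate with an AIPW (doubly robust) estimate on simulated data; all names and the true effect of 1.5 are illustrative assumptions.

```python
# Minimal IPW and doubly robust (AIPW) sketch; the data-generating process
# and all names are illustrative assumptions, with a true effect of 1.5.
import numpy as np
from sklearn.linear_model import LinearRegression, LogisticRegression

rng = np.random.default_rng(3)
n = 5000
x = rng.normal(0, 1, (n, 2))
ps_true = 1 / (1 + np.exp(-(x[:, 0] - x[:, 1])))
d = rng.binomial(1, ps_true)
y = x[:, 0] + x[:, 1] + 1.5 * d + rng.normal(0, 1, n)

# Propensity score model.
ps = LogisticRegression().fit(x, d).predict_proba(x)[:, 1]

# Outcome models fit separately on treated and control units.
m1 = LinearRegression().fit(x[d == 1], y[d == 1]).predict(x)
m0 = LinearRegression().fit(x[d == 0], y[d == 0]).predict(x)

# Inverse probability weighting estimate of the ATE.
ate_ipw = np.mean(d * y / ps - (1 - d) * y / (1 - ps))

# AIPW combines both models; it is consistent if either one is correct.
ate_aipw = np.mean(
    m1 - m0
    + d * (y - m1) / ps
    - (1 - d) * (y - m0) / (1 - ps)
)
print(ate_ipw, ate_aipw)
```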
Greater emphasis on validation exercises to bolster credibility:
Modern studies increasingly incorporate validation techniques to evaluate the credibility of their findings. Placebo tests, falsification exercises, and sensitivity analyses are commonly employed to assess whether estimated effects may be driven by unobserved confounding or model misspecification. Such practices go beyond traditional goodness-of-fit statistics, directly interrogating the assumptions underlying causal inference.
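One common validation pattern is a placebo test on a pre-treatment outcome, which the treatment cannot have affected; a near-zero placebo estimate supports the design. A minimal sketch on assumed simulated data:

```python
# Minimal placebo test sketch: re-estimate the effect on a pre-treatment
# (placebo) outcome. If the design is sound, the placebo estimate should
# be near zero. All names and the setup are illustrative assumptions.
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

rng = np.random.default_rng(4)
n = 2000
x = rng.normal(0, 1, n)
d = rng.binomial(1, 1 / (1 + np.exp(-x)))
y_pre = x + rng.normal(0, 1, n)            # outcome measured before treatment
y_post = x + 1.5 * d + rng.normal(0, 1, n)

df = pd.DataFrame({"x": x, "d": d, "y_pre": y_pre, "y_post": y_post})

# Real estimate on the post-treatment outcome.
print(smf.ols("y_post ~ d + x", data=df).fit().params["d"])
# Placebo estimate on the pre-treatment outcome: should be close to zero.
print(smf.ols("y_pre ~ d + x", data=df).fit().params["d"])
```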
Methods for estimating and exploiting treatment effect heterogeneity:
Beyond estimating average treatment effects, contemporary research frequently explores heterogeneous treatment effects. These methods identify subgroups that may experience different causal impacts, which is of particular relevance in fields like personalized marketing, targeted interventions, and policy design.
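A minimal sketch of one simple approach, a T-learner: fit separate outcome models for treated and control units and difference their predictions to estimate conditional effects. The random-forest models and the data-generating process (true effect 1 + x1) are illustrative assumptions.

```python
# Minimal T-learner sketch for heterogeneous treatment effects.
# Data-generating process and all names are illustrative assumptions.
import numpy as np
from sklearn.ensemble import RandomForestRegressor

rng = np.random.default_rng(5)
n = 5000
x = rng.normal(0, 1, (n, 2))
d = rng.binomial(1, 0.5, n)
tau = 1 + x[:, 0]                          # effect varies with the first covariate
y = x[:, 1] + tau * d + rng.normal(0, 1, n)

# Separate outcome models for treated and control units.
m1 = RandomForestRegressor(random_state=0).fit(x[d == 1], y[d == 1])
m0 = RandomForestRegressor(random_state=0).fit(x[d == 0], y[d == 0])

# Estimated conditional effects; compare against the known tau.
cate = m1.predict(x) - m0.predict(x)
print(np.corrcoef(cate, tau)[0, 1])
```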
To illustrate the practical lessons from these methodological advances, G. Imbens and Xu (2024) reexamine two canonical datasets:
- LaLonde’s National Supported Work Demonstration data
- The Imbens-Rubin-Sacerdote draft lottery data
Applying modern causal inference methods to these datasets demonstrates that, when sufficient covariate overlap exists, robust estimates of the adjusted differences between treatment and control groups can be achieved. However, it is critical to underscore that robustness in estimation does not equate to validity. Without direct validation exercises, such as placebo tests, even well-behaved estimates may be misleading.
G. Imbens and Xu (2024) highlight several key lessons for practitioners working with nonexperimental data to estimate causal effects:
- Careful examination of the assignment process is essential. Understanding the mechanisms by which units are assigned to treatment or control conditions informs the plausibility of the unconfoundedness assumption.
- Inspection of covariate overlap is non-negotiable. Without sufficient overlap, causal effect estimation may rely heavily on model extrapolation, undermining credibility.
- Validation exercises are indispensable. Placebo tests and falsification strategies help ensure that estimated treatment effects are not artifacts of modeling choices or unobserved confounding.
While methodological advances have substantially improved the tools available for causal inference with observational data, their effective application requires rigorous attention to the underlying assumptions and diligent validation to support credible causal claims.