26.4 Limitations of Quasi-Experiments

Quasi-experimental methods offer valuable tools for causal inference when RCTs are not feasible. However, these designs come with important limitations that must be addressed transparently and rigorously. Because quasi-experiments rely on observational or naturally occurring data, the assumptions required to identify causal effects are generally more stringent, less testable, and more vulnerable to violation than in randomized settings.

Researchers have a responsibility not only to articulate the assumptions underlying their identification strategy, but also to assess critically the threats to both internal and external validity. The credibility of a quasi-experimental study often hinges on the clarity and transparency with which these limitations are confronted.

We structure the discussion around four fundamental questions that should guide both researchers and critical readers:


26.4.1 What Are the Identification Assumptions?

At the heart of any causal claim is a set of assumptions that link the observed data to an unobserved counterfactual. In quasi-experiments, these assumptions substitute for randomization, so researchers must specify and justify them clearly. For example:

  • Difference-in-Differences (DiD) relies on the parallel trends assumption: in the absence of treatment, the treated and control groups would have followed the same outcome trajectory (formalized below).
  • Regression Discontinuity (RD) assumes that units just above and just below the cutoff are comparable except for treatment assignment.
  • Instrumental Variables (IV) designs require that the instrument affects the outcome only through the treatment (the exclusion restriction) and is uncorrelated with unobserved confounders.
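
In potential outcomes notation (a standard formalization, with Y_it(0) the untreated outcome of unit i in period t and D_i the treatment-group indicator), parallel trends for a two-period design can be stated as

    \[
      \mathbb{E}\bigl[Y_{i1}(0) - Y_{i0}(0) \mid D_i = 1\bigr]
        = \mathbb{E}\bigl[Y_{i1}(0) - Y_{i0}(0) \mid D_i = 0\bigr].
    \]

The left-hand side involves the unobservable counterfactual trend of the treated group, which is why the assumption must be defended rather than tested directly.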

Best Practice: Use visual diagnostics, historical evidence, and pre-trend analyses to justify the plausibility of your identifying assumptions.
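
As an illustration, the following is a minimal sketch of a visual pre-trend diagnostic for a DiD design in Python. The DataFrame and its columns ('year', 'treated' as a 0/1 group indicator, and 'outcome') are hypothetical names chosen for the example, not a prescribed schema.

    # Minimal pre-trend diagnostic for DiD (column names are hypothetical).
    import pandas as pd
    import matplotlib.pyplot as plt

    def plot_pre_trends(df: pd.DataFrame, treatment_year: int) -> None:
        # Average the outcome within each year-by-group cell.
        means = (df.groupby(['year', 'treated'])['outcome']
                   .mean()
                   .unstack('treated')
                   .rename(columns={0: 'control', 1: 'treated'}))
        ax = means.plot(marker='o')
        ax.axvline(treatment_year, color='gray', linestyle='--')
        ax.set_xlabel('year')
        ax.set_ylabel('mean outcome')
        plt.show()

Roughly parallel lines before the treatment year lend plausibility to the assumption; they do not establish it, since the counterfactual post-treatment trend remains unobservable.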


26.4.2 What Are the Threats to Validity?

Even with well-articulated assumptions, quasi-experiments are vulnerable to a variety of threats that may compromise causal inference. These include:

  • Unobserved Confounding: Particularly problematic in cross-sectional designs or when the assignment mechanism is poorly understood.
  • Violation of SUTVA: Spillovers, interference between units, or hidden variation in treatment can violate the stable unit treatment value assumption, under which each unit's outcome depends only on its own treatment status.
  • Anticipation Effects or Pre-Treatment Trends: If agents react to anticipated treatments or if trends already diverge pre-treatment, causal interpretation is compromised.
  • Measurement Error in Assignment or Outcomes: In RD designs, mismeasured running variables can attenuate treatment effects or bias discontinuity estimates.
  • Manipulation or Sorting Around a Threshold: In RD or other assignment-based designs, strategic behavior can undermine the quasi-randomness of assignment (an informal check is sketched below).
  • Limited Overlap or Support: In IV or matching designs, treatment effects may only be identified for a narrow subpopulation (e.g., compliers).

Best Practice: Explicitly discuss how each threat could operate in your setting, and evaluate its plausibility using data and theory.
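
To make one of these concrete: an informal first look at the manipulation threat, in the spirit of the McCrary density test, is to inspect the distribution of the running variable near the cutoff. This is a minimal sketch assuming a numeric array `running` and a scalar `cutoff` (hypothetical names); formal implementations estimate the density separately on each side of the cutoff and test for a discontinuity.

    # Informal density check for sorting around an RD cutoff.
    import numpy as np
    import matplotlib.pyplot as plt

    def density_check(running: np.ndarray, cutoff: float, bins: int = 40) -> None:
        # A pile-up of observations just on the treatment-favorable side
        # of the cutoff is a symptom of sorting or manipulation.
        plt.hist(running, bins=bins, edgecolor='black')
        plt.axvline(cutoff, color='red', linestyle='--')
        plt.xlabel('running variable')
        plt.ylabel('count')
        plt.show()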


26.4.3 How Do You Address These Threats?

Robustness and sensitivity checks are essential for bolstering the credibility of quasi-experimental findings. These may include:

  • Placebo Tests: Check for treatment effects in periods or groups where no treatment occurred (see the sketch after this list).
  • Falsification Outcomes: Analyze outcomes that should not be affected by the treatment.
  • Pre-Trend Diagnostics: For DiD, plot outcome trends before the intervention to assess the plausibility of the parallel trends assumption.
  • Bandwidth and Polynomial Sensitivity (in RD): Vary bandwidths and functional forms to assess the stability of estimates (sketched after the Best Practice note below).
  • Alternative Specifications: Use different model formulations, control sets, or matching procedures to test robustness.
  • Heterogeneity Analysis: Examine treatment effects across subgroups to reveal where assumptions may break down.
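
The placebo idea can be sketched in a few lines for a two-group DiD. The code below is a minimal sketch under assumed column names ('outcome', 'treated' as a 0/1 group indicator, 'year'); the pseudo treatment year must lie strictly inside the pre-treatment window.

    # Placebo DiD: pretend treatment started earlier, using pre-period data only.
    import statsmodels.formula.api as smf

    def placebo_did(df, placebo_year, true_year):
        pre = df[df['year'] < true_year].copy()
        pre['post'] = (pre['year'] >= placebo_year).astype(int)
        # A significant interaction here is a warning sign: the design
        # finds an "effect" in a period where none should exist.
        fit = smf.ols('outcome ~ treated * post', data=pre).fit()
        # (In applied work, cluster standard errors at the unit level.)
        return fit.params['treated:post'], fit.pvalues['treated:post']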

Best Practice: Present robustness checks graphically where possible. Transparency in presenting weaknesses enhances credibility.
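
For the bandwidth check, a minimal local linear sketch follows, again with hypothetical column names ('running', 'outcome'); dedicated packages such as rdrobust provide robust, bias-corrected versions of this exercise.

    # RD sensitivity: re-estimate the discontinuity across several bandwidths.
    import statsmodels.formula.api as smf

    def rd_estimates(df, cutoff, bandwidths):
        results = {}
        for h in bandwidths:
            window = df[(df['running'] - cutoff).abs() <= h].copy()
            window['x'] = window['running'] - cutoff
            window['above'] = (window['x'] >= 0).astype(int)
            # Separate slopes on each side of the cutoff; the coefficient
            # on 'above' is the estimated jump at the threshold.
            fit = smf.ols('outcome ~ above * x', data=window).fit()
            results[h] = fit.params['above']
        return results

If the estimated jump swings substantially as the bandwidth varies, the reported effect is fragile to specification choices and should be interpreted accordingly.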


26.4.4 What Are the Implications for External Validity and Future Research?

Quasi-experimental estimates are often highly context-specific, and generalizing beyond the study setting requires caution.

  • Limited Scope of Identification: Many designs estimate Local Average Treatment Effects (LATEs) rather than population Average Treatment Effects (ATEs); the contrast is made explicit below.
  • Sample and Setting Specificity: Estimates may reflect idiosyncrasies of a particular institution, geography, or time period.
  • Policy Relevance: Consider whether the scale or nature of the intervention reflects real-world applications.
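
The first point can be made explicit. With a binary instrument and the standard IV assumptions (Imbens and Angrist 1994), IV recovers

    \[
      \text{LATE} = \mathbb{E}\bigl[Y_i(1) - Y_i(0) \mid \text{complier}\bigr],
      \qquad \text{not} \qquad
      \text{ATE} = \mathbb{E}\bigl[Y_i(1) - Y_i(0)\bigr],
    \]

and the two coincide only under additional assumptions, such as homogeneous treatment effects.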

Future research directions often emerge from the limitations of a quasi-experimental study. Examples include:

  • Identifying new sources of exogenous variation or natural experiments.
  • Combining quasi-experimental evidence with structural modeling to improve extrapolation.
  • Developing better diagnostics for assumption testing.
  • Replicating findings in other settings or populations.

Examples of Limitations and Responses

Threat                              | Example Design | Diagnostic Tool                   | Remedy or Check
------------------------------------+----------------+-----------------------------------+----------------------------------------------
Unobserved Confounding              | Matching, DiD  | Pre-trend test, covariate balance | Sensitivity analysis, falsification outcomes
Violation of Parallel Trends        | DiD            | Pre-treatment trends plot         | Group-specific trends, triple differences
Manipulation of Assignment Variable | RD             | Density test (McCrary)            | Exclude manipulated region, use fuzzy RD
Spillovers / Interference           | All            | Spatial or network data analysis  | Clustered design, SUTVA discussion
Narrow Population of Compliers      | IV             | Covariate balance for compliers   | Bounding methods, report LATE clearly

Quasi-experimental methods are powerful, but their strength lies not in perfection, but in transparency. A well-documented quasi-experiment, with clear limitations and open discussion of assumptions, often contributes more to scientific knowledge than a poorly reported experiment. As we proceed through the chapters, you will see both exemplary and problematic applications of these methods—with an emphasis on the discipline required to make credible causal claims from imperfect data.