24.10 Evaluation of an RD

  • Evidence (from either formal tests or graphs) for the following:

    • Treatment and outcomes change discontinuously at the cutoff, while other variables and pre-treatment outcomes do not.

    • No manipulation of the assignment variable.

  • Results are robust to various functional forms of the forcing variable

  • Is there any other (unobserved) confound that could cause the discontinuous change at the cutoff (e.g., multiple forcing variables or bundling of institutions)?

  • External validity: how likely is it that the result at the cutoff generalizes to units away from the cutoff?
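The no-manipulation check in the list above can be illustrated with a very rough sketch: if units cannot sort around the cutoff, then within a narrow window roughly half of the observations should fall on each side. The code below uses an exact binomial test as a simplified stand-in for a formal density test. The function names (`binomial_two_sided_p`, `bunching_check`), the window half-width `h`, and both data sets are hypothetical and chosen only for illustration.

```python
from math import comb

def binomial_two_sided_p(k, n, p=0.5):
    """Exact two-sided binomial p-value: total probability of all
    outcomes that are no more likely than the observed count k."""
    pmf = [comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(n + 1)]
    # The small relative tolerance guards against floating-point ties.
    return min(1.0, sum(q for q in pmf if q <= pmf[k] * (1 + 1e-9)))

def bunching_check(x, c, h):
    """Count observations in [c - h, c) and [c, c + h), then test
    whether the split is compatible with a 50/50 allocation."""
    below = sum(1 for xi in x if c - h <= xi < c)
    above = sum(1 for xi in x if c <= xi < c + h)
    return below, above, binomial_two_sided_p(above, below + above)

c, h = 50.0, 5.0  # hypothetical cutoff and window half-width

# Hypothetical data with no sorting: 5 units on each side of the cutoff.
balanced = [float(v) for v in range(45, 55)]
balanced_result = bunching_check(balanced, c, h)

# Hypothetical data with bunching just above the cutoff: 2 below, 18 above.
bunched = [48.0, 49.0] + [50.0 + 0.25 * i for i in range(18)]
bunched_result = bunching_check(bunched, c, h)

print(balanced_result)
print(bunched_result)
```

A small p-value in the second case would be a warning sign of manipulation. The result depends on the chosen window, so in practice it is worth trying several widths and pairing the test with a histogram of the assignment variable.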

General Model

\[ Y_i = \beta_0 + f(x_i) \beta_1 + [I(x_i \ge c)]\beta_2 + \epsilon_i \]

where \(f(x_i)\) is any functional form of \(x_i\)

Simple case

When \(f(x_i) = x_i\) (linear function)

\[ Y_i = \beta_0 + x_i \beta_1 + [I(x_i \ge c)]\beta_2 + \epsilon_i \]

RD identifies \(\beta_2\), the causal effect of the treatment on \(Y\) at the cutoff point.
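As a minimal sketch of the simple case, the code below estimates \(\beta_2\) by ordinary least squares on the three regressors (intercept, \(x_i\), and \(I(x_i \ge c)\)). The data are hypothetical and noiseless, generated with a true jump of 2.0 at \(c = 5\). The helpers `solve` and `ols` are illustrative names written only so that the example needs no external library.

```python
def solve(A, b):
    """Solve A z = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(n):
            if r != col:
                factor = M[r][col] / M[col][col]
                M[r] = [a - factor * p for a, p in zip(M[r], M[col])]
    return [M[i][n] / M[i][i] for i in range(n)]

def ols(X, y):
    """OLS coefficients from the normal equations (X'X) b = X'y."""
    k = len(X[0])
    XtX = [[sum(row[i] * row[j] for row in X) for j in range(k)] for i in range(k)]
    Xty = [sum(row[i] * yi for row, yi in zip(X, y)) for i in range(k)]
    return solve(XtX, Xty)

# Hypothetical noiseless data: Y = 1 + 0.5 x + 2 * I(x >= c), with c = 5.
c = 5.0
x = [i * 0.5 for i in range(21)]  # 0.0, 0.5, ..., 10.0
y = [1.0 + 0.5 * xi + 2.0 * (xi >= c) for xi in x]

# Design matrix columns: intercept, x_i, I(x_i >= c).
X = [[1.0, xi, 1.0 if xi >= c else 0.0] for xi in x]
beta0, beta1, beta2 = ols(X, y)

print(round(beta0, 6), round(beta1, 6), round(beta2, 6))
```

Because the slope is the same on both sides in this specification, \(\beta_2\) is the vertical shift of the regression line, which equals the jump at the cutoff.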

In practice, the most common specification is

\[ Y_i = \alpha_0 + f(x_i) \alpha_1 + [I(x_i \ge c)]\alpha_2 + [f(x_i) \times I(x_i \ge c)]\alpha_3 + u_i \]

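where we estimate a different slope on each side of the cutoff. For \(\alpha_2\) to measure the jump at the cutoff in this specification, \(x_i\) should be centered at \(c\) (i.e., use \(x_i - c\) in place of \(x_i\)).

If the estimate of \(\alpha_3\) is not statistically different from 0, the model reduces to the common-slope specification (the simple case when \(f\) is linear).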
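A minimal sketch of the separate-slopes specification, assuming a linear \(f\) and a forcing variable centered at the cutoff: under those assumptions the interacted regression gives the same fit as two separate simple regressions, one on each side, so the jump is the difference in intercepts and the interaction coefficient is the difference in slopes. The data are hypothetical and noiseless (slope 0.5 below the cutoff, 1.5 at or above it, jump of 2.0), and `fit_line` is an illustrative helper name.

```python
def fit_line(xs, ys):
    """Closed-form simple linear regression; returns (intercept, slope)."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((xi - mx) ** 2 for xi in xs)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(xs, ys))
    slope = sxy / sxx
    return my - slope * mx, slope

c = 5.0
x = [i * 0.5 for i in range(21)]  # 0.0, 0.5, ..., 10.0

# Hypothetical noiseless outcome: jump of 2.0 and slope change of 1.0 at c.
y = [1.0 + 0.5 * (xi - c) + (2.0 + 1.0 * (xi - c)) * (xi >= c) for xi in x]

# Center the forcing variable so each intercept is the fitted value at c.
xc = [xi - c for xi in x]
left = [(a, b) for a, b in zip(xc, y) if a < 0]
right = [(a, b) for a, b in zip(xc, y) if a >= 0]

a_left, s_left = fit_line(*zip(*left))
a_right, s_right = fit_line(*zip(*right))

alpha2 = a_right - a_left  # discontinuity in the outcome at the cutoff
alpha3 = s_right - s_left  # difference in slopes across the cutoff

print(round(alpha2, 6), round(alpha3, 6))
```

Without the centering step, the intercepts would be evaluated at \(x = 0\) rather than at \(c\), and their difference would no longer be the jump at the cutoff.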


  • With sparse data, the estimated \(\alpha_3\) can be large (an apparent differential slope) purely because of noise.

  • Readers tend to be skeptical of a complex \(f(x_i)\); simple functional forms (e.g., linear or quadratic) usually suffice. If more flexibility is needed, non-parametric estimation is likely the better option.

Bandwidth around \(c\) (window)

  • A narrower window around \(c\) gives lower bias, but also lower efficiency (fewer observations, higher variance).

  • A wider window around \(c\) increases bias, but gives higher efficiency.

  • Optimal bandwidth selection is controversial, but a research article is usually expected to report it anyway, typically in the appendix.

  • We can either

    • drop observations outside of the bandwidth, or

    • weight observations by their distance from \(c\), so that observations closer to the cutoff receive more weight.
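The two options above correspond to two kernel choices, sketched below under illustrative assumptions. A uniform kernel gives weight 1 inside the bandwidth and 0 outside, which amounts to dropping observations. A triangular kernel is one common way to let weights decline linearly with distance from \(c\). The function names, the bandwidth `h = 2`, and the sample points are hypothetical.

```python
def uniform_weight(x, c, h):
    """Weight 1 inside the window [c - h, c + h], 0 outside (drop)."""
    return 1.0 if abs(x - c) <= h else 0.0

def triangular_weight(x, c, h):
    """Weight falls linearly from 1 at the cutoff to 0 at distance h."""
    return max(0.0, 1.0 - abs(x - c) / h)

c, h = 5.0, 2.0  # hypothetical cutoff and bandwidth
xs = [2.0, 3.5, 5.0, 6.0, 7.5]

uniform = [uniform_weight(x, c, h) for x in xs]
triangular = [triangular_weight(x, c, h) for x in xs]

print(uniform)     # [0.0, 1.0, 1.0, 1.0, 0.0]
print(triangular)  # [0.0, 0.25, 1.0, 0.5, 0.0]
```

Either set of weights can then be passed to a weighted least-squares fit of the RD regression. Re-running the estimate for several values of `h` is a simple way to show how sensitive the result is to the bandwidth.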