24.10 Evaluation of an RD
Evidence to look for (from either formal tests or graphs):
Treatment and outcomes change discontinuously at the cutoff, while other variables and pre-treatment outcomes do not.
No manipulation of the assignment variable.
Results are robust to various functional forms of the forcing variable.
Is there any other (unobserved) confound that could cause the discontinuous change at the cutoff (i.e., multiple forcing variables / bundling of institutions)?
External validity: How likely is the result at the cutoff to generalize?
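The no-manipulation item in the checklist above can be probed with a crude count comparison: if units cannot sort around the cutoff, roughly as many observations should fall just below it as just above it. The sketch below runs an exact binomial test on those two counts. It is a simplified illustration, not a formal density test such as McCrary's; the function names, the window width `h`, and the simulated data are all assumptions introduced here.

```python
import random
from math import comb


def binom_two_sided_p(k, n):
    """Exact two-sided p-value for H0: p = 0.5, given k successes in n trials."""
    if n == 0:
        return 1.0
    lo = min(k, n - k)
    # Lower-tail probability of the smaller count; doubled by symmetry at p = 0.5.
    tail = sum(comb(n, i) for i in range(lo + 1)) / 2 ** n
    return min(1.0, 2 * tail)


def count_check(xs, cutoff, h):
    """Compare counts in [cutoff - h, cutoff) and [cutoff, cutoff + h]."""
    left = sum(1 for x in xs if cutoff - h <= x < cutoff)
    right = sum(1 for x in xs if cutoff <= x <= cutoff + h)
    return left, right, binom_two_sided_p(right, left + right)


# Hypothetical data: a running variable with no sorting around the cutoff ...
rng = random.Random(3)
clean = [rng.uniform(-1, 1) for _ in range(2000)]
# ... and a manipulated copy where units just below 0 push themselves above it.
manipulated = [x + 0.05 if -0.05 <= x < 0 else x for x in clean]

for label, data in [("clean", clean), ("manipulated", manipulated)]:
    left, right, p = count_check(data, cutoff=0.0, h=0.1)
    print(f"{label}: left={left}, right={right}, p-value={p:.4g}")
```

A small p-value only flags an imbalance in counts; it says nothing about why the imbalance arises, so a histogram of the assignment variable is still worth inspecting.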
General Model
\[ Y_i = \beta_0 + f(x_i) \beta_1 + [I(x_i \ge c)]\beta_2 + \epsilon_i \]
where \(f(x_i)\) is any functional form of \(x_i\)
Simple case
When \(f(x_i) = x_i\) (linear function)
\[ Y_i = \beta_0 + x_i \beta_1 + [I(x_i \ge c)]\beta_2 + \epsilon_i \]
RD gives you \(\beta_2\), the causal effect of the treatment on \(Y\) at the cutoff point.
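As a minimal sketch of the simple case, the snippet below simulates data with a known jump at the cutoff and recovers \(\beta_2\) by ordinary least squares. The data-generating process, the parameter values, and the helper names `ols`, `simulate`, and `fit_linear_rd` are assumptions made for illustration only.

```python
import random


def ols(X, y):
    """Solve the normal equations (X'X) b = X'y by Gauss-Jordan elimination."""
    k = len(X[0])
    # Augmented matrix [X'X | X'y].
    A = [[sum(r[i] * r[j] for r in X) for j in range(k)]
         + [sum(r[i] * yi for r, yi in zip(X, y))] for i in range(k)]
    for col in range(k):
        # Partial pivoting for numerical stability.
        piv = max(range(col, k), key=lambda r: abs(A[r][col]))
        A[col], A[piv] = A[piv], A[col]
        for r in range(k):
            if r != col:
                f = A[r][col] / A[col][col]
                A[r] = [a - f * b for a, b in zip(A[r], A[col])]
    return [A[i][k] / A[i][i] for i in range(k)]


def simulate(n=500, cutoff=0.0, jump=2.0, noise_sd=0.5, seed=1):
    """Hypothetical DGP: Y = 1 + 0.8 x + jump * I(x >= cutoff) + noise."""
    rng = random.Random(seed)
    xs = [rng.uniform(-1, 1) for _ in range(n)]
    ys = [1.0 + 0.8 * x + jump * (1.0 if x >= cutoff else 0.0)
          + rng.gauss(0, noise_sd) for x in xs]
    return xs, ys


def fit_linear_rd(xs, ys, cutoff):
    """Regress Y on [1, x, I(x >= cutoff)]; returns [beta_0, beta_1, beta_2]."""
    X = [[1.0, x, 1.0 if x >= cutoff else 0.0] for x in xs]
    return ols(X, ys)


xs, ys = simulate()
beta0, beta1, beta2 = fit_linear_rd(xs, ys, cutoff=0.0)
print(f"estimated jump at the cutoff (beta_2): {beta2:.3f}  (true jump: 2.0)")
```

In applied work a regression library would also report standard errors; the hand-rolled solver here only keeps the example dependency-free.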
In practice, most researchers estimate
\[ Y_i = \alpha_0 + f(x_i) \alpha_1 + [I(x_i \ge c)]\alpha_2 + [f(x_i) \times I(x_i \ge c)]\alpha_3 + u_i \]
where we estimate a different slope on each side of the cutoff
and, if the estimate of \(\alpha_3\) is not distinguishable from 0, we return to the simple case
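The interacted specification can be sketched as follows. One practical detail: \(\alpha_2\) measures the jump at the cutoff only when the running variable is centered, i.e., \(f(x_i) = x_i - c\); otherwise it measures the gap between the two fitted lines extrapolated to \(x = 0\). The simulated data, the parameter values, and the helper names below are assumptions for illustration.

```python
import random


def ols(X, y):
    """Solve the normal equations (X'X) b = X'y by Gauss-Jordan elimination."""
    k = len(X[0])
    A = [[sum(r[i] * r[j] for r in X) for j in range(k)]
         + [sum(r[i] * yi for r, yi in zip(X, y))] for i in range(k)]
    for col in range(k):
        piv = max(range(col, k), key=lambda r: abs(A[r][col]))
        A[col], A[piv] = A[piv], A[col]
        for r in range(k):
            if r != col:
                f = A[r][col] / A[col][col]
                A[r] = [a - f * b for a, b in zip(A[r], A[col])]
    return [A[i][k] / A[i][i] for i in range(k)]


def simulate(n=600, cutoff=0.5, jump=2.0, noise_sd=0.3, seed=7):
    """Hypothetical DGP with slope 0.5 below the cutoff and 0.5 + 1.5 above it."""
    rng = random.Random(seed)
    xs = [rng.uniform(0, 1) for _ in range(n)]
    ys = []
    for x in xs:
        d = 1.0 if x >= cutoff else 0.0
        xc = x - cutoff
        ys.append(1.0 + 0.5 * xc + jump * d + 1.5 * xc * d
                  + rng.gauss(0, noise_sd))
    return xs, ys


def fit_interacted_rd(xs, ys, cutoff):
    """Regress Y on [1, x - c, D, (x - c) * D]; returns [a0, a1, a2, a3]."""
    X = []
    for x in xs:
        d = 1.0 if x >= cutoff else 0.0
        xc = x - cutoff  # centering makes a2 the jump exactly at the cutoff
        X.append([1.0, xc, d, xc * d])
    return ols(X, ys)


xs, ys = simulate()
a0, a1, a2, a3 = fit_interacted_rd(xs, ys, cutoff=0.5)
print(f"jump at the cutoff (alpha_2):  {a2:.3f}  (true: 2.0)")
print(f"slope difference (alpha_3):    {a3:.3f}  (true: 1.5)")
```

Fitting this model is numerically equivalent to running two separate linear regressions, one on each side of the cutoff, and differencing their intercepts at \(c\).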
Notes:
Sparse data can make the estimate of \(\alpha_3\) look like a large differential effect.
People are very skeptical of a complex \(f(x_i)\); the usual simple functional forms (e.g., linear, squared term) should be good enough. If you still insist on more flexibility, non-parametric estimation may be your best bet.
Bandwidth around \(c\) (window)
A narrower window around \(c\) gives lower bias, but also lower efficiency.
A wider window around \(c\) can increase bias, but gives higher efficiency.
The choice of optimal bandwidth is very controversial, but we usually have to report it in the appendix of a research article anyway.
We can either
drop observations outside of the bandwidth, or
weight observations by how close they are to \(c\)
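Both options can be sketched with a weighted local linear regression: a uniform kernel simply drops observations outside the window, while a triangular kernel also down-weights observations farther from \(c\). The example below assumes a hypothetical data-generating process with a cubic term, so that a linear fit is increasingly misspecified as the window widens. The function names, the kernel choices, and the bandwidth values are illustrative assumptions, not a recommended procedure for choosing a bandwidth.

```python
import random


def wls(X, y, w):
    """Weighted least squares: solve (X'WX) b = X'Wy by Gauss-Jordan elimination."""
    k = len(X[0])
    A = [[sum(wi * r[i] * r[j] for r, wi in zip(X, w)) for j in range(k)]
         + [sum(wi * r[i] * yi for r, yi, wi in zip(X, y, w))] for i in range(k)]
    for col in range(k):
        piv = max(range(col, k), key=lambda r: abs(A[r][col]))
        A[col], A[piv] = A[piv], A[col]
        for r in range(k):
            if r != col:
                f = A[r][col] / A[col][col]
                A[r] = [a - f * b for a, b in zip(A[r], A[col])]
    return [A[i][k] / A[i][i] for i in range(k)]


def simulate(n=2000, cutoff=0.0, jump=2.0, noise_sd=0.5, seed=11):
    """Hypothetical DGP: Y = 1 + 0.5 x + x^3 + jump * I(x >= cutoff) + noise."""
    rng = random.Random(seed)
    xs = [rng.uniform(-1, 1) for _ in range(n)]
    ys = [1.0 + 0.5 * x + x ** 3 + jump * (1.0 if x >= cutoff else 0.0)
          + rng.gauss(0, noise_sd) for x in xs]
    return xs, ys


def local_linear_rd(xs, ys, cutoff, h, kernel="triangular"):
    """Estimate the jump using only observations within distance h of the cutoff."""
    X, y, w = [], [], []
    for x, yi in zip(xs, ys):
        dist = abs(x - cutoff)
        if dist >= h:
            continue  # option 1: drop observations outside the bandwidth
        d = 1.0 if x >= cutoff else 0.0
        xc = x - cutoff
        X.append([1.0, xc, d, xc * d])
        y.append(yi)
        # option 2: triangular weights shrink linearly with distance from c
        w.append(1.0 - dist / h if kernel == "triangular" else 1.0)
    coef = wls(X, y, w)
    return coef[2], len(y)  # jump estimate and number of observations used


xs, ys = simulate()
for h in (0.1, 0.25, 0.5, 1.0):
    for kernel in ("uniform", "triangular"):
        est, n_used = local_linear_rd(xs, ys, 0.0, h, kernel)
        print(f"h={h:<5} kernel={kernel:<10} n={n_used:<5} jump={est:.3f}")
```

With noisy data, the narrow windows use few observations and give unstable estimates, while the wide windows use more data but inherit bias from the curvature that the linear fit ignores. Reporting estimates across several bandwidths, as in the loop above, is one simple way to show how sensitive the result is to this choice.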