24.2 Specification Checks
24.2.1 Balance Checks
Also known as checking for Discontinuities in Average Covariates
Null Hypothesis: The average treatment effect on pseudo-outcomes (i.e., covariates that qualitatively cannot be affected by the treatment) is 0.
If this hypothesis is rejected, you had better have a good reason why, because it can cast serious doubt on your RD design.
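A balance check can be implemented by re-running the RD estimation with a predetermined covariate in place of the outcome. A minimal sketch using the `rdrobust` package (the covariate `age` and all data here are simulated purely for illustration):

```r
# Balance check: estimate the RD "effect" on a predetermined covariate.
# Assumes the rdrobust package; the covariate (age) is simulated.
library(rdrobust)

set.seed(123)
x   <- runif(1000, -1, 1)           # running variable, cutoff at 0
age <- 40 + 5 * x + rnorm(1000)     # predetermined covariate: no true jump

bal <- rdrobust(y = age, x = x, c = 0)
summary(bal)
# The estimated discontinuity should be indistinguishable from 0;
# a significant jump casts doubt on the design.
```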
24.2.2 Sorting/Bunching/Manipulation
Also known as checking for A Discontinuity in the Distribution of the Forcing Variable
Also known as clustering or density test
Formal test is McCrary sorting test (McCrary 2008) or (Cattaneo, Idrobo, and Titiunik 2019)
Since human subjects can manipulate the running variable to land just above or below the cutoff (assuming the running variable is manipulable), especially when the cutoff point is known in advance to all subjects, this can result in a discontinuity in the distribution of the running variable at the cutoff (i.e., we will see “bunching” behavior right before or after the cutoff).
If treatment is desirable, people will try to sort into it, thinning out the density of the running variable just below the threshold.
If treatment is undesirable, people will try to sort out of it.
McCrary (2008) proposes a density test (i.e., a formal test for manipulation of the assignment variable).
\(H_0\): The continuity of the density of the running variable (i.e., the covariate that underlies the assignment at the discontinuity point)
\(H_a\): A jump in the density function at that point
Even though continuity of the running variable’s density at the cutoff is not a formal requirement, a discontinuity there can suggest manipulation.
J. L. Zhang and Rubin (2003), Lee (2009), and Aronow, Baron, and Pinson (2019) offer guidance on when manipulation should be a concern.
Usually it’s better to know your research design inside out so that you can suspect any manipulation attempts.
- We should think about the likely direction of the manipulation, which is typically one-way. In cases where manipulation runs both ways, the two directions would theoretically cancel each other out.
We could also observe partial manipulation in reality (e.g., when subjects can only imperfectly manipulate), but as long as we treat the design as a fuzzy RD, we typically will not have identification problems. Complete manipulation, however, leads to serious identification issues.
Remember: even when we fail to reject the null hypothesis of the density test, we cannot completely rule out an identification problem (just as with any other hypothesis test).
Bunching happens when people self-select to a specific value in the range of a variable (e.g., key policy thresholds).
Review paper (Kleven 2016)
This test can only detect manipulation that changes the distribution of the running variable. If the cutoff point can be chosen strategically, or if there is two-sided manipulation, the test will fail to detect it.
The histogram used in bunching analysis approximates a density curve (narrower bins are preferred, since wider bins bias elasticity estimates).
We can also use the bunching method to study individuals’ or firms’ responsiveness to changes in policy.
Under RD, we assume there is no manipulation of the running variable. However, bunching behavior is manipulation by firms or individuals, and thus violates this assumption.
Bunching methods can address this problem by estimating what the density of individuals would have been without manipulation (i.e., the manipulation-free counterfactual).
The fraction of people who manipulated is then calculated by comparing the observed distribution to the manipulation-free counterfactual distribution.
Under RD, we do not need this step because the observed and manipulation-free counterfactual distributions are assumed to be the same; RD assumes there is no manipulation (i.e., the observed distribution is the manipulation-free counterfactual).
When the running variable and the outcome variable are simultaneously determined, we can use a modified RD estimator to obtain a consistent estimate (Bajari et al. 2011).
Assumptions:
Manipulation is one-sided: people move in one direction only (i.e., either from below the threshold to above it, or vice versa, but not both), which is similar to the monotonicity assumption under instrumental variable 31.1.3.1
Manipulation is bounded (also known as the regularity assumption): so that we can use people far away from the threshold to derive our counterfactual distribution (Blomquist et al. 2021; Bertanha, McCallum, and Seegert 2021)
Steps:
- Identify the window in which the running variable exhibits bunching behavior. This step can be done empirically following Bosch, Dekker, and Strohmaier (2020). An additional robustness test is needed (i.e., varying the manipulation window).
- Estimate the manipulation-free counterfactual
- Calculate standard errors for inference, following (Chetty, Hendren, and Katz 2016), by bootstrapping: re-sampling residuals in the estimation of the counts of individuals within bins (large data can render this step unnecessary).
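The steps above can be sketched in base R. This is a minimal illustration under assumed conditions: simulated data with a threshold at 10, one-sided manipulation from just above the threshold, and an exclusion window chosen by eye; a real application would select the window empirically and bootstrap the standard errors as described above.

```r
# Minimal bunching sketch: fit a counterfactual density excluding the
# manipulation window, then measure excess mass at the threshold.
set.seed(1)
z <- rnorm(10000, mean = 10, sd = 2)
# one-sided manipulation: those just above the threshold move below it
z[z >= 10 & z < 11] <- 9.9

bw     <- 0.25
breaks <- seq(0, 20, by = bw)
cnt    <- hist(z, breaks = breaks, plot = FALSE)$counts
mid    <- head(breaks, -1) + bw / 2          # bin midpoints

excl <- mid > 9 & mid < 11                   # window excluded from the fit
fit  <- lm(cnt ~ poly(mid, 5), subset = !excl)   # counterfactual counts
cf   <- predict(fit, newdata = data.frame(mid = mid))

# excess mass: observed minus counterfactual counts just below the threshold
excess <- sum((cnt - cf)[mid > 9 & mid < 10])
```

Here `excess` approximates the number of manipulators; dividing by the counterfactual mass in the window yields the usual normalized excess-mass statistic.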
If we pass the bunching test, we can move on to the Placebo Test
McCrary (2008) test
A jump in the density at the threshold (i.e., a discontinuity) can serve as evidence of sorting around the cutoff point
library(rdd)
# you only need the running variable and the cutoff point
# Example by the package's authors

# No discontinuity
x <- runif(1000, -1, 1)
DCdensity(x, 0)
#> [1] 0.6126802

# Discontinuity
x <- runif(1000, -1, 1)
x <- x + 2 * (runif(1000, -1, 1) > 0 & x < 0)
DCdensity(x, 0)
#> [1] 0.0008519227
Cattaneo, Idrobo, and Titiunik (2019) test
library(rddensity)
# Example by the package's authors
# Continuous Density
set.seed(1)
x <- rnorm(2000, mean = -0.5)
rdd <- rddensity(X = x, vce = "jackknife")
summary(rdd)
#>
#> Manipulation testing using local polynomial density estimation.
#>
#> Number of obs = 2000
#> Model = unrestricted
#> Kernel = triangular
#> BW method = estimated
#> VCE method = jackknife
#>
#> c = 0 Left of c Right of c
#> Number of obs 1376 624
#> Eff. Number of obs 354 345
#> Order est. (p) 2 2
#> Order bias (q) 3 3
#> BW est. (h) 0.514 0.609
#>
#> Method T P > |T|
#> Robust -0.6798 0.4966
#>
#>
#> P-values of binomial tests (H0: p=0.5).
#>
#> Window Length / 2 <c >=c P>|T|
#> 0.036 28 20 0.3123
#> 0.072 46 39 0.5154
#> 0.107 68 59 0.4779
#> 0.143 94 79 0.2871
#> 0.179 122 103 0.2301
#> 0.215 145 130 0.3986
#> 0.250 163 156 0.7370
#> 0.286 190 176 0.4969
#> 0.322 214 200 0.5229
#> 0.358 249 218 0.1650
# you have to specify your own plot (read package manual)
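For a quick visual companion to the test above, the same package provides `rdplotdensity`, which plots the estimated density on each side of the cutoff with confidence bands. A sketch reusing the simulated data from the example:

```r
# Plot the estimated densities from the rddensity test above.
# Assumes the rddensity package; data are the same simulated example.
library(rddensity)

set.seed(1)
x   <- rnorm(2000, mean = -0.5)
rdd <- rddensity(X = x, vce = "jackknife")
plt <- rdplotdensity(rdd, x)   # densities on each side of the cutoff, with CIs
```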
24.2.3 Placebo Tests
Also known as Discontinuities in Average Outcomes at Other Values
We should not see any jumps at other values of the running variable (either \(X_i < c\) or \(X_i \ge c\))
- Use the same bandwidth you use for the cutoff, and move it along the running variable: testing for a jump in the conditional mean of the outcome at the median of the running variable.
Also known as falsification checks
Before and after the cutoff point, we can run the placebo test to see whether the X’s are different.
The placebo test is where you expect your coefficients to be not different from 0.
This test can be used for
Testing no discontinuity in predetermined variables
Testing other discontinuities
Placebo outcomes: we should not see any changes in other outcomes that should not have been affected.
Inclusion and exclusion of covariates: RDD parameter estimates should not be sensitive to the inclusion or exclusion of other covariates.
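A sketch of the placebo-cutoff check using `rdrobust` (all data simulated; the artificial cutoffs at -0.5 and 0.5 are arbitrary choices, and each placebo is estimated on one side of the true cutoff only, to avoid contamination from the real jump):

```r
# Placebo cutoffs: no jump should be detectable away from the true cutoff.
# Assumes the rdrobust package; y and x are simulated, true jump of 1 at x = 0.
library(rdrobust)

set.seed(42)
x <- runif(1000, -1, 1)
y <- 2 * x + (x >= 0) + rnorm(1000)

for (c_pl in c(-0.5, 0.5)) {
  # restrict to one side of the true cutoff to avoid the real discontinuity
  side <- if (c_pl < 0) x < 0 else x >= 0
  est  <- rdrobust(y = y[side], x = x[side], c = c_pl)
  cat("placebo cutoff", c_pl, ": robust p-value =", est$pv["Robust", ], "\n")
}
# Large p-values are expected; significant placebo jumps are a red flag.
```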
This is analogous to Experimental Design, where we can only test whether the observables are similar in the treatment and control groups (if we reject this, then we don’t have random assignment); we cannot test unobservables.
Balance on observable characteristics on both sides
\[ Z_i = \alpha_0 + \alpha_1 f(x_i) + [I(x_i \ge c)] \alpha_2 + [f(x_i) \times I(x_i \ge c)]\alpha_3 + u_i \]
where
\(x_i\) is the running variable
\(Z_i\) is other characteristics of people (e.g., age, etc)
Theoretically, \(Z_i\) should not be affected by the treatment. Hence, \(E(\alpha_2) = 0\)
Moreover, when you have multiple \(Z_i\)’s, you typically have to simulate their joint distribution (to avoid finding significant coefficients by chance).
The only case in which you don’t need the joint distribution is when all the \(Z_i\)’s are independent (unlikely in reality).
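The balance regression above can be run directly with `lm`. A minimal base-R sketch with simulated data, taking \(f(x)\) to be a quadratic and the cutoff \(c = 0\):

```r
# Balance regression for a single predetermined covariate Z (simulated data;
# f(x) assumed quadratic, cutoff c = 0).
set.seed(7)
x <- runif(1000, -1, 1)
D <- as.numeric(x >= 0)                  # I(x_i >= c)
Z <- 50 + 3 * x - 2 * x^2 + rnorm(1000)  # predetermined: no true jump

m <- lm(Z ~ poly(x, 2, raw = TRUE) * D)  # includes f(x), D, and f(x):D
summary(m)$coefficients["D", ]           # alpha_2: should be close to 0
# With many Z's, test them jointly (e.g., by permutation/simulation)
# rather than one t-test at a time, to avoid chance "significance".
```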
Under RD, you shouldn’t have to use any Matching Methods: just as with random assignment, there is no need to create a balanced dataset before and after the cutoff. If you have to balance, then your RD assumptions are probably wrong in the first place.
24.2.4 Sensitivity to Bandwidth Choice
Methods for bandwidth selection
Ad-hoc or substantively driven
Data driven: cross validation
Conservative approach: (Calonico, Cattaneo, and Farrell 2020)
The objective is to minimize the mean squared error between the estimated and actual treatment effects.
Then, we need to see how sensitive our results are to the choice of bandwidth.
In some cases, the optimal bandwidth for testing covariate balance may not be the optimal bandwidth for estimating the treatment effect, but it is usually close.
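A sketch of a bandwidth sensitivity check with `rdrobust` (simulated data; the grid of bandwidths is an arbitrary illustration around plausible values):

```r
# Bandwidth sensitivity: re-estimate the effect over a grid of bandwidths.
# Assumes the rdrobust package; simulated data, true jump of 0.5 at x = 0.
library(rdrobust)

set.seed(99)
x <- runif(1000, -1, 1)
y <- x + 0.5 * (x >= 0) + rnorm(1000, sd = 0.5)

for (h in c(0.1, 0.2, 0.3, 0.4)) {
  est <- rdrobust(y = y, x = x, c = 0, h = h)
  cat("h =", h,
      " estimate =", round(est$coef["Conventional", ], 3),
      " robust p =", round(est$pv["Robust", ], 3), "\n")
}
# Point estimates should be stable across reasonable bandwidths.
```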