24.2 Specification Checks

  1. Balance Checks
  2. Sorting/Bunching/Manipulation
  3. Placebo Tests
  4. Sensitivity to Bandwidth Choice

24.2.1 Balance Checks

  • Also known as checking for Discontinuities in Average Covariates

  • Null Hypothesis: The average effect of covariates on pseudo outcomes (i.e., those qualitatively cannot be affected by the treatment) is 0.

  • If this hypothesis is rejected, you better have a good reason to why because it can cast serious doubt on your RD design.

24.2.2 Sorting/Bunching/Manipulation

  • Also known as checking for A Discontinuity in the Distribution of the Forcing Variable

  • Also known as clustering or density test

  • Formal test is McCrary sorting test (McCrary 2008) or (Cattaneo, Idrobo, and Titiunik 2019)

  • Since human subjects can manipulate the running variable to be just above or below the cutoff (assuming that the running variable is manipulable), especially when the cutoff point is known in advance for all subjects, this can result in a discontinuity in the distribution of the running variable at the cutoff (i.e., we will see “bunching” behavior right before or after the cutoff)>

    • People would like to sort into treatment if it’s desirable. The density of the running variable would be 0 just below the threshold

    • People would like to be out of treatment if it’s undesirable

  • (McCrary 2008) proposes a density test (i.e., a formal test for manipulation of the assignment variable).

    • \(H_0\): The continuity of the density of the running variable (i.e., the covariate that underlies the assignment at the discontinuity point)

    • \(H_a\): A jump in the density function at that point

    • Even though it’s not a requirement that the density of the running must be continuous at the cutoff, but a discontinuity can suggest manipulations.

  • (Zhang and Rubin 2003; Lee 2009; Aronow, Baron, and Pinson 2019) offers a guide to know when you should warrant the manipulation

  • Usually it’s better to know your research design inside out so that you can suspect any manipulation attempts.

    • We would suspect the direction of the manipulation. And typically, it’s one-way manipulation. In cases where we might have both ways, theoretically they would cancel each other out.
  • We could also observe partial manipulation in reality (e.g., when subjects can only imperfectly manipulate). But typically, as we treat it like fuzzy RD, we would not have identification problems. But complete manipulation would lead to serious identification issues.

  • Remember: even in cases where we fail to reject the null hypothesis for the density test, we could not rule out completely that identification problem exists (just like any other hypotheses)

  • Bunching happens when people self-select to a specific value in the range of a variable (e.g., key policy thresholds).

  • Review paper (Kleven 2016)

  • This test can only detect manipulation that changes the distribution of the running variable. If you can choose the cutoff point or you have 2-sided manipulation, this test will fail to detect it.

  • Histogram in bunching is similar to a density curve (we want narrower bins, wider bins bias elasticity estimates)

  • We can also use bunching method to study individuals’ or firm’s responsiveness to changes in policy.

  • Under RD, we assume that we don’t have any manipulation in the running variable. However, bunching behavior is a manipulation by firms or individuals. Thus, violating this assumption.

    • Bunching can fix this problem by estimating what densities of individuals would have been without manipulation (i.e., manipulation-free counterfactual).

    • The fraction of persons who manipulated is then calculated by comparing the observed distribution to manipulation-free counterfactual distributions.

    • Under RD, we do not need this step because the observed and manipulation-free counterfactual distributions are assumed to be the same. RD assume there is no manipulation (i.e., assume the manipulation-free counterfactual distribution)

When running variable and outcome variable are simultaneously determined, we can use a modified RDD estimator to have consistent estimate. (Bajari et al. 2011)

  • Assumptions:

    • Manipulation is one-sided: People move one way (i.e., either below the threshold to above the threshold or vice versa, but not to or away the threshold), which is similar to the monotonicity assumption under instrumental variable

    • Manipulation is bounded (also known as regularity assumption): so that we can use people far away from this threshold to derive at our counterfactual distribution [Blomquist et al. (2021)](Bertanha, McCallum, and Seegert 2021)


  1. Identify the window in which the running variable contains bunching behavior. We can do this step empirically based on Bosch, Dekker, and Strohmaier (2020). Additionally robustness test is needed (i.e., varying the manipulation window).
  2. Estimate the manipulation-free counterfactual
  3. Calculating the standard errors for inference can follow (Chetty, Hendren, and Katz 2016) where we bootstrap re-sampling residuals in the estimation of the counts of individuals within bins (large data can render this step unnecessary).

If we pass the bunching test, we can move on to the Placebo Test

McCrary (2008) test

A jump in the density at the threshold (i.e., discontinuity) hold can serve as evidence for sorting around the cutoff point


# you only need the runing variable and the cutoff point

# Example by the package's authors
#No discontinuity

#> [1] 0.6126802


#> [1] 0.0008519227

Cattaneo, Idrobo, and Titiunik (2019) test


# Example by the package's authors
# Continuous Density
x <- rnorm(2000, mean = -0.5)
rdd <- rddensity(X = x, vce = "jackknife")
#> Manipulation testing using local polynomial density estimation.
#> Number of obs =       2000
#> Model =               unrestricted
#> Kernel =              triangular
#> BW method =           estimated
#> VCE method =          jackknife
#> c = 0                 Left of c           Right of c          
#> Number of obs         1376                624                 
#> Eff. Number of obs    354                 345                 
#> Order est. (p)        2                   2                   
#> Order bias (q)        3                   3                   
#> BW est. (h)           0.514               0.609               
#> Method                T                   P > |T|             
#> Robust                -0.6798             0.4966              
#> P-values of binomial tests (H0: p=0.5).
#> Window Length / 2          <c     >=c    P>|T|
#> 0.036                      28      20    0.3123
#> 0.072                      46      39    0.5154
#> 0.107                      68      59    0.4779
#> 0.143                      94      79    0.2871
#> 0.179                     122     103    0.2301
#> 0.215                     145     130    0.3986
#> 0.250                     163     156    0.7370
#> 0.286                     190     176    0.4969
#> 0.322                     214     200    0.5229
#> 0.358                     249     218    0.1650

# you have to specify your own plot (read package manual)

24.2.3 Placebo Tests

  • Also known as Discontinuities in Average Outcomes at Other Values

  • We should not see any jumps at other values (either \(X_i <c\) or \(X_i \ge c\))

    • Use the same bandwidth you use for the cutoff, and move it along the running variable: testing for a jump in the conditional mean of the outcome at the median of the running variable.
  • Also known as falsification checks

  • Before and after the cutoff point, we can run the placebo test to see whether X’s are different).

  • The placebo test is where you expect your coefficients to be not different from 0.

  • This test can be used for

    • Testing no discontinuity in predetermined variables:

    • Testing other discontinuities

    • Placebo outcomes: we should see any changes in other outcomes that shouldn’t have changed.

    • Inclusion and exclusion of covariates: RDD parameter estimates should not be sensitive to the inclusion or exclusion of other covariates.

  • This is analogous to Experimental Design where we cannot only test whether the observables are similar in both treatment and control groups (if we reject this, then we don’t have random assignment), but we cannot test unobservables.

Balance on observable characteristics on both sides

\[ Z_i = \alpha_0 + \alpha_1 f(x_i) + [I(x_i \ge c)] \alpha_2 + [f(x_i) \times I(x_i \ge c)]\alpha_3 + u_i \]


  • \(x_i\) is the running variable

  • \(Z_i\) is other characteristics of people (e.g., age, etc)

Theoretically, \(Z_i\) should no be affected by treatment. Hence, \(E(\alpha_2) = 0\)

Moreover, when you have multiple \(Z_i\), you typically have to simulate joint distribution (to avoid having significant coefficient based on chance).

The only way that you don’t need to generate joint distribution is when all \(Z_i\)’s are independent (unlikely in reality).

Under RD, you shouldn’t have to do any Matching Methods. Because just like when you have random assignment, there is no need to make balanced dataset before and after the cutoff. If you have to do balancing, then your RD assumptions are probably wrong in the first place.

24.2.4 Sensitivity to Bandwidth Choice

  • Methods for bandwidth selection

  • The objective is to minimize the mean squared error between the estimated and actual treatment effects.

  • Then, we need to see how sensitive our results will be dependent on the choice of bandwidth.

  • In some cases, the best bandwidth for testing covariates may not be the best bandwidth for treating them, but it may be close.

# find optimal bandwidth by Imbens-Kalyanaraman
                 cutpoint = "",
                 kernel = "triangular") # can also pick other kernels


Aronow, Peter M, Jonathon Baron, and Lauren Pinson. 2019. “A Note on Dropping Experimental Subjects Who Fail a Manipulation Check.” Political Analysis 27 (4): 572–89.
Bajari, Patrick, Han Hong, Minjung Park, and Robert Town. 2011. “Regression Discontinuity Designs with an Endogenous Forcing Variable and an Application to Contracting in Health Care.” National Bureau of Economic Research.
Bertanha, Marinho, Andrew H McCallum, and Nathan Seegert. 2021. “Better Bunching, Nicer Notching.” arXiv Preprint arXiv:2101.01170.
Blomquist, Sören, Whitney K Newey, Anil Kumar, and Che-Yuan Liang. 2021. “On Bunching and Identification of the Taxable Income Elasticity.” Journal of Political Economy 129 (8): 2320–43.
Bosch, Nicole, Vincent Dekker, and Kristina Strohmaier. 2020. “A Data-Driven Procedure to Determine the Bunching Window: An Application to the Netherlands.” International Tax and Public Finance 27: 951–79.
Calonico, Sebastian, Matias D Cattaneo, and Max H Farrell. 2020. “Optimal Bandwidth Choice for Robust Bias-Corrected Inference in Regression Discontinuity Designs.” The Econometrics Journal 23 (2): 192–210.
Cattaneo, Matias D, Nicolás Idrobo, and Rocı́o Titiunik. 2019. A Practical Introduction to Regression Discontinuity Designs: Foundations. Cambridge University Press.
Chetty, Raj, Nathaniel Hendren, and Lawrence F Katz. 2016. “The Effects of Exposure to Better Neighborhoods on Children: New Evidence from the Moving to Opportunity Experiment.” American Economic Review 106 (4): 855–902.
Kleven, Henrik Jacobsen. 2016. “Bunching.” Annual Review of Economics 8: 435–64.
Lee, David S. 2009. “Training, Wages, and Sample Selection: Estimating Sharp Bounds on Treatment Effects.” The Review of Economic Studies, 1071–1102.
McCrary, Justin. 2008. “Manipulation of the Running Variable in the Regression Discontinuity Design: A Density Test.” Journal of Econometrics 142 (2): 698–714.
Zhang, Junni L, and Donald B Rubin. 2003. “Estimation of Causal Effects via Principal Stratification When Some Outcomes Are Truncated by ‘Death’.” Journal of Educational and Behavioral Statistics 28 (4): 353–68.