27.10 Multi-Score Regression Discontinuity Design
The Multi-Score Regression Discontinuity Design extends the standard single-score RD and the multi-cutoff RD by introducing multiple running variables that simultaneously determine treatment eligibility. Instead of relying on a single threshold for assignment, treatment now depends on a combination of multiple continuous scores crossing predetermined cutoffs.
Multi-score RD is relevant when policy eligibility is based on multiple criteria, such as:
- Education: Honors program admission based on both math and English scores.
- Healthcare: Medical trial eligibility based on both BMI and blood pressure levels.
- Taxation: Tax incentives based on income level and household size.
27.10.1 General Framework
Each individual \(i\) has:
- Two running variables, \(X_{1i}\) and \(X_{2i}\).
- Two predetermined cutoffs, \(C_1\) and \(C_2\).
- A binary treatment indicator \(D_i\), equal to 1 when both of the individual's scores meet or exceed their thresholds and 0 otherwise.
The treatment effect is defined as:
\[ \tau (x_1, x_2) = E[Y_{1i} - Y_{0i} | X_{1i} = x_1, X_{2i} = x_2]. \]
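This is the conditional average treatment effect at the score pair \((x_1, x_2)\). As in any sharp RD, it is identified only at points on the treatment boundary, where treated and control units can be found arbitrarily close together, so it is a local effect in a two-dimensional RD setting.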
27.10.2 Identification
Under the potential outcomes framework, for each individual \(i\), we define:
- \(Y_{1i}\): Potential outcome under treatment.
- \(Y_{0i}\): Potential outcome under control.
- \(D_i\): Treatment indicator, determined by the assignment rule given after the observed-outcome equation.
The observed outcome is:
\[ Y_i = D_i Y_{1i} + (1 - D_i) Y_{0i}. \]
The treatment assignment mechanism follows:
\[ D_i = \begin{cases} 1 & \text{if } X_{1i} \geq C_1 \text{ and } X_{2i} \geq C_2, \\ 0 & \text{otherwise}. \end{cases} \]
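The assignment rule and the observed-outcome equation can be illustrated with a minimal sketch. The function names, the cutoff values of 70, and the four student score pairs are hypothetical and chosen only for illustration.

```python
def assign_treatment(x1, x2, c1, c2):
    """Return 1 if both scores meet or exceed their cutoffs, else 0."""
    return int(x1 >= c1 and x2 >= c2)


def observed_outcome(d, y1, y0):
    """Switching equation: Y = D * Y1 + (1 - D) * Y0."""
    return d * y1 + (1 - d) * y0


# Hypothetical honors-program example: cutoffs of 70 on math and English.
C1, C2 = 70.0, 70.0
students = [(85.0, 72.0), (85.0, 65.0), (60.0, 90.0), (70.0, 70.0)]
treated = [assign_treatment(m, e, C1, C2) for m, e in students]
print(treated)  # [1, 0, 0, 1]
```

Only students who clear both cutoffs are treated; a very high score on one dimension does not compensate for missing the other.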
27.10.3 Key Assumptions
To ensure valid causal inference, the multi-score RD framework extends the standard RD assumptions:
- Continuity of Potential Outcomes in Both Running Variables
  - The expected potential outcomes \(E[Y_{0i} | X_{1i}, X_{2i}]\) and \(E[Y_{1i} | X_{1i}, X_{2i}]\) are continuous in both \(X_{1i}\) and \(X_{2i}\) near the cutoffs.
  - Formally, with \(^-\) and \(^+\) denoting limits taken through the control region and the treated region, respectively: \[ \lim_{(x_1, x_2) \to (C_1, C_2)^-} E[Y_{0i} | X_{1i} = x_1, X_{2i} = x_2] = \lim_{(x_1, x_2) \to (C_1, C_2)^+} E[Y_{0i} | X_{1i} = x_1, X_{2i} = x_2]. \] The analogous condition is required for \(E[Y_{1i} | X_{1i}, X_{2i}]\).
  - Ensures that any observed discontinuity in \(E[Y_i | X_{1i}, X_{2i}]\) is attributable to treatment.
- No Manipulation of Running Variables
  - The density of \((X_{1i}, X_{2i})\) must be continuous at \((C_1, C_2)\) and along the rest of the treatment boundary.
  - Agents should not be able to precisely manipulate either score in order to cross into the treated region.
- Local Randomization
  - Near \((C_1, C_2)\), units are as good as randomly assigned to treatment or control.
- No Interaction Effects in Running Variables (optional)
  - In some models, we assume that the effect of crossing \(C_1\) does not depend on the value of \(X_{2i}\), and vice versa.
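If these assumptions hold, the treatment effect at the cutoff point is identified as the discontinuity in \(E[Y_i | X_{1i}, X_{2i}]\) at \((C_1, C_2)\). The same argument applies at any other point on the treatment boundary, that is, where one score equals its cutoff and the other is at or above its own.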
27.10.4 Estimation Approaches
27.10.4.1 Local Linear Regression in Two Dimensions
The simplest approach is to fit separate local linear regressions to the control and treated observations within a bandwidth of the cutoff point:
For control observations (\(D_i = 0\)) near \((C_1, C_2)\):
\[ Y_i = \alpha + \beta_1 (X_{1i} - C_1) + \beta_2 (X_{2i} - C_2) + \epsilon_i. \]
For treated observations (\(D_i = 1\)) near \((C_1, C_2)\):
\[ Y_i = \gamma + \delta_1 (X_{1i} - C_1) + \delta_2 (X_{2i} - C_2) + \nu_i. \]
Because the running variables are centered at the cutoffs, each intercept is the fitted value at \((C_1, C_2)\), and the treatment effect \(\tau\) is estimated as the difference between them:
\[ \hat{\tau} = \hat{\gamma} - \hat{\alpha} = \hat{E}[Y | X_1 = C_1^+, X_2 = C_2^+] - \hat{E}[Y | X_1 = C_1^-, X_2 = C_2^-]. \]
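A minimal sketch of this estimator follows, using only the Python standard library. It assumes a rectangular window of half-width `h` around the cutoff point (a uniform kernel), fits one linear regression to the control observations and one to the treated observations, and takes the difference in fitted values at \((C_1, C_2)\). The helper names, the simulated data, and the true jump of 2.0 are illustrative assumptions, not part of the text.

```python
import random


def solve_linear_system(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for k in range(col, n + 1):
                m[r][k] -= factor * m[col][k]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(m[i][k] * x[k] for k in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x


def ols(rows, y):
    """Ordinary least squares via the normal equations (X'X) b = X'y."""
    p = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(p)] for i in range(p)]
    xty = [sum(r[i] * yi for r, yi in zip(rows, y)) for i in range(p)]
    return solve_linear_system(xtx, xty)


def local_linear_tau(data, c1, c2, h):
    """Difference in fitted values at (c1, c2) between the D=1 and D=0 fits."""
    intercepts = {}
    for group in (0, 1):
        rows, y = [], []
        for x1, x2, yi in data:
            d = int(x1 >= c1 and x2 >= c2)
            in_window = abs(x1 - c1) <= h and abs(x2 - c2) <= h
            if d == group and in_window:
                # Centering at the cutoffs makes the intercept the fit at (c1, c2).
                rows.append([1.0, x1 - c1, x2 - c2])
                y.append(yi)
        intercepts[group] = ols(rows, y)[0]
    return intercepts[1] - intercepts[0]


# Simulated data (assumption): linear outcome with a true jump of 2.0 at (0, 0).
rng = random.Random(0)
data = []
for _ in range(2000):
    x1, x2 = rng.uniform(-1, 1), rng.uniform(-1, 1)
    d = int(x1 >= 0 and x2 >= 0)
    y = 1.0 + 0.5 * x1 + 0.3 * x2 + 2.0 * d + rng.gauss(0, 0.1)
    data.append((x1, x2, y))

tau_hat = local_linear_tau(data, c1=0.0, c2=0.0, h=0.5)
print(round(tau_hat, 2))
```

In this simulation the outcome really is linear in both scores, so the estimate should land close to the true jump. With real data the window width and the functional form both matter, and a dedicated RD package would normally be preferred over a hand-rolled solver.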
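This approach assumes local linearity. Higher-order polynomials can be used instead, for example in a pooled specification in which the polynomial terms are common to both sides and \(\tau\) is the coefficient on \(D_i\):
\[ Y_i = \alpha + \sum_{k=1}^{K} \beta_k (X_{1i} - C_1)^k + \sum_{k=1}^{K} \theta_k (X_{2i} - C_2)^k + \tau D_i + \epsilon_i. \]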
27.10.4.2 Kernel-Weighted Estimation
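A more flexible approach estimates \(\tau(x_1, x_2)\) nonparametrically, as the difference between kernel-weighted (local constant) means of the treated and control outcomes around a boundary point \((x_1, x_2)\):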
\[ \hat{\tau}(x_1, x_2) = \frac{\sum_{i=1}^{n} K_h (X_{1i} - x_1) K_h (X_{2i} - x_2) D_i Y_i}{\sum_{i=1}^{n} K_h (X_{1i} - x_1) K_h (X_{2i} - x_2) D_i} - \frac{\sum_{i=1}^{n} K_h (X_{1i} - x_1) K_h (X_{2i} - x_2) (1 - D_i) Y_i}{\sum_{i=1}^{n} K_h (X_{1i} - x_1) K_h (X_{2i} - x_2) (1 - D_i)}. \]
where:
- \(K_h(u) = K(u/h)/h\) is a scaled kernel function (e.g., Epanechnikov).
- \(h\) is the bandwidth, selected, for example, via cross-validation.
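The kernel-weighted formula can be sketched as follows, assuming a product Epanechnikov kernel with a common bandwidth `h` for both scores. The function names and the toy grid data are hypothetical. The \(1/h\) scaling of \(K_h\) cancels between numerator and denominator, so the code uses the unscaled kernel.

```python
def epanechnikov(u):
    """Epanechnikov kernel: 0.75 * (1 - u^2) on [-1, 1], zero elsewhere."""
    return 0.75 * (1.0 - u * u) if abs(u) <= 1.0 else 0.0


def kernel_tau(data, x1_0, x2_0, c1, c2, h):
    """Kernel-weighted treated mean minus control mean at (x1_0, x2_0)."""
    num = {0: 0.0, 1: 0.0}
    den = {0: 0.0, 1: 0.0}
    for x1, x2, y in data:
        d = int(x1 >= c1 and x2 >= c2)
        # Product kernel; the 1/h factors cancel in each ratio.
        w = epanechnikov((x1 - x1_0) / h) * epanechnikov((x2 - x2_0) / h)
        num[d] += w * y
        den[d] += w
    if den[0] == 0.0 or den[1] == 0.0:
        raise ValueError("no observations with positive weight on one side")
    return num[1] / den[1] - num[0] / den[0]


# Toy data (assumption): a regular grid with outcome 3 if treated, 1 otherwise.
grid = [-0.9 + 0.2 * i for i in range(10)]
data = []
for x1 in grid:
    for x2 in grid:
        d = int(x1 >= 0.0 and x2 >= 0.0)
        data.append((x1, x2, 3.0 if d else 1.0))

tau_hat = kernel_tau(data, x1_0=0.0, x2_0=0.0, c1=0.0, c2=0.0, h=0.5)
print(round(tau_hat, 6))
```

The toy outcome is flat within each group, so the difference in weighted means recovers the jump. When the outcome slopes in either score, a local constant estimator of this kind is biased at the boundary, which is one reason local linear fits are usually preferred in RD settings.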
27.10.4.3 Interaction Model for Heterogeneous Effects
To assess interaction effects between running variables, estimate:
\[ Y_i = \alpha + \beta_1 (X_{1i} - C_1) + \beta_2 (X_{2i} - C_2) + \tau D_i + \lambda D_i (X_{1i} - C_1)(X_{2i} - C_2) + \epsilon_i. \]
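- \(\tau\) is the treatment effect at the cutoff point, where both centered scores equal zero.
- \(\lambda\) captures how the treatment effect changes with the product of the centered scores, that is, whether it depends jointly on \(X_1\) and \(X_2\).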
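As a rough illustration, the interaction specification can be fit by ordinary least squares on simulated data. Everything below is an assumption made for the sketch: the helper functions, the cutoffs at zero, and the data-generating values \(\tau = 2.0\) and \(\lambda = 1.5\).

```python
import random


def solve_linear_system(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for k in range(col, n + 1):
                m[r][k] -= factor * m[col][k]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(m[i][k] * x[k] for k in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x


def ols(rows, y):
    """Ordinary least squares via the normal equations (X'X) b = X'y."""
    p = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(p)] for i in range(p)]
    xty = [sum(r[i] * yi for r, yi in zip(rows, y)) for i in range(p)]
    return solve_linear_system(xtx, xty)


# Simulated data (assumption): cutoffs at 0, so the scores are already centered.
rng = random.Random(1)
rows, y = [], []
for _ in range(2000):
    x1, x2 = rng.uniform(-1, 1), rng.uniform(-1, 1)
    d = int(x1 >= 0.0 and x2 >= 0.0)
    outcome = (1.0 + 0.5 * x1 + 0.3 * x2 + 2.0 * d
               + 1.5 * d * x1 * x2 + rng.gauss(0, 0.1))
    # Columns: intercept, centered X1, centered X2, D, D * X1 * X2.
    rows.append([1.0, x1, x2, float(d), d * x1 * x2])
    y.append(outcome)

alpha, beta1, beta2, tau_hat, lambda_hat = ols(rows, y)
print(round(tau_hat, 2), round(lambda_hat, 2))
```

A coefficient on the interaction term that is clearly different from zero would suggest that the effect is larger (or smaller) for units far above both cutoffs than for units at the corner.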
27.10.5 Robustness Checks
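- Covariate Balance in Both Dimensions
  - Test whether pre-treatment covariates jump at \((C_1, C_2)\); they should not if the design is valid.
- Density Test in Two Dimensions
  - Verify that the density of \((X_1, X_2)\) is smooth at \((C_1, C_2)\). The McCrary test is defined for a single running variable, so in practice it is applied to each score separately or to a scalar summary such as the distance to the treatment boundary.
- Placebo Cutoffs
  - Implement fake cutoffs where no treatment change occurs and re-estimate \(\tau(x_1, x_2)\); the estimates should be close to zero.
- Varying Bandwidths
  - Re-estimate using different bandwidths and check that the estimates are stable.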
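The bandwidth and placebo checks can be sketched together. This is only an illustration under stated assumptions: simulated data with a true jump of 2.0 at cutoffs \((0, 0)\), a rectangular window, hypothetical helper names, and a placebo cutoff at \((0.5, 0.5)\) whose window lies entirely inside the treated region so that no real treatment change occurs within it.

```python
import random


def solve_linear_system(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for k in range(col, n + 1):
                m[r][k] -= factor * m[col][k]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(m[i][k] * x[k] for k in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x


def ols(rows, y):
    """Ordinary least squares via the normal equations (X'X) b = X'y."""
    p = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(p)] for i in range(p)]
    xty = [sum(r[i] * yi for r, yi in zip(rows, y)) for i in range(p)]
    return solve_linear_system(xtx, xty)


def local_linear_tau(data, c1, c2, h):
    """Jump at (c1, c2), with groups defined by the cutoffs passed in."""
    intercepts = {}
    for group in (0, 1):
        rows, y = [], []
        for x1, x2, yi in data:
            d = int(x1 >= c1 and x2 >= c2)
            if d == group and abs(x1 - c1) <= h and abs(x2 - c2) <= h:
                rows.append([1.0, x1 - c1, x2 - c2])
                y.append(yi)
        intercepts[group] = ols(rows, y)[0]
    return intercepts[1] - intercepts[0]


# Simulated data (assumption): true cutoffs at (0, 0), true jump of 2.0.
rng = random.Random(2)
data = []
for _ in range(4000):
    x1, x2 = rng.uniform(-1, 1), rng.uniform(-1, 1)
    d = int(x1 >= 0 and x2 >= 0)
    data.append((x1, x2, 1.0 + 0.5 * x1 + 0.3 * x2 + 2.0 * d + rng.gauss(0, 0.1)))

# Varying bandwidths: estimates at the true cutoffs should be stable.
estimates = {h: local_linear_tau(data, 0.0, 0.0, h) for h in (0.3, 0.5, 0.8)}

# Placebo cutoff: the window [0.1, 0.9]^2 contains only treated units,
# so the estimated "jump" at (0.5, 0.5) should be close to zero.
placebo = local_linear_tau(data, 0.5, 0.5, 0.4)

for h, est in estimates.items():
    print(h, round(est, 2))
print("placebo", round(placebo, 2))
```

Estimates that swing widely across bandwidths, or a placebo estimate far from zero, would be a warning sign about functional form or about the design itself.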