24.11 Applications
Examples in marketing:
(Hartmann, Nair, and Narayanan 2011): nonparametric estimation and guide to identifying causal marketing mix effects
Packages in R (see (Thoemmes, Liao, and Jin 2017) for detailed comparisons); all can handle both sharp and fuzzy RD:

- rdd
- rdrobust: estimation, inference, and plots
- rddensity: discontinuity-in-density tests (sorting/bunching/manipulation) using local polynomials and a binomial test
- rdlocrand: covariate balance, binomial tests, window selection
- rdmulti: multiple cutoffs and multiple scores
- rdpower: power and sample selection
- rddtools
| Package | rdd | rdrobust | rddtools |
|---|---|---|---|
| Coefficient estimator | Local linear regression | Local polynomial regression | Local polynomial regression |
| Bandwidth selectors | (G. Imbens and Kalyanaraman 2012) | (Calonico, Cattaneo, and Farrell 2020) | (G. Imbens and Kalyanaraman 2012) |
| Kernel functions | Epanechnikov, Gaussian | Epanechnikov | Gaussian |
| Bias correction | | Local polynomial regression | |
| Assumption tests | McCrary sorting | | McCrary sorting; equality of covariate distributions and means |
| Covariate options | Include | Include | Include residuals |

Based on Table 1 in (Thoemmes, Liao, and Jin 2017), p. 347.
24.11.1 Example 1
Example by Leihua Ye
\[ Y_i = \beta_0 + \beta_1 X_i + \beta_2 W_i + u_i \]
\[ X_i = \begin{cases} 1, & W_i \ge c \\ 0, & W_i < c \end{cases} \]
# cutoff point = 3.5
GPA <- runif(1000, 0, 4)
future_success <- 10 + 2 * GPA + 10 * (GPA >= 3.5) + rnorm(1000)
# install and load the package 'rddtools'
# install.packages("rddtools")
library(rddtools)
# declare the RD design: outcome, running variable, and cutoff
data <- rdd_data(future_success, GPA, cutpoint = 3.5)
# plot the dataset
plot(
data,
col = "red",
cex = 0.1,
xlab = "GPA",
ylab = "future_success"
)
# estimate the sharp RDD model
rdd_mod <- rdd_reg_lm(rdd_object = data, slope = "same")
summary(rdd_mod)
#>
#> Call:
#> lm(formula = y ~ ., data = dat_step1, weights = weights)
#>
#> Residuals:
#> Min 1Q Median 3Q Max
#> -2.90364 -0.70348 0.00278 0.66828 3.00603
#>
#> Coefficients:
#> Estimate Std. Error t value Pr(>|t|)
#> (Intercept) 16.90704 0.06637 254.75 <2e-16 ***
#> D 10.09058 0.11063 91.21 <2e-16 ***
#> x 1.97078 0.03281 60.06 <2e-16 ***
#> ---
#> Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
#>
#> Residual standard error: 0.9908 on 997 degrees of freedom
#> Multiple R-squared: 0.9654, Adjusted R-squared: 0.9654
#> F-statistic: 1.392e+04 on 2 and 997 DF, p-value: < 2.2e-16
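The same jump can be estimated with base R's `lm()` alone, which makes explicit what `rdd_reg_lm(slope = "same")` is fitting: a treatment dummy plus the running variable centered at the cutoff. This is an illustrative sketch; the seed is my own, so the numbers differ slightly from the output above:

```r
# Sharp RD with a common slope on both sides, base R only
set.seed(123)  # seed chosen for reproducibility (not from the original example)
GPA <- runif(1000, 0, 4)
future_success <- 10 + 2 * GPA + 10 * (GPA >= 3.5) + rnorm(1000)

cutoff <- 3.5
D <- as.numeric(GPA >= cutoff)  # treatment indicator
x <- GPA - cutoff               # running variable, centered at the cutoff

fit <- lm(future_success ~ D + x)
coef(fit)["D"]  # estimated jump at the cutoff; the true effect is 10
```

The coefficient on `D` is the discontinuity estimate, and the coefficient on `x` recovers the slope of the running variable.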
24.11.2 Example 2
Bowblis and Smith (2021)
Occupational licensing can either increase or decrease market efficiency:
More information means more efficiency
Increased entry barriers (i.e., friction) decrease efficiency
Components of RD
- Running variable: facility size (number of beds)
- Cutoff: 120 beds or above
- Treatment: facilities at or above the cutoff must take the treatment.
Under OLS
\[ Y_i = \alpha_0 + X_i \alpha_1 + LW_i \alpha_2 + \epsilon_i \]
where
\(Y_i\) = quality of service
\(LW_i\) = licensed/certified workers (as a fraction of staff at each center)
Bias in \(\alpha_2\):
Mitigation-based: terrible quality can lead to more hiring of licensed workers, which negatively biases \(\alpha_2\)
Preference-based: places with high-quality staff want to keep high-quality staff.
Under RD
\[ \begin{aligned} Y_{ist} &= \beta_0 + [I(Bed \ge121)_{ist}]\beta_1 + f(Size_{ist}) \beta_2\\ &+ [f(Size_{ist}) \times I(Bed \ge 121)_{ist}] \beta_3 \\ &+ X_{it} \delta + \gamma_s + \theta_t + \epsilon_{ist} \end{aligned} \]
where
\(s\) = state
\(t\) = year
\(i\) = hospital
This RD is fuzzy
If, right near the threshold (within the bandwidth), states sort differently (i.e., non-randomly), then we would need a fixed effect for state \(s\). But in that case the RD assumption is violated anyway, so we would not run the RD in the first place.
Technically, we could also run the fixed-effects regression, but since it sits lower in the causal-inference hierarchy, we don't.
Moreover, in the RD framework we don't include periods \(t\) before treatment (whereas in the FE framework we have to include both before and after).
If we include a fixed effect \(\pi_i\) for each hospital, then we have no variation left for the causal estimate (because hardly any hospital changes its bed size within the panel).
Here \(\beta_1\) is the intent-to-treat effect, because actually receiving the treatment does not coincide with crossing the cutoff.
You cannot simply drop the fuzzy (non-complying) cases, because doing so introduces selection bias.
Note that we cannot drop cases based on a behavioral choice (that would exclude non-compliers), but we can drop cases driven by particular behaviors (e.g., people like round numbers).
Thus, we have to use an instrumental variable (Section 33.1.3.1)
Stage 1:
\[ \begin{aligned} QSW_{ist} &= \alpha_0 + [I(Bed \ge121)_{ist}]\alpha_1 + f(Size_{ist}) \alpha_2\\ &+ [f(Size_{ist}) \times I(Bed \ge 121)_{ist}] \alpha_3 \\ &+ X_{it} \delta + \gamma_s + \theta_t + \epsilon_{ist} \end{aligned} \]
(Note: the fixed effects and error term, \(\delta, \gamma_s, \theta_t, \epsilon_{ist}\), should carry different symbols from those in the first equation, but I ran out of Greek letters)
Stage 2:
\[ \begin{aligned} Y_{ist} &= \gamma_0 + \gamma_1 \hat{QSW}_{ist} + f(Size_{ist}) \delta_2 \\ &+ [f(Size_{ist}) \times I(Bed \ge 121)] \delta_3 \\ &+ X_{it} \lambda + \eta_s + \tau_t + u_{ist} \end{aligned} \]
The bigger the jump (discontinuity), the more similar the two coefficients (\(\gamma_1 \approx \beta_1\)), where \(\gamma_1\) is the average treatment effect (of exposure to the policy).
\(\beta_1\) will always be closer to 0 than \(\gamma_1\).
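A small simulation (base R; all names and parameter values are illustrative, not from the paper) shows this attenuation: with imperfect compliance at the cutoff, the reduced-form jump \(\beta_1\) equals the treatment effect scaled down by the first-stage jump, while 2SLS recovers the full effect \(\gamma_1\):

```r
set.seed(42)
n <- 5000
size  <- runif(n, 80, 160)        # running variable (beds), illustrative
above <- as.numeric(size >= 121)  # crossing the cutoff
# Fuzzy design: crossing the cutoff raises the probability of treatment
# (e.g., employing a qualified social worker) from 0.2 to 0.8
treat <- rbinom(n, 1, 0.2 + 0.6 * above)
y <- 1 + 2 * treat + 0.01 * size + rnorm(n)  # true treatment effect = 2

s <- size - 121
beta1 <- coef(lm(y ~ above + s))["above"]  # reduced form: intent to treat

# Manual 2SLS: replace treatment with its first-stage fitted values
treat_hat <- fitted(lm(treat ~ above + s))
gamma1 <- coef(lm(y ~ treat_hat + s))["treat_hat"]

round(c(ITT = unname(beta1), TSLS = unname(gamma1)), 2)
# ITT is about 2 * 0.6 = 1.2, closer to zero than the 2SLS estimate of about 2
```

In practice one would use a 2SLS routine with corrected standard errors (e.g., `ivreg`); the manual second stage here only illustrates the point estimates.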
Figure 1 shows bunching at every multiple of 5 beds, and 120 still stands out.
If the bunching were due to manipulation, there should be a corresponding decrease at 130.
Since we have a limited number of mass points (at the round numbers), we should cluster standard errors by mass point.
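A crude heaping check can be done in base R by comparing the share of observations sitting on round numbers with what a smooth density would imply (the formal local-polynomial density test is implemented in the `rddensity` package; the data below are simulated for illustration, not from the paper):

```r
set.seed(7)
# Simulated bed counts heaped at multiples of 5, mimicking the
# round-number bunching discussed above (illustrative, not real data)
sizes <- 100:140
beds <- sample(sizes, 3000, replace = TRUE,
               prob = ifelse(sizes %% 5 == 0, 3, 1))

# Under a uniform density the share on multiples of 5 would be 9/41 (~0.22);
# heaping pushes it well above that
round_share <- mean(beds %% 5 == 0)
round_share
```

If the excess mass appears at all round numbers rather than only at the policy cutoff, that pattern is consistent with round-number heaping rather than manipulation of the treatment threshold.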
24.11.3 Example 3
Replication of (Carpenter and Dobkin 2009) by Philipp Leppert, dataset from here
24.11.4 Example 4
For a detailed application, see (Thoemmes, Liao, and Jin 2017), where they use rdd, rdrobust, and rddtools.