27.2 Model Estimation Strategies
Under regression discontinuity framework, researchers can choose between parametric and nonparametric models. The choice depends on assumptions about functional form, data availability, and the trade-off between flexibility and interpretability.
27.2.1 Parametric Models: Polynomial Regression
Parametric models, such as linear regression, assume a specific functional form for the relationship between the dependent variable and predictors. One way to relax the strict linearity assumption is by incorporating polynomial functions of the forcing variable (Lee and Lemieux 2010). The choice of polynomial degree should be determined based on data characteristics.
However, using high-order polynomials comes with challenges. Gelman and Imbens (2019) highlight three key issues associated with global high-degree polynomials:
- Imprecise Estimates Due to Noise: Higher-degree polynomials can overfit, capturing noise instead of meaningful patterns.
- Sensitivity to Polynomial Degree: Estimates can vary significantly depending on the chosen degree, making the model less stable.
- Inadequate Confidence Interval Coverage: Confidence intervals tend to be misleading when using high-degree polynomials, leading to incorrect inference.
For these reasons, researchers are often advised to avoid global high-order polynomials and instead rely on alternative approaches (i.e., nonparametric version).
27.2.2 Nonparametric Models: Local Regression
Nonparametric models, such as local polynomial regression, offer greater flexibility by avoiding strong assumptions about functional form. Instead of fitting a single equation to the entire dataset, these methods estimate relationships locally—often using linear or quadratic polynomials within a neighborhood of each data point.
- Uses weighted observations near the cutoff.
- More flexible than global polynomial regression.
Local regression methods, such as local linear regression, address the shortcomings of high-order global polynomials by:
- Reducing overfitting, as they adapt to local patterns without excessive complexity.
- Providing stable estimates, since results are less sensitive to arbitrary choices of polynomial degree.
- Improving inference, ensuring more reliable confidence intervals.
By balancing flexibility and robustness, local regression techniques are often preferred over global polynomial models in applied research.
Best Practice: Use multiple model specifications to check for consistency in results.