36.2 Endogenous Sample Selection
In observational studies, selection into treatment is rarely random, and the resulting selection bias is a major challenge in causal inference. Individuals often choose whether to participate in a treatment based on personal characteristics, external incentives, or underlying risk factors. This selection process introduces systematic pre-treatment differences between the treatment and control groups, so a naive comparison of their outcomes mixes the treatment effect with those pre-existing differences.
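One way to make this precise is the standard potential-outcomes decomposition of a naive comparison of group means. The notation is an assumption introduced here for illustration: $D$ is the treatment indicator, and $Y(1)$ and $Y(0)$ are the outcomes a unit would have with and without treatment.

$$
E[Y \mid D=1] - E[Y \mid D=0]
= \underbrace{E[Y(1) - Y(0) \mid D=1]}_{\text{effect on the treated}}
+ \underbrace{E[Y(0) \mid D=1] - E[Y(0) \mid D=0]}_{\text{selection bias}}
$$

The second term is zero under random assignment. When higher outcomes are better, it is negative if the treated would have fared worse than the untreated even without treatment, and positive if they would have fared better.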
Selection bias typically arises from two opposing sources:
36.2.1 Mitigation-Based Selection
- Individuals select into treatment to combat a problem they already face.
- This creates a negative selection bias: those who take treatment are systematically worse off at baseline than those who do not, so a naive comparison understates the benefit of treatment.
- Example:
  - People at high risk of severe illness (e.g., elderly or immunocompromised individuals) are more likely to get vaccinated. If we compare vaccinated and unvaccinated individuals without adjusting for risk factors, we might mistakenly conclude that vaccines are ineffective simply because vaccinated individuals were at higher risk to begin with.
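The vaccination example can be illustrated with a small simulation. This is a minimal sketch under made-up assumptions, not an analysis of real data: the risk shares, vaccination probabilities, baseline illness rates, and the "vaccine halves the risk" effect are all hypothetical numbers chosen for illustration, as are the function names `simulate_vaccination` and `illness_gap`.

```python
import random

def simulate_vaccination(n=200_000, seed=0):
    """Simulate (high_risk, vaccinated, ill) records under mitigation-based selection."""
    rng = random.Random(seed)
    rows = []
    for _ in range(n):
        high_risk = rng.random() < 0.30
        # Selection step (assumed): high-risk people are far more likely to vaccinate.
        vaccinated = rng.random() < (0.90 if high_risk else 0.20)
        # Assumed baseline risk of severe illness; the vaccine halves it for everyone.
        baseline = 0.30 if high_risk else 0.05
        p_ill = baseline * (0.5 if vaccinated else 1.0)
        ill = 1 if rng.random() < p_ill else 0
        rows.append((high_risk, vaccinated, ill))
    return rows

def illness_gap(rows):
    """Illness rate among the vaccinated minus the rate among the unvaccinated."""
    treated = [ill for _, vaccinated, ill in rows if vaccinated]
    control = [ill for _, vaccinated, ill in rows if not vaccinated]
    return sum(treated) / len(treated) - sum(control) / len(control)

rows = simulate_vaccination()

# Naive comparison: ignores who chose to vaccinate.
naive_gap = illness_gap(rows)

# Adjusted comparison: compare like with like inside each risk group.
gap_high_risk = illness_gap([r for r in rows if r[0]])
gap_low_risk = illness_gap([r for r in rows if not r[0]])

print(f"naive gap:            {naive_gap:+.3f}")
print(f"gap within high risk: {gap_high_risk:+.3f}")
print(f"gap within low risk:  {gap_low_risk:+.3f}")
```

Under these assumed parameters the naive gap comes out positive, so the vaccine appears to raise illness rates, while the gap is negative within each risk group. Stratifying on the risk factor is only one possible adjustment, and it works here because risk is the only driver of selection in the simulated data.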
36.2.2 Preference-Based Selection
- Individuals select into treatment because they inherently prefer it, rather than because of an underlying problem.
- This creates a positive selection bias: those who take treatment are systematically better off at baseline than those who do not, so a naive comparison overstates the benefit of treatment.
- Example:
  - People who are health-conscious and physically active are more likely to join a fitness program. If we compare fitness program participants to non-participants, we might falsely attribute their better health outcomes to the program, when in reality their pre-existing lifestyle accounts for much of the difference.
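The fitness-program example can be sketched the same way. Everything numeric below is a hypothetical assumption made for illustration: the health score scale, the share of health-conscious people, the joining probabilities, and a true program effect of 2 points. The adjustment shown is simple standardization, which averages the within-group gaps weighted by group share. It recovers the effect here only because health-consciousness is the sole source of selection in the simulated data.

```python
import random

TRUE_EFFECT = 2.0  # assumed gain in health score caused by the program

def simulate_fitness(n=100_000, seed=1):
    """Simulate (health_conscious, joined, score) under preference-based selection."""
    rng = random.Random(seed)
    rows = []
    for _ in range(n):
        health_conscious = rng.random() < 0.40
        # Selection step (assumed): health-conscious people are more likely to join.
        joined = rng.random() < (0.70 if health_conscious else 0.10)
        # Assumed baseline health score, higher for the health-conscious.
        baseline = 70.0 if health_conscious else 50.0
        score = baseline + (TRUE_EFFECT if joined else 0.0) + rng.gauss(0.0, 10.0)
        rows.append((health_conscious, joined, score))
    return rows

def mean_gap(rows):
    """Mean score of participants minus mean score of non-participants."""
    treated = [score for _, joined, score in rows if joined]
    control = [score for _, joined, score in rows if not joined]
    return sum(treated) / len(treated) - sum(control) / len(control)

rows = simulate_fitness()

# Naive comparison: participants vs. non-participants.
naive_gap = mean_gap(rows)

# Standardization: average the within-group gaps, weighted by group share.
adjusted_gap = 0.0
for flag in (True, False):
    stratum = [r for r in rows if r[0] == flag]
    adjusted_gap += len(stratum) / len(rows) * mean_gap(stratum)

print(f"true effect:  {TRUE_EFFECT:+.2f}")
print(f"naive gap:    {naive_gap:+.2f}")
print(f"adjusted gap: {adjusted_gap:+.2f}")
```

With these settings the naive gap is several times larger than the assumed true effect, while the standardized gap lands close to it. In real data the relevant preferences are often unobserved, so this kind of adjustment is not always available.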