30.6 Treatment Intensity
Two-Stage Least Squares (TSLS) can be used to estimate the average causal effect of a treatment whose intensity varies, in which case it “identifies a weighted average of per-unit treatment effects along the length of a causal response function” (J. D. Angrist and Imbens 1995, 431). Examples of variable treatment intensity include:
Drug dosage
Hours of exam prep on score (Powers and Swinton 1984)
Cigarette smoking on birth weights (Permutt and Hebel 1989)
Years of education
Class size on test score (J. D. Angrist and Lavy 1999)
Sibship size on earnings (Lavy, Angrist, and Schlosser 2006)
Social media adoption
The average causal effect here refers to the conditional expectation of the difference between the outcome under treatment and the outcome that would have occurred in the counterfactual world.
Notes:
- We do not need to assume that the relationships among the dependent variable, the treatment intensity, and the instruments are linear.
Example
In their original paper, J. D. Angrist and Imbens (1995) study the effect of schooling on earnings, using quarter of birth as the instrumental variable.
Each additional year of schooling can increase earnings, and the effect of each additional year can be heterogeneous: moving from grade 9 to grade 10 is qualitatively different from other grade transitions, and a student may also change schools along the way.
\[ Y = \gamma_0 + X_1 \gamma_1 + \rho S + \epsilon \]
where
\(S\) is years of schooling (i.e., endogenous regressor)
\(\rho\) is the return to a year of schooling
\(X_1\) is a matrix of exogenous covariates
Schooling can also be related to the exogenous covariates \(X_1\), which gives the first-stage equation:
\[ S = \delta_0 + X_1 \delta_1 + X_2 \delta_2 + \eta \]
where
\(X_2\) is an exogenous instrument
\(\delta_2\) is the coefficient of the instrument
By using only the fitted value \(\hat{S}\) from the first stage in the second stage, TSLS gives a consistent estimate of the effect of schooling on earnings:
\[ Y = \gamma_0 + X_1 \gamma_1 + \rho \hat{S} + \nu \]
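The two stages can be sketched in Python with NumPy. Everything in this sketch is an illustrative assumption rather than part of Angrist and Imbens (1995): the simulated data, the coefficient values, the binary instrument, the unobserved `ability` confounder, and the variable names are all hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 50_000

# Hypothetical data-generating process (all numbers are illustrative assumptions).
x1 = rng.normal(size=n)                          # exogenous covariate X_1
x2 = rng.integers(0, 2, size=n).astype(float)    # binary instrument X_2
ability = rng.normal(size=n)                     # unobserved confounder
s = 10 + 0.5 * x1 + 1.0 * x2 + ability + rng.normal(size=n)  # schooling S
rho_true = 0.10                                  # assumed return to schooling
y = 1.0 + 0.3 * x1 + rho_true * s + 0.5 * ability + rng.normal(size=n)


def ols(design, target):
    """Return least-squares coefficients for target regressed on design."""
    coef, *_ = np.linalg.lstsq(design, target, rcond=None)
    return coef


const = np.ones(n)

# First stage: regress S on a constant, X_1, and the instrument X_2.
first_stage_design = np.column_stack([const, x1, x2])
delta_hat = ols(first_stage_design, s)
s_hat = first_stage_design @ delta_hat           # fitted schooling

# Second stage: regress Y on a constant, X_1, and the fitted value of S.
second_stage_design = np.column_stack([const, x1, s_hat])
rho_tsls = ols(second_stage_design, y)[2]

# Naive OLS on observed S, for comparison (biased by the confounder).
rho_ols = ols(np.column_stack([const, x1, s]), y)[2]

print(f"true rho: {rho_true:.3f}")
print(f"TSLS rho: {rho_tsls:.3f}")
print(f"OLS rho:  {rho_ols:.3f}")
```

In this simulated setting the TSLS estimate should land close to the assumed `rho_true`, while naive OLS is pulled upward by the confounder. Running the two stages by hand reproduces the TSLS point estimate, but the standard errors from the second-stage regression are not valid, so a dedicated IV routine is preferable for inference.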
To give \(\rho\) a causal interpretation, two conditions are needed:
- First, SUTVA (the stable unit treatment value assumption) must hold: each person's potential outcomes depend only on that person's own years of schooling, not on the schooling of other people.
- Second, \(\rho\) must have a probability limit equal to a weighted average of \(E[Y_j - Y_{j-1}]\) for all \(j\), where \(Y_j\) is the potential outcome with \(j\) years of schooling.
Even though the first condition is not trivial, it rarely needs much defense in a research article. The second condition is harder to argue and applies only in certain cases.
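For reference, the weighted average takes an explicit form in the simplest case. The expression below is a sketch under stated assumptions: a single binary instrument \(Z\), no covariates, an instrument that is independent of the potential outcomes, and monotonicity (\(S_1 \ge S_0\) for everyone). The notation \(S_z\), for the schooling a person would obtain when \(Z = z\), and \(J\), for the maximum number of years of schooling, is introduced here for illustration and does not appear above.

\[ \frac{E[Y \mid Z = 1] - E[Y \mid Z = 0]}{E[S \mid Z = 1] - E[S \mid Z = 0]} = \sum_{j=1}^{J} \omega_j \, E[Y_j - Y_{j-1} \mid S_1 \ge j > S_0], \qquad \omega_j = \frac{P(S_1 \ge j > S_0)}{\sum_{i=1}^{J} P(S_1 \ge i > S_0)} \]

The weights \(\omega_j\) are non-negative and sum to one, so the estimand puts the most weight on the years of schooling where the instrument moves the most people across the threshold from fewer than \(j\) years to at least \(j\) years.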