30.1 Framework
\(D_i \sim \text{Bernoulli}\) binary (dummy) treatment
\(Y_{0i}, Y_{1i}\) potential outcomes
\(Y_i = Y_{0i} + (Y_{1i} - Y_{0i}) D_i\) observed outcome
\(Z_i \perp (Y_{0i}, Y_{1i})\) instrumental variable (independent of the potential outcomes, and also correlated with \(D_i\))
Under constant effects and linearity (\(Y_{1i} - Y_{0i}\) is the same for everyone)
\[ \begin{aligned} Y_{0i} &= \alpha + \eta_i \\ Y_{1i} - Y_{0i} &= \rho \\ Y_i &= Y_{0i} + D_i (Y_{1i} - Y_{0i}) \\ &= \alpha + \eta_i + D_i \rho \\ &= \alpha + \rho D_i + \eta_i \end{aligned} \]
where
\(\eta_i\) captures individual differences in the untreated outcome
\(\rho\) is the difference between the treated and untreated outcomes, assumed here to be constant across units
However, OLS is problematic because \(D_i\) is correlated with \(\eta_i\) (units select into treatment)
But \(Z_i\) can come to the rescue. Because \(Z_i\) is uncorrelated with \(\eta_i\), the causal effect can be written as follows (the last equality uses a binary \(Z_i\))
\[ \begin{aligned} \rho &= \frac{Cov( Y_i, Z_i)}{Cov(D_i, Z_i)} \\ &= \frac{Cov(Y_i, Z_i) / V(Z_i) }{Cov( D_i, Z_i) / V(Z_i)} = \frac{\text{Reduced form}}{\text{First stage}} \\ &= \frac{E[Y_i |Z_i = 1] - E[Y_i | Z_i = 0]}{E[D_i |Z_i = 1] - E[D_i | Z_i = 0 ]} \end{aligned} \]
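The ratio above can be checked on simulated data. The following is a minimal sketch, not a definitive implementation: the data-generating process (the coefficients `0.8` and `0.5`, the sample size, and the values `alpha = 1` and `rho = 2`) is entirely hypothetical and chosen only so that \(D_i\) is correlated with \(\eta_i\) while \(Z_i\) is not.

```python
import random

def cov(x, y):
    """Sample covariance of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    return sum((a - mx) * (b - my) for a, b in zip(x, y)) / (n - 1)

def mean_where(values, flags, level):
    """Mean of `values` over the units whose flag equals `level`."""
    selected = [v for v, f in zip(values, flags) if f == level]
    return sum(selected) / len(selected)

def simulate_constant_effect(n=50_000, alpha=1.0, rho=2.0, seed=0):
    """Hypothetical data-generating process with a constant effect rho."""
    rng = random.Random(seed)
    z, d, y = [], [], []
    for _ in range(n):
        eta = rng.gauss(0.0, 1.0)            # unobserved individual difference
        zi = 1 if rng.random() < 0.5 else 0  # instrument, drawn independently of eta
        # Take-up depends on eta (the source of OLS bias) and on Z (the first stage).
        di = 1 if 0.8 * eta + zi + rng.gauss(0.0, 1.0) > 0.5 else 0
        z.append(zi)
        d.append(di)
        y.append(alpha + rho * di + eta)     # Y_i = alpha + rho * D_i + eta_i
    return z, d, y

z, d, y = simulate_constant_effect()

ols = cov(y, d) / cov(d, d)   # slope from regressing Y on D
iv = cov(y, z) / cov(d, z)    # Cov(Y, Z) / Cov(D, Z)
wald = (mean_where(y, z, 1) - mean_where(y, z, 0)) / (
    mean_where(d, z, 1) - mean_where(d, z, 0)
)                             # reduced form / first stage

print(f"OLS slope:      {ols:.3f}")
print(f"IV (cov ratio): {iv:.3f}")
print(f"Wald estimate:  {wald:.3f}")
```

In this setup the OLS slope should overstate `rho` because units with high \(\eta_i\) are more likely to be treated, while the covariance ratio and the difference-in-means ratio coincide and should land near `rho`.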
Under heterogeneous treatment effects (\(Y_{1i} - Y_{0i}\) differs across units), we use the LATE framework
\(Y_i(d,z)\) denotes the potential outcome for unit \(i\) with treatment \(D_i = d\) and instrument \(Z_i = z\)
Observed treatment status
\[ D_i = D_{0i} + Z_i (D_{1i} - D_{0i}) \]
where
\(D_{1i}\) is the treatment status of unit \(i\) when \(Z_i = 1\)
\(D_{0i}\) is the treatment status of unit \(i\) when \(Z_i = 0\)
\(D_{1i} - D_{0i}\) is the causal effect of \(Z_i\) on \(D_i\)
Assumptions
Independence: The instrument is randomly assigned (i.e., independent of potential outcomes and potential treatments)
\([\{Y_i(d,z); \forall d, z \}, D_{1i}, D_{0i} ] \perp Z_i\)
This assumption lets the first-stage equation be interpreted as the average causal effect of \(Z_i\) on \(D_i\)
\[ \begin{aligned} E[D_i |Z_i = 1] - E[D_i | Z_i = 0] &= E[D_{1i} |Z_i = 1] - E[D_{0i} |Z_i = 0] \\ &= E[D_{1i} - D_{0i}] \end{aligned} \]
- This assumption is also sufficient for a causal interpretation of the reduced form, which captures the effect of the instrument on the outcome.
\[ E[Y_i |Z_i = 1 ] - E[Y_i|Z_i = 0] = E[Y_i (D_{1i}, Z_i = 1) - Y_i (D_{0i} , Z_i = 0)] \]
Exclusion (i.e., existence of instruments) (G. W. Imbens and Angrist 1994)
- The treatment \(D_i\) fully mediates the effect of \(Z_i\) on \(Y_i\)
\[ \begin{aligned} Y_{1i} &= Y_i (1,1) = Y_i (1,0) \\ Y_{0i} &= Y_i (0,1) = Y_i (0, 0) \end{aligned} \]
- With this assumption, the observed outcome \(Y_i\) can be thought of as (assume \(Y_{1i}, Y_{0i}\) already satisfy the independence assumption)
\[ \begin{aligned} Y_i &= Y_i (0, Z_i) + [Y_i (1 , Z_i) - Y_i (0, Z_i)] D_i \\ &= Y_{0i} + (Y_{1i} - Y_{0i} ) D_i \end{aligned} \]
- This assumption lets us go from reduced-form causal effects to treatment effects (J. D. Angrist and Imbens 1995)
Monotonicity: \(D_{1i} \ge D_{0i} \ \forall i\) (or the reverse inequality for all \(i\))
With this assumption, \(D_{1i} - D_{0i}\) can only equal 0 or 1, so \(E[D_{1i} - D_{0i} ] = P[D_{1i} > D_{0i}]\)
Together with a nonzero first stage, this means the first stage measures the proportion of the population whose treatment status \(D_i\) is changed by \(Z_i\)
This assumption rules out units that shift from participation to non-participation when the instrument is switched on (defiers).
Alternatively, one can solve the same problem by assuming constant (homogeneous) treatment effect (G. W. Imbens and Angrist 1994), but this is rather restrictive.
A third solution is to assume that there exists a value of the instrument at which the probability of participation is 0 (J. Angrist and Imbens 1991).
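To see what monotonicity buys, here is a sketch of the reduced form when defiers (units with \(D_{1i} < D_{0i}\)) are allowed, assuming independence and exclusion still hold:

\[ \begin{aligned} E[Y_i | Z_i = 1] - E[Y_i | Z_i = 0] &= E[(Y_{1i} - Y_{0i})(D_{1i} - D_{0i})] \\ &= E[Y_{1i} - Y_{0i} | D_{1i} > D_{0i}] P[D_{1i} > D_{0i}] \\ &\quad - E[Y_{1i} - Y_{0i} | D_{1i} < D_{0i}] P[D_{1i} < D_{0i}] \end{aligned} \]

Effects on defiers enter with a negative sign, so the reduced form can be zero or even negative when every individual effect is positive. Monotonicity sets \(P[D_{1i} < D_{0i}] = 0\) and removes the second term. Under a constant effect \(\rho\), the expression instead collapses to \(\rho \, E[D_{1i} - D_{0i}]\), which is why homogeneity is an alternative fix.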
With these three assumptions, we have the LATE theorem (J. D. Angrist and Pischke 2009, 4.4.1)
\[ \frac{E[Y_i | Z_i = 1] - E[Y_i | Z_i = 0]}{E[D_i |Z_i = 1] - E[D_i |Z_i = 0]} = E[Y_{1i} - Y_{0i} | D_{1i} > D_{0i}] \]
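The theorem can be illustrated with a small simulation. This is a minimal sketch under invented assumptions: the type shares (50% compliers, 20% always-takers, 30% never-takers), the average effects per type, and the noise scales are all hypothetical, and the instrument is a fair coin flip. Because the data are simulated, each unit's \(D_{0i}\), \(D_{1i}\), and individual effect are visible, which is never the case in real data.

```python
import random

def mean(values):
    """Arithmetic mean of an iterable."""
    values = list(values)
    return sum(values) / len(values)

def simulate_late(n=100_000, seed=1):
    """Hypothetical population with heterogeneous effects and no defiers."""
    rng = random.Random(seed)
    units = []
    for _ in range(n):
        u = rng.random()
        if u < 0.5:
            d0, d1, avg_effect = 0, 1, 1.0   # complier
        elif u < 0.7:
            d0, d1, avg_effect = 1, 1, 4.0   # always-taker
        else:
            d0, d1, avg_effect = 0, 0, 0.0   # never-taker
        z = 1 if rng.random() < 0.5 else 0   # randomly assigned instrument
        d = d0 + z * (d1 - d0)               # observed treatment status
        y0 = rng.gauss(0.0, 1.0) + d0        # always-takers also differ at baseline
        effect = avg_effect + rng.gauss(0.0, 0.5)  # Y_1i - Y_0i for this unit
        y = y0 + effect * d                  # exclusion: Z affects Y only through D
        units.append({"z": z, "d": d, "y": y, "d0": d0, "d1": d1, "effect": effect})
    return units

units = simulate_late()
assigned = [u for u in units if u["z"] == 1]
not_assigned = [u for u in units if u["z"] == 0]

reduced_form = mean(u["y"] for u in assigned) - mean(u["y"] for u in not_assigned)
first_stage = mean(u["d"] for u in assigned) - mean(u["d"] for u in not_assigned)
wald = reduced_form / first_stage

# Quantities that are only observable because the data are simulated.
complier_effect = mean(u["effect"] for u in units if u["d1"] > u["d0"])
ate = mean(u["effect"] for u in units)

print(f"First stage (share of compliers): {first_stage:.3f}")
print(f"Wald estimate:                    {wald:.3f}")
print(f"Average effect among compliers:   {complier_effect:.3f}")
print(f"Average effect in the population: {ate:.3f}")
```

The Wald ratio should track the average effect among compliers rather than the population average effect, which in this setup is pulled up by the always-takers.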
The LATE assumptions let us partition the population into the familiar types of subjects in causal inference
Switchers:
- Compliers: \(D_{1i} > D_{0i}\)
Non-switchers:
- Always-takers: \(D_{1i} = D_{0i} = 1\)
- Never-takers: \(D_{1i} = D_{0i} = 0\)
Instrumental variables cannot say anything about non-switchers because the instrument does not change their treatment status \(D_i\) (similar to fixed effects models, which are identified only from units whose treatment status changes).
When treatment effects are the same across all groups, we come back to the constant-effects world.
The treatment effect on the treated is a weighted average of the effects on always-takers and on compliers.
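The weights can be made explicit. As a sketch under independence and monotonicity: a unit has \(D_i = 1\) either because it is an always-taker (\(D_{0i} = 1\)) or because it is a complier with \(Z_i = 1\), and these two cases are disjoint, so

\[ \begin{aligned} E[Y_{1i} - Y_{0i} | D_i = 1] &= E[Y_{1i} - Y_{0i} | D_{0i} = 1] \, P[D_{0i} = 1 | D_i = 1] \\ &\quad + E[Y_{1i} - Y_{0i} | D_{1i} > D_{0i}] \, P[D_{1i} > D_{0i}, Z_i = 1 | D_i = 1] \end{aligned} \]

The two probabilities sum to 1. Only the second conditional expectation (the LATE) is identified by the instrument.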
In the special case of IV in randomized trials, we have a compliance problem (when compliance is voluntary): those assigned to treatment do not always take it, so comparing units by treatment actually received may suffer from selection bias.
Intention-to-treat (ITT) analysis is valid for the effect of assignment, but it is diluted by non-compliance and does not measure the effect of receiving the treatment
IV in this case (\(Z_i\) = random assignment to the treatment; \(D_i\) = whether the unit actually received/took the treatment) can solve this problem.
Under certain assumptions (i.e., SUTVA, random assignment, exclusion restriction, and monotonicity, meaning no defiers), this analysis has a causal interpretation as the LATE: the average causal effect for compliers only.
- Without these assumptions, it is merely a ratio of two intention-to-treat effects (the effect of assignment on the outcome over the effect of assignment on treatment receipt).
Without always-takers in this case, LATE = Treatment effects on the treated
See Bloom (1984) for the proof, and Bloom et al. (1997) and Sherman and Berk (1984) for examples
\[ \begin{aligned} \frac{E[Y_i |Z_i = 1] - E[Y_i |Z_i = 0]}{E[D_i |Z_i = 1]} &= \frac{\text{Intention-to-treat effect}}{\text{Compliance rate}} \\ &= E[Y_{1i} - Y_{0i} |D_i = 1] \end{aligned} \]
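The following is a minimal sketch of this ratio in a simulated trial with one-sided non-compliance (nobody assigned to control receives the treatment, so there are no always-takers). The compliance share of 60%, the effect sizes, and the baseline gap between compliers and non-compliers are hypothetical choices made only for illustration.

```python
import random

def mean(values):
    """Arithmetic mean of an iterable."""
    values = list(values)
    return sum(values) / len(values)

def simulate_trial(n=60_000, seed=2):
    """Hypothetical randomized trial where only assigned units can be treated."""
    rng = random.Random(seed)
    units = []
    for _ in range(n):
        complier = rng.random() < 0.6        # would take the treatment if assigned
        z = 1 if rng.random() < 0.5 else 0   # random assignment
        d = z if complier else 0             # no always-takers: D = 0 whenever Z = 0
        y0 = rng.gauss(0.0, 1.0) + (0.5 if complier else 0.0)  # selection at baseline
        effect = (2.0 if complier else 0.5) + rng.gauss(0.0, 0.5)
        y = y0 + effect * d
        units.append({"z": z, "d": d, "y": y, "effect": effect})
    return units

units = simulate_trial()
assigned = [u for u in units if u["z"] == 1]
control = [u for u in units if u["z"] == 0]

itt = mean(u["y"] for u in assigned) - mean(u["y"] for u in control)
compliance = mean(u["d"] for u in assigned)   # E[D | Z = 1]
bloom = itt / compliance                      # ITT effect / compliance rate

# Benchmarks: the first is observable only in simulation, the second is the naive comparison.
tot = mean(u["effect"] for u in units if u["d"] == 1)
as_treated = mean(u["y"] for u in units if u["d"] == 1) - mean(
    u["y"] for u in units if u["d"] == 0
)

print(f"ITT effect:               {itt:.3f}")
print(f"Compliance rate:          {compliance:.3f}")
print(f"ITT / compliance:         {bloom:.3f}")
print(f"True effect on treated:   {tot:.3f}")
print(f"As-treated comparison:    {as_treated:.3f}")
```

In this setup the ITT effect is smaller than the effect on the treated because part of the assigned group never takes the treatment, the rescaled ratio should sit close to the true effect on the treated, and the as-treated comparison is biased upward by the baseline difference between compliers and non-compliers.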