34.2 Framework for Instrumental Variables

We consider a binary treatment framework where:

- $D_i \sim \text{Bernoulli}(p)$ is a dummy treatment variable.
- $(Y_{0i}, Y_{1i})$ are the potential outcomes under control and treatment.
- The observed outcome is: $Y_i = Y_{0i} + (Y_{1i} - Y_{0i}) D_i.$
- We introduce an instrumental variable $Z_i$ satisfying: $Z_i \perp (Y_{0i}, Y_{1i}, D_{0i}, D_{1i}).$
  - This means $Z_i$ is independent of potential outcomes and of potential treatment status ($D_{0i}$ and $D_{1i}$, the treatment taken when $Z_i = 0$ and $Z_i = 1$, respectively).
  - $Z_i$ must also be correlated with $D_i$ to satisfy the relevance condition.

34.2.1 Constant-Treatment-Effect Model

Under the constant treatment effect assumption (i.e., the treatment effect is the same for all individuals),

$\begin{aligned} Y_{0i} &= \alpha + \eta_i, \\ Y_{1i} - Y_{0i} &= \rho, \\ Y_i &= Y_{0i} + D_i (Y_{1i} - Y_{0i}) \\ &= \alpha + \eta_i + D_i \rho \\ &= \alpha + \rho D_i + \eta_i. \end{aligned}$

where:

- $\eta_i$ captures individual-level heterogeneity.
- $\rho$ is the constant treatment effect.

The problem with OLS estimation is that $D_i$ may be correlated with $\eta_i$, leading to endogeneity bias: the OLS slope converges to $\rho + \text{Cov}(D_i, \eta_i) / V(D_i)$ rather than to $\rho$.
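The bias can be illustrated with a small simulation. This is a minimal sketch, not part of the original text: the parameter values, the selection rule for $D_i$, and the helper `cov` are all assumptions chosen for illustration.

```python
import random

def cov(xs, ys):
    """Population covariance of two equal-length sequences."""
    mx = sum(xs) / len(xs)
    my = sum(ys) / len(ys)
    return sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / len(xs)

rng = random.Random(0)   # arbitrary fixed seed
n = 100_000
alpha, rho = 1.0, 2.0    # hypothetical parameter values

eta = [rng.gauss(0, 1) for _ in range(n)]
# Units with larger eta are more likely to take the treatment,
# so D is correlated with the error term (endogeneity).
d = [1 if e + rng.gauss(0, 1) > 0 else 0 for e in eta]
y = [alpha + rho * di + e for di, e in zip(d, eta)]

ols_slope = cov(y, d) / cov(d, d)    # slope from regressing Y on D
bias_term = cov(d, eta) / cov(d, d)  # Cov(D, eta) / V(D)
print(f"true rho = {rho}, OLS slope = {ols_slope:.3f}, bias = {bias_term:.3f}")
```

In this setup the OLS slope overshoots $\rho$ by exactly the sample analogue of $\text{Cov}(D_i, \eta_i) / V(D_i)$.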

34.2.2 Instrumental Variable Solution

A valid instrument $Z_i$ allows us to estimate the causal effect $\rho$ via:

$\begin{aligned} \rho &= \frac{\text{Cov}(Y_i, Z_i)}{\text{Cov}(D_i, Z_i)} \\ &= \frac{\text{Cov}(Y_i, Z_i) / V(Z_i) }{\text{Cov}(D_i, Z_i) / V(Z_i)} \\ &= \frac{\text{Reduced form estimate}}{\text{First-stage estimate}} \\ &= \frac{E[Y_i |Z_i = 1] - E[Y_i | Z_i = 0]}{E[D_i |Z_i = 1] - E[D_i | Z_i = 0 ]}. \end{aligned}$

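This ratio equals the treatment effect $\rho$ only if $Z_i$ is a valid instrument, i.e., independent of $\eta_i$ and correlated with $D_i$. The last equality (the Wald estimator) holds when $Z_i$ is binary.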
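As a rough check, the covariance ratio and the difference-in-means ratio can be computed on simulated data. The sketch below assumes a binary $Z_i$ and a made-up selection rule for $D_i$; the helpers `cov` and `diff_in_means` are hypothetical names introduced here.

```python
import random

def cov(xs, ys):
    """Population covariance of two equal-length sequences."""
    mx = sum(xs) / len(xs)
    my = sum(ys) / len(ys)
    return sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / len(xs)

def diff_in_means(values, z):
    """Mean of values where z == 1 minus mean where z == 0."""
    on = [v for v, zi in zip(values, z) if zi == 1]
    off = [v for v, zi in zip(values, z) if zi == 0]
    return sum(on) / len(on) - sum(off) / len(off)

rng = random.Random(1)   # arbitrary fixed seed
n = 100_000
alpha, rho = 1.0, 2.0    # hypothetical parameter values

z = [rng.randint(0, 1) for _ in range(n)]   # binary instrument, independent of eta
eta = [rng.gauss(0, 1) for _ in range(n)]
# D depends on Z (relevance) and on eta (endogeneity).
d = [1 if zi + e + rng.gauss(0, 1) > 0.5 else 0 for zi, e in zip(z, eta)]
y = [alpha + rho * di + e for di, e in zip(d, eta)]

iv_cov = cov(y, z) / cov(d, z)                        # Cov(Y, Z) / Cov(D, Z)
wald = diff_in_means(y, z) / diff_in_means(d, z)      # reduced form / first stage
print(f"covariance ratio = {iv_cov:.3f}, Wald = {wald:.3f}, true rho = {rho}")
```

With a binary instrument the two ratios coincide up to floating-point error, and both should land near $\rho$ in a large sample.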

34.2.3 Heterogeneous Treatment Effects and the LATE Framework

In a more general framework, treatment effects vary across individuals. We define:

- Potential outcomes: $Y_i(d,z) = \text{outcome for unit } i \text{ given } D_i = d, Z_i = z.$
- Treatment status as a function of $Z_i$: $D_i = D_{0i} + Z_i (D_{1i} - D_{0i}).$

where:
- $D_{1i}$ is the treatment status when $Z_i = 1$.
- $D_{0i}$ is the treatment status when $Z_i = 0$.
- $D_{1i} - D_{0i}$ is the causal effect of $Z_i$ on $D_i$.

34.2.4 Assumptions for LATE Identification

34.2.4.1 Independence (Instrument Randomization)

The instrument must be as good as randomly assigned:

$[\{Y_i(d,z); \forall d, z \}, D_{1i}, D_{0i} ] \perp Z_i.$

This ensures that $Z_i$ is independent of potential outcomes and potential treatment status.

This assumption lets the first-stage difference be interpreted as the average causal effect of $Z_i$ on $D_i$:

$\begin{aligned} E[D_i |Z_i = 1] - E[D_i | Z_i = 0] &= E[D_{1i} |Z_i = 1] - E[D_{0i} |Z_i = 0] \\ &= E[D_{1i} - D_{0i}] \end{aligned}$
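The equality above can be checked numerically. The following is a minimal sketch under assumed type shares (the 0.5 / 0.2 / 0.3 split is made up for illustration): each unit carries a pair $(D_{0i}, D_{1i})$, $Z_i$ is randomized, and only one of the two potential treatment statuses is observed.

```python
import random

rng = random.Random(3)   # arbitrary fixed seed
n = 100_000

# Hypothetical (D0, D1) pairs: compliers, always-takers, never-takers.
types = [(0, 1), (1, 1), (0, 0)]
potential = rng.choices(types, weights=[0.5, 0.2, 0.3], k=n)

# Randomized instrument: independent of (D0, D1) by construction.
z = [rng.randint(0, 1) for _ in range(n)]
# Observed treatment: D = D0 + Z * (D1 - D0).
d = [d0 + zi * (d1 - d0) for (d0, d1), zi in zip(potential, z)]

d_on = [di for di, zi in zip(d, z) if zi == 1]
d_off = [di for di, zi in zip(d, z) if zi == 0]
first_stage = sum(d_on) / len(d_on) - sum(d_off) / len(d_off)

# Average causal effect of Z on D, computable only because this is a simulation.
avg_effect_of_z = sum(d1 - d0 for d0, d1 in potential) / n
print(f"first stage = {first_stage:.3f}, E[D1 - D0] = {avg_effect_of_z:.3f}")
```

The two numbers agree up to sampling noise; if $Z_i$ were instead allowed to depend on the unit's type, the first-stage contrast would no longer match $E[D_{1i} - D_{0i}]$.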

This assumption is also sufficient for a causal interpretation of the reduced form, which captures the effect of the instrument $Z_i$ on the outcome $Y_i$:

$E[Y_i |Z_i = 1 ] - E[Y_i|Z_i = 0] = E[Y_i (D_{1i}, Z_i = 1) - Y_i (D_{0i} , Z_i = 0)]$

34.2.4.2 Exclusion Restriction

This is also known as the existence of the instrument assumption (G. W. Imbens and Angrist 1994). The instrument should only affect $Y_i$ through $D_i$ (i.e., the treatment $D_i$ fully mediates the effect of $Z_i$ on $Y_i$):

$\begin{aligned} Y_{1i} &= Y_i (1,1) = Y_i (1,0)\\ Y_{0i} &= Y_i (0,1) = Y_i (0,0) \end{aligned}$

Under this assumption (and assuming $Y_{1i}, Y_{0i}$ already satisfy the independence assumption), the observed outcome $Y_i$ can be rewritten as:

$\begin{aligned} Y_i &= Y_i (0, Z_i) + [Y_i (1 , Z_i) - Y_i (0, Z_i)] D_i \\ &= Y_{0i} + (Y_{1i} - Y_{0i}) D_i. \end{aligned}$

This assumption lets us go from reduced-form causal effects to treatment effects (J. D. Angrist and Imbens 1995).

34.2.4.3 Monotonicity (No Defiers)

We assume that $Z_i$ affects $D_i$ in a monotonic way:

$D_{1i} \geq D_{0i}, \quad \forall i.$

Under this assumption, the first stage measures the proportion of the population whose treatment status $D_i$ is driven by $Z_i$. It implies that $Z_i$ only moves individuals toward treatment, never away. This rules out “defiers” (i.e., individuals who would take the treatment when not assigned but refuse it when assigned).
This assumption resolves the problem of units that shift from participation back to non-participation when the instrument changes.
- Alternatively, one can solve the same problem by assuming a constant (homogeneous) treatment effect (G. W. Imbens and Angrist 1994), but this is rather restrictive.
- A third solution is to assume that there exists a value of the instrument at which the probability of participation is 0 (J. Angrist and Imbens 1991).

Under monotonicity,

$\begin{aligned} E[D_{1i} - D_{0i} ] = P[D_{1i} > D_{0i}]. \end{aligned}$
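A small arithmetic sketch of why this equality needs monotonicity. The population shares below are made up for illustration, and the function names are hypothetical; each key is a pair $(D_{0i}, D_{1i})$.

```python
def first_stage(shares):
    """E[D1 - D0] for a dict mapping (D0, D1) pairs to population shares."""
    return sum(p * (d1 - d0) for (d0, d1), p in shares.items())

def complier_share(shares):
    """P[D1 > D0]: share of units with D0 = 0 and D1 = 1."""
    return shares.get((0, 1), 0.0)

# Monotonicity holds: compliers, always-takers, never-takers only.
monotone = {(0, 1): 0.4, (1, 1): 0.3, (0, 0): 0.3}
# Monotonicity fails: 10% defiers with D0 = 1 and D1 = 0.
with_defiers = {(0, 1): 0.4, (1, 1): 0.2, (0, 0): 0.3, (1, 0): 0.1}

for name, shares in [("monotone", monotone), ("with defiers", with_defiers)]:
    print(name, round(first_stage(shares), 3), complier_share(shares))
```

Without defiers the first stage equals the complier share. With defiers it equals the complier share minus the defier share, so it no longer measures $P[D_{1i} > D_{0i}]$.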

34.2.5 Local Average Treatment Effect Theorem

Given Independence, Exclusion, and Monotonicity, we obtain the LATE result (J. D. Angrist and Pischke 2009, 4.4.1):

$\begin{aligned} \frac{E[Y_i | Z_i = 1] - E[Y_i | Z_i = 0]}{E[D_i |Z_i = 1] - E[D_i |Z_i = 0]} = E[Y_{1i} - Y_{0i} | D_{1i} > D_{0i}]. \end{aligned}$

This states that the IV estimator recovers the causal effect only for compliers, that is, units whose treatment status changes due to $Z_i$.
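A sketch of why the ratio isolates this group, using only the three assumptions (the intermediate steps are the standard argument and are filled in here rather than taken from the text):

$\begin{aligned} E[Y_i | Z_i = 1] &= E[Y_{0i} + (Y_{1i} - Y_{0i}) D_{1i} | Z_i = 1] = E[Y_{0i} + (Y_{1i} - Y_{0i}) D_{1i}], \\ E[Y_i | Z_i = 0] &= E[Y_{0i} + (Y_{1i} - Y_{0i}) D_{0i}], \\ E[Y_i | Z_i = 1] - E[Y_i | Z_i = 0] &= E[(Y_{1i} - Y_{0i})(D_{1i} - D_{0i})] \\ &= E[Y_{1i} - Y_{0i} | D_{1i} > D_{0i}] \, P[D_{1i} > D_{0i}]. \end{aligned}$

The first two lines use exclusion (to write $Y_i$ in terms of $Y_{0i}$ and $Y_{1i}$) and independence (to drop the conditioning on $Z_i$). The last line uses monotonicity, since $D_{1i} - D_{0i}$ is then either 0 or 1. Dividing by the first stage, $E[D_{1i} - D_{0i}] = P[D_{1i} > D_{0i}]$, gives the LATE.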

IV only identifies treatment effects for switchers (compliers):

| Switcher Type | Compliance Type | Definition |
|---|---|---|
| Switchers | Compliers | $D_{1i} > D_{0i}$ (take treatment if $Z_i = 1$, not if $Z_i = 0$) |
| Non-switchers | Always-Takers | $D_{1i} = D_{0i} = 1$ (always take treatment) |
| Non-switchers | Never-Takers | $D_{1i} = D_{0i} = 0$ (never take treatment) |

IV is uninformative about always-takers and never-takers since their treatment status is unaffected by $Z_i$ (similar to fixed-effects models, which are identified only from units whose treatment status changes).
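The following simulation sketches this point with heterogeneous effects. Everything specific in it is an assumption made for illustration: the type shares, the effect sizes (3.0 for compliers, 1.0 for always-takers, 0.0 for never-takers), and the helper `diff_in_means`.

```python
import random

def diff_in_means(values, z):
    """Mean of values where z == 1 minus mean where z == 0."""
    on = [v for v, zi in zip(values, z) if zi == 1]
    off = [v for v, zi in zip(values, z) if zi == 0]
    return sum(on) / len(on) - sum(off) / len(off)

rng = random.Random(2)   # arbitrary fixed seed
n = 200_000

# Hypothetical population: (D0, D1, treatment effect) per compliance type.
types = [(0, 1, 3.0),    # compliers
         (1, 1, 1.0),    # always-takers
         (0, 0, 0.0)]    # never-takers
units = rng.choices(types, weights=[0.4, 0.3, 0.3], k=n)

z = [rng.randint(0, 1) for _ in range(n)]   # randomized instrument
d = [d0 + zi * (d1 - d0) for (d0, d1, _), zi in zip(units, z)]
# Always-takers get a higher baseline outcome, so D is endogenous.
y0 = [rng.gauss(0, 1) + d0 for (d0, _, _) in units]
y = [y0i + effect * di for y0i, (_, _, effect), di in zip(y0, units, d)]

wald = diff_in_means(y, z) / diff_in_means(d, z)
complier_effects = [e for (d0, d1, e) in units if d1 > d0]
late = sum(complier_effects) / len(complier_effects)
ate = sum(e for (_, _, e) in units) / n
print(f"Wald = {wald:.3f}, complier effect = {late:.3f}, ATE = {ate:.3f}")
```

The Wald ratio tracks the complier effect rather than the population average effect: the effects assigned to always-takers and never-takers never enter the estimate.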

34.2.6 IV in Randomized Trials (Noncompliance)

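In randomized trials with imperfect (voluntary) compliance, individuals assigned to the treatment group do not always take the treatment, and those who do may be a selected group. Intention-to-treat (ITT) estimates remain valid as the causal effect of assignment, but they are diluted by noncompliance.
IV estimation using random assignment ($Z_i$) as an instrument for actual treatment received ($D_i$) recovers the LATE. When no one assigned to control receives the treatment ($E[D_i | Z_i = 0] = 0$), the first stage reduces to the compliance rate and the ratio equals the effect on the treated: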

$\begin{aligned} \frac{E[Y_i |Z_i = 1] - E[Y_i |Z_i = 0]}{E[D_i |Z_i = 1]} = \frac{\text{Intent-to-Treat Effect}}{\text{Compliance Rate}} = E[Y_{1i} - Y_{0i} |D_i = 1]. \end{aligned}$

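When no one in the control group receives the treatment ($D_{0i} = 0$ for all $i$, which includes the case of full compliance), every treated unit is a complier, so LATE = Treatment Effect on the Treated (TOT).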
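A minimal simulation of the ITT-over-compliance ratio, assuming one-sided noncompliance (nobody in the control group can obtain the treatment). The 60% take-up rate, the effect sizes, and the helper `mean` are hypothetical choices for illustration.

```python
import random

def mean(xs):
    return sum(xs) / len(xs)

rng = random.Random(4)   # arbitrary fixed seed
n = 200_000

z, d, y, effects = [], [], [], []
for _ in range(n):
    zi = rng.randint(0, 1)                  # random assignment
    takes_if_offered = rng.random() < 0.6   # hypothetical 60% take-up
    d1 = 1 if takes_if_offered else 0
    d0 = 0                                  # control group cannot get treated
    di = d0 + zi * (d1 - d0)
    # Those who take up have a higher baseline and a larger effect (selection).
    y0 = rng.gauss(1.0 if takes_if_offered else 0.0, 1)
    effect = rng.gauss(2.0 if takes_if_offered else 0.5, 0.5)
    z.append(zi)
    d.append(di)
    y.append(y0 + effect * di)
    effects.append(effect)

itt = mean([yi for yi, zi in zip(y, z) if zi == 1]) - \
      mean([yi for yi, zi in zip(y, z) if zi == 0])
compliance_rate = mean([di for di, zi in zip(d, z) if zi == 1])
control_takeup = mean([di for di, zi in zip(d, z) if zi == 0])
iv_estimate = itt / compliance_rate
# Effect on the treated, observable only because this is a simulation.
tot = mean([e for e, di in zip(effects, d) if di == 1])
print(f"ITT = {itt:.3f}, compliance = {compliance_rate:.3f}, "
      f"ITT / compliance = {iv_estimate:.3f}, TOT = {tot:.3f}")
```

The ITT is smaller than the effect on the treated because it averages in the assigned units who never take the treatment; dividing by the compliance rate undoes that dilution.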

References

Angrist, Joshua D, and Guido W Imbens. 1995. “Two-Stage Least Squares Estimation of Average Causal Effects in Models with Variable Treatment Intensity.” Journal of the American Statistical Association 90 (430): 431–42.

Angrist, Joshua D, and Jörn-Steffen Pischke. 2009. Mostly Harmless Econometrics: An Empiricist’s Companion. Princeton University Press.

Angrist, Joshua, and Guido Imbens. 1991. “Sources of Identifying Information in Evaluation Models.” National Bureau of Economic Research, Cambridge, Mass., USA.

Imbens, Guido W, and Joshua D Angrist. 1994. “Identification and Estimation of Local Average Treatment Effects.” Econometrica 62 (2): 467–75.