34.9 Special Considerations for Zero-Valued Outcomes

For outcomes that can equal zero, log-type transformations such as \(\log(1 + Y)\) create interpretation problems: the coefficient on the transformed outcome cannot be read as a percentage change (J. Chen and Roth 2023). The treatment can act on the intensive margin (e.g., an outcome moving from 10 to 11) or on the extensive margin (an outcome moving from 0 to 1), and a percentage change is undefined when the starting value is zero. In such cases, researchers use alternative methods:
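A small numeric illustration may help show why the transformed-outcome coefficient is hard to read as a percentage change. The sketch below uses made-up outcomes for two control and two treated units (one unit moves from 0 to 1, the other from 10 to 11) and compares the difference in mean \(\log(1 + Y)\) before and after rescaling the outcome by 100, for example from dollars to cents. The data, the helper names, and the choice of \(\log(1 + Y)\) are assumptions made only for this example.

```python
import math


def mean(values):
    return sum(values) / len(values)


def log1p_effect(treated, control, scale=1.0):
    """Difference in mean log(1 + scale * Y) between treated and control."""
    t = mean([math.log1p(scale * y) for y in treated])
    c = mean([math.log1p(scale * y) for y in control])
    return t - c


# Hypothetical outcomes: one unit on the extensive margin (0 -> 1),
# one unit on the intensive margin (10 -> 11).
control = [0.0, 10.0]
treated = [1.0, 11.0]

effect_units = log1p_effect(treated, control)                  # about 0.39
effect_rescaled = log1p_effect(treated, control, scale=100.0)  # about 2.36

# Proportional change in the mean outcome, which does not depend on units.
prop_change = (mean(treated) - mean(control)) / mean(control)  # 0.2

print(effect_units, effect_rescaled, prop_change)
```

The same data give very different "effects" depending on the units of the outcome, while the proportional change in means stays at 20%.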

34.9.1 Proportional LATE Estimation

Because log transformations are hard to interpret when the outcome can be zero, an interpretable alternative is the proportional Local Average Treatment Effect (LATE), denoted \(\theta_{ATE\%}\): the LATE in levels expressed as a fraction of the mean untreated outcome among compliers.

Steps to Estimate Proportional LATE:

  1. Estimate LATE using 2SLS:

    We first estimate the treatment effect in levels using a standard Two-Stage Least Squares regression, instrumenting \(D_i\) with the instrument \(Z_i\): \[ Y_i = \beta D_i + X_i'\gamma + \epsilon_i, \] where:

    • \(D_i\) is the endogenous treatment variable.
    • \(X_i\) includes any exogenous controls, with coefficient vector \(\gamma\).
    • \(\beta\) is the LATE in levels: the average effect of treatment on the outcome among compliers.
  2. Estimate the control complier mean (\(\beta_{cc}\)):

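    Following Abadie, Angrist, and Imbens (2002), we estimate the mean untreated outcome for compliers with the same 2SLS setup applied to a transformed outcome: \[ Y_i^{CC} = -(D_i - 1) Y_i = (1 - D_i) Y_i. \] This variable equals \(Y_i\) for untreated units and zero for treated units. Regressing \(Y_i^{CC}\) on \(1 - D_i\), instrumented as in step 1, gives \(\beta_{cc}\), the mean outcome of compliers when untreated. If \(D_i\) is kept as the regressor instead, the coefficient is \(-\beta_{cc}\), so its sign must be flipped.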

  3. Compute the proportional LATE:

    The estimated proportional LATE is the ratio of the two coefficients: \[ \hat{\theta}_{ATE\%} = \frac{\hat{\beta}}{\hat{\beta}_{cc}}, \] which is the proportional change in the mean outcome among the compliers induced by the instrument (multiply by 100 for a percentage).

  4. Obtain standard errors via non-parametric bootstrap:

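    Since \(\theta_{ATE\%}\) is a ratio of two estimated coefficients, neither regression reports its standard error directly. A non-parametric bootstrap that resamples observations and re-estimates both coefficients in each draw is a convenient way to obtain standard errors.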

  5. Special case: Binary instrument

    If the instrument is binary, \(\theta_{ATE\%}\) for the intensive margin of compliers can be directly estimated using Poisson IV regression (ivpoisson in Stata).
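Steps 1 to 4 can be sketched in a few lines for the simplest case: one binary instrument and no controls, where 2SLS reduces to the Wald ratio. Everything here is an assumption made for illustration: the helper names (`wald`, `proportional_late`, `bootstrap_se`), the made-up data, and the choice to regress the transformed outcome on \(1 - D_i\) so that the coefficient has the sign of the complier control mean. With controls or a continuous instrument, a full 2SLS routine would be needed instead, and the bootstrap below has no guard against degenerate resamples.

```python
import random
import statistics


def mean(values):
    return sum(values) / len(values)


def wald(outcome, regressor, z):
    """2SLS with one binary instrument and no controls (Wald ratio)."""
    reduced_form = (mean([o for o, zi in zip(outcome, z) if zi == 1])
                    - mean([o for o, zi in zip(outcome, z) if zi == 0]))
    first_stage = (mean([r for r, zi in zip(regressor, z) if zi == 1])
                   - mean([r for r, zi in zip(regressor, z) if zi == 0]))
    return reduced_form / first_stage


def proportional_late(y, d, z):
    """Return (theta, beta, beta_cc) following steps 1 to 3."""
    beta = wald(y, d, z)                              # step 1: LATE in levels
    y_cc = [(1 - di) * yi for yi, di in zip(y, d)]    # -(D - 1) * Y
    beta_cc = wald(y_cc, [1 - di for di in d], z)     # step 2: complier control mean
    return beta / beta_cc, beta, beta_cc              # step 3: ratio


def bootstrap_se(y, d, z, reps=200, seed=0):
    """Step 4: resample units with replacement and re-estimate the ratio."""
    rng = random.Random(seed)
    n = len(y)
    draws = []
    for _ in range(reps):
        idx = [rng.randrange(n) for _ in range(n)]
        theta_b, _, _ = proportional_late(
            [y[i] for i in idx], [d[i] for i in idx], [z[i] for i in idx])
        draws.append(theta_b)
    return statistics.stdev(draws)


# Hypothetical data as (z, d, y) rows: compliers, never-takers, always-takers.
# Complier mean outcomes are 5 when untreated and 7 when treated.
block_z0 = [(0, 0, v) for v in (0, 0, 10, 10)] + [(0, 0, 3)] * 2 + [(0, 1, 20)] * 2
block_z1 = [(1, 1, v) for v in (0, 4, 12, 12)] + [(1, 0, 3)] * 2 + [(1, 1, 20)] * 2
rows = (block_z0 + block_z1) * 25
z = [r[0] for r in rows]
d = [r[1] for r in rows]
y = [r[2] for r in rows]

theta, beta, beta_cc = proportional_late(y, d, z)
se = bootstrap_se(y, d, z)
print(theta, beta, beta_cc, se)
```

In this constructed example the LATE in levels is 2 and the complier control mean is 5, so the proportional LATE is 0.4, i.e. a 40% increase for compliers, even though some compliers have a zero outcome.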

34.9.2 Bounds on Intensive-Margin Effects

Lee (2009) proposed a bounding approach that can be used for intensive-margin effects. The bounds target units whose outcome would be positive regardless of treatment status, so any effect for this group is purely on the intensive margin. Because the outcome is positive for these units, the bounds can be computed for the log of the outcome without any adjustment for zeros. The approach requires a monotonicity assumption: treatment can move a unit's outcome from zero to positive, or from positive to zero, but not in both directions within the population.
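The trimming logic behind these bounds can be sketched as follows, under assumptions that go beyond the text: a randomized binary treatment (no instrument), treatment that weakly increases the share of positive outcomes, and the log outcome as the quantity of interest. The function name `lee_bounds_log`, the made-up data, and the rounding of the number of trimmed observations are choices made for this example, not part of Lee's (2009) exact estimator.

```python
import math


def mean(values):
    return sum(values) / len(values)


def lee_bounds_log(y_treated, y_control):
    """Trimming bounds on the effect on log(Y) for always-positive units.

    Assumes treatment can only move outcomes from zero to positive.
    """
    pos_t = sorted(v for v in y_treated if v > 0)
    pos_c = [v for v in y_control if v > 0]
    share_t = len(pos_t) / len(y_treated)
    share_c = len(pos_c) / len(y_control)
    if share_t < share_c:
        raise ValueError("sketch assumes treatment raises the positive share")

    # Share of positive treated outcomes attributed to units that are
    # positive only because of treatment; these are trimmed away.
    trim_share = (share_t - share_c) / share_t
    k = round(trim_share * len(pos_t))

    log_t = [math.log(v) for v in pos_t]
    log_c_mean = mean([math.log(v) for v in pos_c])

    lower = mean(log_t[:len(log_t) - k]) - log_c_mean  # drop the k largest
    upper = mean(log_t[k:]) - log_c_mean               # drop the k smallest
    return lower, upper


# Hypothetical data: 60% of control and 80% of treated outcomes are positive.
y_control = [0, 0, 0, 0, 1, 2, 4, 8, 16, 32]
y_treated = [0, 0, 1, 2, 4, 8, 16, 32, 64, 128]

lower, upper = lee_bounds_log(y_treated, y_control)
print(lower, upper)
```

Here a quarter of the positive treated outcomes are trimmed from either tail, giving a lower bound of 0 and an upper bound of \(2 \log 2\) for the effect on the log outcome.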

References

Abadie, Alberto, Joshua Angrist, and Guido Imbens. 2002. “Instrumental Variables Estimates of the Effect of Subsidized Training on the Quantiles of Trainee Earnings.” Econometrica 70 (1): 91–117.
Chen, Jiafeng, and Jonathan Roth. 2023. “Logs with Zeros? Some Problems and Solutions.” The Quarterly Journal of Economics, qjad054.
Lee, David S. 2009. “Training, Wages, and Sample Selection: Estimating Sharp Bounds on Treatment Effects.” The Review of Economic Studies 76 (3): 1071–1102.