6. Instrumental-Variables Estimation

Key Concepts

  • Instrumental-variable estimation (IVE) is used to “carve out” the exogenous part of the variability in a endogenous predictor.
    • An instrumental variable (IV) should be related to the predictor
    • An IV should not be related to the residuals of the outcome
      • There should be no direct path from instrument to outcome. The path should always be through the predictor.
  • IVE is less precise and thus larger standard errors are often obtained
    • It is recommend to use robust standard error estimation
  • IVs can be estimated with two-stage least squares (2SLS) or simultaneous equation modeling (SEM).

Methods Matter, Chapter 10

The following example comes from Murnane and Willett (2010), chapter 10.

Education and Civic Engagement

Dee’s (2004) study looks at the causal impact of educational attainment on civic participation, specifically registering to vote

Variables:

  • schoolid: a random school ID
  • hispanic: 1 = yes | 0 = no
  • college: attended a junior, community or 4-year college by 1984 (1 = yes | 0 = no)
  • black: 1 = yes | 0 = no
  • otherrace: 1 = yes | 0 = no
  • female: 1 = yes | 0 = no
  • register: currently registered to vote (1 = yes | 0 = no)
  • distance: miles from participant’s high school to the nearest 2-year college

Descriptives

From Table 10.1 on page 207.

Is respondent currently registered to vote?
n mean sd variance skewness kurtosis
9227 0.671 0.47 0.221 -0.727 -1.472
Attended junior, community or 4year college by 1984?
n mean sd variance skewness kurtosis
9227 0.547 0.498 0.248 -0.189 -1.964
Crosstab overview
register college count
0 0 1780
0 1 1257
1 0 2399
1 1 3791

Tests of a Valid Instrument

To be a valid instrument, it must meet three criteria:

  • Relevance: Instrument is correlated with question predictor, college
  • Exclusion: Instrument is correlated with outcome only through the college variable
  • Exogeneity: Instrument isn’t correlated with anything else in the model (i.e. omitted variables)
    • This criteria is met through theory/substantive knowledge

Relevance

Correlate outcome, predictor, and instrument:

Correlation between COLLEGE (predictor), DISTANCE (IV), and REGISTER (outcome)
variable register distance
register
distance -0.03**
college 0.19**** -0.11****

Interpretations

  • distance (the instrument) has a small but positive relationship with the question predictor college.
  • An instrumental variable must share variation with the question predictor.
  • distance has a very small, negative correlation with register (the outcome).
  • An instrumental variable should not be correlated with the outcome or its residuals. Here, there is a very weak correlation. The relationship with the residuals will still need to be tested.
Another test (OLS regression) of the relationship between instrument and outcome
Model 1
(Intercept) 0.688
(0.007)
distance -0.002
(0.001)
Num.Obs. 9227
R2 0.001

Interpretation

distance is a poor predictor of register. It has a negative and non-significant coefficient.

Exclusion

Do we test this by correlating DISTANCE with model residuals?

Naive OLS Regression Model

Outcome=register from Table 10.1 on page 207.
term estimate std.error statistic p.value
(Intercept) 0.574 0.007 80.391 0
college 0.177 0.010 18.326 0
Predictor SS df MS F p partial_eta2 CI_90_partial_eta2
(Intercept) 1377.17 1 1377.17 6462.64 .000
college 71.57 1 71.57 335.86 .000 .04 [.03, .04]
Error 1965.82 9225 0.21

Two-Stage Least Squares

Method One: Using lm

Following an example from https://evalf19.classes.andrewheiss.com/class/11-class/

1st Stage: Outcome = COLLEGE
term estimate std.error statistic p.value
(Intercept) 0.609 0.008 78.812 0
distance -0.006 0.001 -10.764 0
2nd Stage: Outcome = REGISTER
term estimate std.error statistic p.value
(Intercept) 0.516 0.049 10.632 0.000
college_predicted 0.284 0.088 3.216 0.001
## # A tibble: 1 x 11
##   r.squared adj.r.squared sigma statistic p.value    df logLik
##       <dbl>         <dbl> <dbl>     <dbl>   <dbl> <int>  <dbl>
## 1   0.00112       0.00101 0.470      10.3 0.00130     2 -6119.
## # ... with 4 more variables: AIC <dbl>, BIC <dbl>,
## #   deviance <dbl>, df.residual <int>

Method Two: Using ivreg

Following instructions from https://rpubs.com/wsundstrom/t_ivreg

2SLS using ivreg
term estimate std.error statistic p.value
(Intercept) 0.516 0.048 10.747 0.000
college 0.284 0.087 3.251 0.001

Simultaneous Equations Modeling - systemfit

Using systemfit. See https://cran.r-project.org/web/packages/systemfit/vignettes/systemfit.pdf

## 
## systemfit results 
## method: 2SLS 
## 
##            N    DF     SSR  detRCov   OLS-R2 McElroy-R2
## system 18454 18450 4249.81 0.052149 0.017085   0.050886
## 
##        N   DF     SSR      MSE     RMSE       R2   Adj R2
## eq1 9227 9225 2257.93 0.244762 0.494734 0.012404 0.012297
## eq2 9227 9225 1991.88 0.215922 0.464674 0.022338 0.022232
## 
## The covariance matrix of the residuals
##            eq1        eq2
## eq1  0.2447621 -0.0264594
## eq2 -0.0264594  0.2159222
## 
## The correlations of the residuals
##           eq1       eq2
## eq1  1.000000 -0.115096
## eq2 -0.115096  1.000000
## 
## 
## 2SLS estimates for 'eq1' (equation 1)
## Model Formula: college ~ distance
## Instruments: ~distance
## 
##                 Estimate   Std. Error  t value   Pr(>|t|)    
## (Intercept)  0.609117789  0.007728734  78.8121 < 2.22e-16 ***
## distance    -0.006370971  0.000591878 -10.7640 < 2.22e-16 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 0.494734 on 9225 degrees of freedom
## Number of observations: 9227 Degrees of Freedom: 9225 
## SSR: 2257.930336 MSE: 0.244762 Root MSE: 0.494734 
## Multiple R-Squared: 0.012404 Adjusted R-Squared: 0.012297 
## 
## 
## 2SLS estimates for 'eq2' (equation 2)
## Model Formula: register ~ college
## Instruments: ~distance
## 
##              Estimate Std. Error  t value   Pr(>|t|)    
## (Intercept) 0.5156526  0.0479822 10.74675 < 2.22e-16 ***
## college     0.2836913  0.0872575  3.25119  0.0011533 ** 
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 0.464674 on 9225 degrees of freedom
## Number of observations: 9227 Degrees of Freedom: 9225 
## SSR: 1991.882429 MSE: 0.215922 Root MSE: 0.464674 
## Multiple R-Squared: 0.022338 Adjusted R-Squared: 0.022232

The book never talks about 3SLS, but the Methods Matter includes it. However, I only see two stages. In addition, changing the method to 3SLS does not change the model

Structural Equation Modeling method

## lavaan 0.6-6 ended normally after 22 iterations
## 
##   Estimator                                         ML
##   Optimization method                           NLMINB
##   Number of free parameters                          5
##                                                       
##   Number of observations                          9227
##                                                       
## Model Test User Model:
##                                                       
##   Test statistic                                 0.000
##   Degrees of freedom                                 0
## 
## Parameter Estimates:
## 
##   Standard errors                             Standard
##   Information                                 Expected
##   Information saturated (h1) model          Structured
## 
## Regressions:
##                    Estimate  Std.Err  z-value  P(>|z|)
##   college ~                                           
##     distance         -0.006    0.001  -10.765    0.000
##   register ~                                          
##     college           0.284    0.087    3.252    0.001
## 
## Covariances:
##                    Estimate  Std.Err  z-value  P(>|z|)
##  .college ~~                                          
##    .register         -0.026    0.021   -1.231    0.218
## 
## Variances:
##                    Estimate  Std.Err  z-value  P(>|z|)
##    .college           0.245    0.004   67.923    0.000
##    .register          0.216    0.006   38.519    0.000
## 
## R-Square:
##                    Estimate
##     college           0.012
##     register          0.022
##          colleg regstr distnc
## college   0.248              
## register  0.044  0.221       
## distance -0.482 -0.137 75.722

Side-by-Side Comparison

Note: SEM models not added to the following table.

Side-by-Side Comparison of IVE Methods
Naive First Stage Second Stage 2SLS with ivreg
(Intercept) 0.574*** 0.609*** 0.516*** 0.516***
(0.007) (0.008) (0.049) (0.048)
college 0.177*** 0.284***
(0.010) (0.087)
distance -0.006***
(0.001)
college_predicted 0.284***
(0.088)
Num.Obs. 9227 9227 9227 9227
R2 0.035 0.012 0.001 0.022
R2 Adj. 0.035 0.012 0.001 0.022
AIC 11924.1 13202.4 12243.7 12045.6
BIC 11945.5 13223.8 12265.1 12067.0
Log.Lik. -5959.047 -6598.189 -6118.855 -6019.802
* p < 0.1, ** p < 0.05, *** p < 0.01
> Note While coefficients generally math between second stage regression and ivreg, ivreg \(R^2\) is larger and matches SEM methods.

Diagnostics

Recall that the instrument should not be correlated with the residuals of the outcome.

Correlation between DISTANCE (IV) and REGISTER residuals (outcome)
variable distance
distance
register_residuals 0.00

IVE with Covariates

First and Second Stage Least Squares Regressions - IVE with Covariates
Stage 1 Stage 2
(Intercept) 0.643*** 0.527***
(0.009) (0.047)
black -0.058*** 0.062***
(0.016) (0.015)
distance -0.007***
(0.001)
hispanic -0.116*** 0.028*
(0.013) (0.015)
otherrace 0.034 -0.107***
(0.024) (0.023)
college_predicted 0.249***
(0.082)
Num.Obs. 9227 9227
R2 0.022 0.005
R2 Adj. 0.021 0.004
AIC 13120.7 12218.0
BIC 13163.5 12260.8
Log.Lik. -6554.367 -6103.007
* p < 0.1, ** p < 0.05, *** p < 0.01

IVE with Covariates and Interactions

First and Second Stage Least Squares Regressions - IVE with Covariates and Interactions
Stage 1 Stage 2
(Intercept) 0.645*** 0.464***
(0.010) (0.056)
black -0.065*** 0.278*
(0.023) (0.163)
distance -0.007***
(0.001)
distance:black 0.001
(0.002)
distance:hispanic 0.001
(0.002)
distance:otherrace -0.003
(0.003)
hispanic -0.128*** 0.177
(0.019) (0.121)
otherrace 0.059* 0.174
(0.035) (0.176)
college_predicted 0.359***
(0.097)
college_predicted:black -0.399
(0.303)
college_predicted:hispanic -0.293
(0.249)
college_predicted:otherrace -0.463
(0.285)
Num.Obs. 9227 9227
R2 0.022 0.005
R2 Adj. 0.021 0.004
AIC 13124.7 12219.3
BIC 13188.8 12283.5
Log.Lik. -6553.329 -6100.645
* p < 0.1, ** p < 0.05, *** p < 0.01
> The book and UCLA performs separate First Stages for each interaction (ie. outcome: college x black ~ black, distance x black). Is this necessary? The second stage above gets the same results.

Methods Matter, Chapter 11

Chapter 11 extends IVE to situations in which perfect compliance with treatment conditions is not possible. It uses the original assignent as an instrument and determines the local average treatment effect (LATE), which estimates the impact for those who complied with the assignment (or promotion). It can be used to attempt to estimate the impact not of the offer (or ITT: intention-to-treat) but participation in the program (or TOT: teatement-on-the-treated).

The following example comes from Murnane and Willett (2010), chapter 11.

PACES Colombian Scholarship Program

PACES offered scholarships to students living in low-income neighborhoods to help pay for education at private secondary schools. However, there were some in the sample who did not win the lottery yet still recieved financial aid of some form and enrolled in provate schools.

Variables:

  • id: a random ID
  • won_lottery: 1 = yes | 0 = no
  • male: 1 = yes | 0 = no)
  • base_age
  • finish8th: 1 = yes | 0 = no
  • use_fin_aid: 1 = yes | 0 = no

Descriptives

Overall:

Descriptive statistics for Table 11.1 on page 270.
variable n mean sd min max
won_lottery 1171 0.506 0.500 0 1
male 1171 0.505 0.500 0 1
base_age 1171 12.004 1.347 7 17
finish8th 1171 0.681 0.466 0 1
use_fin_aid 1171 0.582 0.494 0 1

Reproducing Table 11.1, p. 270:

Sample means on the outcome variable, question predictor, instrument, and covariates, for a sample of students from Bogota, Colombia, who participated in the 1995 lottery to obtain a government-funded private-school tuition scholarship, overall and by whether the child was offered financial aid
Variable Sample Mean WON_LOTTERY=1 WON_LOTTERY=2
Outcome: FINISH8TH 0.681 0.6252159 0.7364865
Endogenous Question Predictor: USE_FIN_AID 0.582 0.2400691 0.9155405
Instrument: WON_LOTTERY 0.506
Covariate: BASE_AGE 12.004 12.03627 11.97297
Covariate: MALE 0.505 0.5043178 0.5050676

What kind of T-test was done here?

Assumption Checks

Testing the relevance of the instrument:

Correlation between use_fin_aid (predictor), won_lottery (IV), finish8th (outcome), and male and base_age (covariates)
variable won_lottery finish8th use_fin_aid male
won_lottery
finish8th 0.12****
use_fin_aid 0.68**** 0.14****
male 0.00 -0.11*** -0.02
base_age -0.02 -0.20**** -0.06* 0.08**
> IV has low correlation to outcome. Is that OK? p < .0001

Naive and 2SLS Regressions

Naive First Stage Second Stage
(Intercept) 1.512*** 0.433*** 1.378***
(0.119) (0.095) (0.123)
base_age -0.066*** -0.015* -0.062***
(0.010) (0.008) (0.010)
male -0.088*** -0.020 -0.085***
(0.027) (0.021) (0.027)
won_lottery 0.675***
(0.021)
use_fin_aid_predicted 0.159***
(0.039)
Num.Obs. 1171 1171 1171
R2 0.048 0.471 0.061
R2 Adj. 0.046 0.470 0.058
AIC 1485.4 932.7 1471.0
BIC 1505.6 958.0 1496.3
Log.Lik. -738.692 -461.356 -730.497
F 29.141 346.263 25.167
* p < 0.1, ** p < 0.05, *** p < 0.01

Diagnostics

Correlation between WON_LOTTERY (instrument) and FINISH8th residuals (outcome)
variable won_lottery
won_lottery
finish8th_residuals 0.00

Impact Evaluation, Chapter 5: Estimation of Intent-to-Treat and Local Average Treatment Effect in the Presence of Noncompliance

In this context, the program is randomized at the village level. While everyone is eligible for the program in treatment communities, not everyone participates.

Intent-to-Treat Effect - Naive Model

Naive Model
Model 1
(Intercept) 20.064
(0.163)
treatment_locality -6.406
(0.230)
Num.Obs. 9914
R2 0.073

Interpretations

The estimate for the regression coefficient (δ) is -6.4, indicating that households in villages where HISP was offered on average spent $6.4 less on health expenditures than households in villages where HISP was not offered.

Recall that there has been noncompliance and thus this ITT estimate is not accurate.

Local Average Treatment Effect - 2SLS IV Model

Here, the instrumental variable (treatment_locality) takes the value of 1 if HISP was randomly offered to households in a given locality, and 0 otherwise.

Correlation between instrument, predictor, and outcome
variable health_expenditures enrolled
health_expenditures
enrolled -0.50****
treatment_locality -0.27**** 0.65****
First Stage Second Stage
(Intercept) 0.000 20.064***
(0.005) (0.163)
treatment_locality 0.598***
(0.007)
enrolled_pred -10.716***
(0.385)
Num.Obs. 9914 9914
R2 0.426 0.073
R2 Adj. 0.426 0.072
AIC 7144.5 76481.6
BIC 7166.1 76503.2
Log.Lik. -3569.261 -38237.791
F 7361.226 775.617
* p < 0.1, ** p < 0.05, *** p < 0.01

Interpretation

From Impact Evaluation: “The coefficient, 0.598 indicates that approximately 59.8% of households enrolled in HISP when the program was offered in their locality. The second stage regression uses the predicted enrollment from the first stage as a regressor to explain variation in the outcomes of interest. The estimated coefficient suggests that participation in the HISP program lowers health expenditures by $10.7.”

Impact Evaluation, Chapter 5: Instrumental Variables and Randomized Promotion

In this context, everyone is eligible for the program. You compare what happens in promoted and non-promoted villages.

## Warning: package 'estimatr' was built under R version 4.0.2
First Stage Second Stage 2SLS with Robust Standard Errors
(Intercept) 0.084*** 19.646*** 19.646***
(0.006) (0.206) (0.181)
promotion_locality 0.408***
(0.008)
enrolled_pred -9.500***
(0.578)
enrolled_rp -9.500***
(0.516)
Num.Obs. 9914 9914 9914
R2 0.200 0.027 0.222
R2 Adj. 0.200 0.026 0.222
AIC 10321.9 76962.0
BIC 10343.5 76983.6
Log.Lik. -5157.944 -38478.007
F 2484.604 270.044 338.527
N 9914
p.value.endogeneity
p.value.overid
p.value.weakinst
se_type HC2
statistic.endogeneity
statistic.overid
statistic.weakinst
* p < 0.1, ** p < 0.05, *** p < 0.01

Is HC2 the correct SE? What are the differences?

Interpretation

From Impact Evaluation: “The first stage identifies the effects of the promotion activities on program take-up. In this case, promotion activities increase program take-up by 40.8 percent. In the second stage, we regress the outcome variable on the predicted program participation from the first stage to obtain the LATE estimates. In this case, the results suggest that participation in the HISP program lowers health expenditures by $9.5.”

Diagnostics

Correlation between instrument) and residuals
variable treatment_locality
treatment_locality
impact_ive_residuals -0.04****

Additional resources:

https://evalf19.classes.andrewheiss.com/class/11-class/

https://evalf19.classes.andrewheiss.com/class/12-class/

References

Gertler, P. J., Martinez, S., Premand, P., Rawlings, L. B., & Vermeersch, C. M. (2016). Impact evaluation in practice. The World Bank.

Murnane, R. J., & Willett, J. B. (2010). Methods matter: Improving causal inference in educational and social science research. Oxford University Press.