Chapter 11 Experiments: Dealing with Real-World Challenges

In this chapter we learn how to assess balance and estimate an intention-to-treat (ITT) model in R. We need the following libraries:

library(tidyverse)
library(magrittr)  # provides the %$% exposition pipe used below
library(broom)

11.1 Assess Balance

Let’s use the ProgramEffectiveness data set from the AER package to assess balance. The data set contains 32 observations on four variables[1] and is used to examine whether a new method of teaching economics improved performance in later economics courses. The variables are grade (a factor with levels “increase” and “decrease”), average (grade point average), testscore (score on an economics test), and participation (a factor with levels “no” and “yes”). participation is the treatment in this case. To assess balance, we regress each pre-treatment covariate on the treatment indicator:

library(AER)
data("ProgramEffectiveness")
ProgramEffectiveness %$%
  lm(average ~ participation) %>%
  tidy()

# A tibble: 2 x 5
  term             estimate std.error statistic  p.value
  <chr>               <dbl>     <dbl>     <dbl>    <dbl>
1 (Intercept)        3.10       0.112    27.8   5.97e-23
2 participationyes   0.0367     0.169     0.218 8.29e- 1

ProgramEffectiveness %$%
  lm(testscore ~ participation) %>%
  tidy()

# A tibble: 2 x 5
  term             estimate std.error statistic  p.value
  <chr>               <dbl>     <dbl>     <dbl>    <dbl>
1 (Intercept)        21.6       0.929    23.2   1.01e-20
2 participationyes    0.873     1.40      0.622 5.39e- 1

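In both regressions the coefficient on participation is small relative to its standard error (p = 0.83 for average and p = 0.54 for testscore), so we cannot reject the null hypothesis that the treatment and control groups have the same mean. The two covariates therefore appear balanced across the groups, although with only 32 observations these tests have limited power.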
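When there are more than a couple of covariates, running each balance regression by hand gets tedious. The sketch below loops over the covariates and collects the treatment coefficient from each regression in one table. It is only one way to organize the check; the names covariates and balance are introduced here for illustration, and the code uses base R rather than the pipes used above.

```r
library(AER)  # provides the ProgramEffectiveness data

data("ProgramEffectiveness")

# Pre-treatment covariates to check; participation is the treatment.
covariates <- c("average", "testscore")

# Regress each covariate on the treatment indicator and keep only the
# row of the coefficient table that belongs to the treatment.
balance <- t(sapply(covariates, function(v) {
  fit <- lm(reformulate("participation", response = v),
            data = ProgramEffectiveness)
  coef(summary(fit))["participationyes", ]
}))

# One row per covariate: estimate, standard error, t value, p-value.
round(balance, 3)
```

Each row should reproduce the participationyes row of the corresponding regression above.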

11.2 Estimate ITT Model

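We estimate the ITT model below. as.numeric(grade) converts the factor to its integer codes, 1 for “decrease” and 2 for “increase”, so the coefficient on participation is the difference between the two groups in the share of students whose grade increased: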

ProgramEffectiveness %$%
  lm(as.numeric(grade) ~ participation) %>% 
  summary()

Call:
lm(formula = as.numeric(grade) ~ participation)

Residuals:
   Min     1Q Median     3Q    Max 
-0.571 -0.167 -0.167  0.429  0.833 

Coefficients:
                 Estimate Std. Error t value        Pr(>|t|)    
(Intercept)         1.167      0.105   11.13 0.0000000000035 ***
participationyes    0.405      0.158    2.56           0.016 *  
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Residual standard error: 0.445 on 30 degrees of freedom
Multiple R-squared:  0.179,	Adjusted R-squared:  0.151 
F-statistic: 6.53 on 1 and 30 DF,  p-value: 0.0159

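We can reject the null hypothesis of no effect at the 5% level (p = 0.016). The estimate implies that participation raised the probability that a student’s grade increased by about 0.41, or roughly 40 percentage points.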
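Because the outcome takes only two values, the ITT estimate can be checked by hand: it should equal the difference between participants and non-participants in the share of students whose grade increased. The sketch below recodes grade as a 0/1 variable (improved, a name introduced here for illustration) and compares the difference in shares with the regression slope. It assumes the factor labels “increase”, “no”, and “yes” described earlier in the chapter.

```r
library(AER)  # provides the ProgramEffectiveness data

data("ProgramEffectiveness")

# 1 if the student's grade increased, 0 otherwise.
# Assumes the level label is "increase".
improved <- as.numeric(ProgramEffectiveness$grade == "increase")

# Share of students whose grade increased, by treatment status.
shares <- tapply(improved, ProgramEffectiveness$participation, mean)

# Difference in shares: participants minus non-participants.
itt_diff <- unname(shares["yes"] - shares["no"])

# Slope from the regression of the 0/1 outcome on treatment.
itt_fit <- lm(improved ~ participation, data = ProgramEffectiveness)
itt_lm  <- unname(coef(itt_fit)["participationyes"])

# The two numbers should agree up to floating-point error.
c(difference_in_shares = itt_diff, regression_slope = itt_lm)
```

If the levels of grade are ordered “decrease”, “increase”, as the output above implies, recoding the outcome as 0/1 shifts only the intercept; the slope, its standard error, and its p-value match the regression above.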


  1. See ?AER::ProgramEffectiveness for more information.