1. Randomized Experiments (Week 2)
New York Scholarship Program
The following example comes from Murnane and Willett (2010), chapter 4. The data were originally downloaded from the UCLA Institute for Digital Research and Education.
The data examines a subset of African-American students from the 1997 New York Scholarship Program, a lottery for school vouchers.
Variables:
- s_id: id number
- voucher: recieved a voucher (1=yes, 0=no)
- pre_ach: reading achievement score before getting a voucher
- post_ach: reading achivemenet score at year 3
# load packages
library(tidyverse)
library(haven)
library(psych)
library(apaTables)
library(gt)
source("data/apafunction.R") #to make APA-like tables with `gt`
Descriptive Statistics
describe(nysp_vouchers) %>%
as.data.frame() %>%
rownames_to_column("variable") %>%
mutate(across(is.numeric, round, 3)) %>%
select(-mad, -min, -max, -range, -trimmed) %>%
apa()
variable | vars | n | mean | sd | median | skew | kurtosis | se |
---|---|---|---|---|---|---|---|---|
s_id | 1 | 521 | 10436.484 | 6626.069 | 11141.0 | 0.024 | -1.321 | 290.293 |
voucher | 2 | 521 | 0.559 | 0.497 | 1.0 | -0.235 | -1.948 | 0.022 |
pre_ach | 3 | 521 | 20.164 | 18.255 | 15.5 | 1.256 | 1.347 | 0.800 |
post_ach | 4 | 521 | 23.867 | 19.209 | 18.0 | 1.040 | 0.525 | 0.842 |
#combined
rbind(nysp_vouchers %>%
summarize(voucher = "combined",
obs = n(),
mean = mean(post_ach),
"std. err." = sd(post_ach)/sqrt(n()),
sd = sd(post_ach)),
nysp_vouchers %>%
group_by(voucher) %>%
summarize(obs = n(),
mean = mean(post_ach),
"std. err." = sd(post_ach)/sqrt(n()),
sd = sd(post_ach))) %>%
mutate(across(is.numeric, round, 3)) -> desc
#differences
diff_data <- data.frame(voucher = "difference",
obs = " ",
mean = desc$mean[2] - desc$mean[3],
`std. err.` = sqrt(
(((nysp_vouchers %>% filter(voucher == 1) %>%
pull(post_ach) %>% var()) * (nysp_vouchers %>% filter(voucher == 1) %>% count() %>% pull()-1) +
(nysp_vouchers %>% filter(voucher == 0) %>%
pull(post_ach) %>% var()) * (nysp_vouchers %>% filter(voucher == 0) %>% count() %>% pull()-1)) / (nysp_vouchers %>% count() %>% pull())) *
((1/nysp_vouchers %>% filter(voucher == 1) %>% count() %>% pull())+
(1/nysp_vouchers %>% filter(voucher == 0) %>% count() %>% pull()))
),
sd = "") %>%
rename("std. err." = std..err.)
#make a single table
rbind(desc, diff_data) %>%
mutate(across(is.numeric, round, 3)) %>%
apa()
voucher | obs | mean | std. err. | sd |
---|---|---|---|---|
combined | 521 | 23.867 | 0.842 | 19.209 |
0 | 230 | 21.130 | 1.198 | 18.172 |
1 | 291 | 26.029 | 1.158 | 19.754 |
difference | -4.899 | 1.679 |
Analyses
The following are three different analyses of the impact of voucher receipt (VOUCHER) on the third-grade academic achievement (POST_ACH) for a subsample of 521 African-American children randomly assigned to either a “voucher” treatment or a “no voucher” control group ( n = 521)
T-Test
(Strategy 1, Table 4.1, pg. 49)
mm4_t <- t.test(nysp_vouchers$post_ach ~ nysp_vouchers$voucher, var.equal = T)
broom::tidy(mm4_t) %>%
mutate(across(is.numeric, round, 3),
`std. err` = mm4_t[["stderr"]]) %>%
rename("no voucher" = estimate1,
"voucher" = estimate2) %>%
apa()
no voucher | voucher | statistic | p.value | parameter | conf.low | conf.high | method | alternative | std. err |
---|---|---|---|---|---|---|---|---|---|
21.13 | 26.029 | -2.911 | 0.004 | 519 | -8.205 | -1.593 | Two Sample t-test | two.sided | 1.682719 |
Interpretation
Students who recieved an offer of a voucher had significantly higher achivement scores.
Simple Linear Regression
(Strategy 2, Table 4.1, pg 49)
#model
mm4_model1 <- lm(post_ach ~ voucher, data=nysp_vouchers)
apa.reg.table(mm4_model1)[[3]] %>%
apa()
Predictor | b | b_95%_CI | beta | beta_95%_CI | sr2 | sr2_95%_CI | r | Fit |
---|---|---|---|---|---|---|---|---|
(Intercept) | 21.13** | [18.66, 23.60] | ||||||
voucher | 4.90** | [1.59, 8.20] | 0.13 | [0.04, 0.21] | .02 | [.00, .04] | .13** | |
R2 = .016** | ||||||||
95% CI[.00,.04] | ||||||||
Output for variances
Predictor | SS | df | MS | F | p | partial_eta2 | CI_90_partial_eta2 |
---|---|---|---|---|---|---|---|
(Intercept) | 102693.91 | 1 | 102693.91 | 282.32 | .000 | ||
voucher | 3082.89 | 1 | 3082.89 | 8.48 | .004 | .02 | [.00, .04] |
Error | 188787.59 | 519 | 363.75 |
Interpretation
Students who recieved an offer of a voucher had significantly higher achivement scores (4.9 points higher).
Multiple Linear Regression
(Strategy 3, Table 4.1, pg 49)
mm4_model2 <- lm(post_ach ~ voucher + pre_ach, data = nysp_vouchers)
apa.reg.table(mm4_model2)[[3]] %>% apa()
Predictor | b | b_95%_CI | beta | beta_95%_CI | sr2 | sr2_95%_CI | r | Fit |
---|---|---|---|---|---|---|---|---|
(Intercept) | 7.72** | [5.43, 10.00] | ||||||
voucher | 4.10** | [1.61, 6.59] | 0.11 | [0.04, 0.17] | .01 | [-.00, .02] | .13** | |
pre_ach | 0.69** | [0.62, 0.76] | 0.65 | [0.59, 0.72] | .43 | [.36, .49] | .66** | |
R2 = .442** | ||||||||
95% CI[.38,.49] | ||||||||
Output for variances
Predictor | SS | df | MS | F | p | partial_eta2 | CI_90_partial_eta2 |
---|---|---|---|---|---|---|---|
(Intercept) | 9100.16 | 1 | 9100.16 | 44.05 | .000 | ||
voucher | 2154.80 | 1 | 2154.80 | 10.43 | .001 | .02 | [.00, .04] |
pre_ach | 81780.28 | 1 | 81780.28 | 395.88 | .000 | .43 | [.38, .48] |
Error | 107007.31 | 518 | 206.58 |
Interpretation
Students who recieved an offer of a voucher had significantly higher achivement scores (4.1 points higher) controlling for pre-test scores. The model with the pre_ach covariate accounts for more variability as indicated by the smaller mean square error term of 206.58.
References
Murnane, R. J., & Willett, J. B. (2010). Methods matter: Improving causal inference in educational and social science research. Oxford University Press.