2. Statistical Power and Sample Size (Week 3)
# load packages
library(tidyverse)
library(haven)
library(apaTables)
source("data/apafunction.R") #for APA-style tables
#load data
nysp_vouchers <- read_dta("data/methods_matter/ch4_nyvoucher.dta")
Effect Size for T-Test
T-Test Results
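The chunk that produced this table is not shown in the original; here is a minimal sketch, assuming the test was run with t.test() and tidied with broom (the actual chunk presumably renamed the group-mean columns and added the standard error column):

library(broom)

# Welch two-sample t-test of post-test achievement by voucher assignment;
# tidy() returns the group means, t statistic, p-value, and confidence interval
t.test(post_ach ~ voucher, data = nysp_vouchers) %>%
  tidy()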
| no voucher | voucher | statistic | p.value | conf.low | conf.high | std. err |
|---|---|---|---|---|---|---|
| 21.13 | 26.029 | -2.911 | 0.004 | -8.205 | -1.593 | 1.682719 |
Cohen’s d effect size for t-test
Using the effectsize package:
library(effectsize)
# use cohens_d() from effectsize
cohens_d(nysp_vouchers$post_ach, as.factor(nysp_vouchers$voucher)) %>%
  # format into an APA table
  mutate(across(where(is.numeric), ~ round(.x, 3))) %>% # round
  mutate("95% CI" = paste0("[", CI_low, ", ", CI_high, "]")) %>% # combine CI bounds
  select(Cohens_d, `95% CI`) %>% # drop unneeded columns
  rename("Cohen's d" = Cohens_d) %>% # rename
  apa("Effect size for t-test")
Effect size for t-test

| Cohen's d | 95% CI |
|---|---|
| -0.257 | [-0.43, -0.083] |
Interpretation
Students who were offered a voucher scored about 0.26 standard deviations higher, on average, than students who were not offered one.
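As a check on the sign and size of this estimate, Cohen's d is the difference in group means divided by the pooled standard deviation; plugging in the means from the t-test table implies a pooled SD of roughly 19 score points (an inference from the tabled values, not a figure reported here):

$$
d = \frac{\bar{x}_{\text{no voucher}} - \bar{x}_{\text{voucher}}}{s_{\text{pooled}}} = \frac{21.13 - 26.029}{s_{\text{pooled}}} = -0.257 \quad\Rightarrow\quad s_{\text{pooled}} \approx 19.1
$$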
Calculating Power
Power calculations use the pwr package and are based on the examples at https://www.statmethods.net/stats/power.html.
New York Scholarship Program (NYSP) Power Analysis
T-Tests
The following calculates power for the NYSP t-test example (Strategy 1, Table 4.1, pg. 49).
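The pwr call behind this calculation is not shown in the original; a sketch using pwr.t2n.test() with the group sizes and Cohen's d reported above gives the power shown in the table below:

library(pwr)

# post-hoc power for a two-sample t-test with unequal group sizes
# (d is the absolute Cohen's d from the effect size section above)
pwr.t2n.test(n1 = 230, n2 = 291, d = 0.257, sig.level = 0.05)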
Power of the NYSP Voucher T-Test

| n1 | n2 | d | sig | power |
|---|---|---|---|---|
| 230 | 291 | 0.257 | 0.05 | 0.8283176 |
Interpretation
This is a post-hoc power analysis. The study above had a power of approximately .83. That is, it had an 83% chance of detecting an effect if one existed, and a 17% chance of making a Type II error (failing to reject the null hypothesis when there really is an effect).
Simple Linear Regression
Recall Strategy 2 (Table 4.1, pg. 49):
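The model-fitting code for this strategy is not shown; here is a minimal sketch that would produce the regression table below (the object name mm4_model1 is assumed, mirroring the mm4_model2 used later in this section):

# simple linear regression of post-test achievement on voucher assignment
mm4_model1 <- lm(post_ach ~ voucher, data = nysp_vouchers)
# APA-style regression table, following the same pattern as the multiple regression below
apa.reg.table(mm4_model1)[[3]] %>% apa()

The ANOVA-style table that follows was presumably produced from the same model with apaTables::apa.aov.table().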
Simple Linear Regression

| Predictor | b | b 95% CI | beta | beta 95% CI | sr2 | sr2 95% CI | r | Fit |
|---|---|---|---|---|---|---|---|---|
| (Intercept) | 21.13** | [18.66, 23.60] | | | | | | |
| voucher | 4.90** | [1.59, 8.20] | 0.13 | [0.04, 0.21] | .02 | [.00, .04] | .13** | |
| | | | | | | | | R2 = .016** |
| | | | | | | | | 95% CI [.00, .04] |
ANOVA Table for Simple Linear Regression

| Predictor | SS | df | MS | F | p | partial_eta2 | 90% CI partial_eta2 |
|---|---|---|---|---|---|---|---|
| (Intercept) | 102693.91 | 1 | 102693.91 | 282.32 | .000 | | |
| voucher | 3082.89 | 1 | 3082.89 | 8.48 | .004 | .02 | [.00, .04] |
| Error | 188787.59 | 519 | 363.75 | | | | |
Power for NYSP Simple Linear Regression
Use pwr.f2.test(u = , v = , f2 = , sig.level = , power = ), where:
- u = numerator df (the number of predictors, counting each dummy variable separately)
- v = denominator df (the residual degrees of freedom)
- f2 = Cohen's $f^2$, which is equal to $R^2 / (1 - R^2)$
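For the simple regression above, u = 1 and v = 519, and the f2 in the table below (0.01626016) equals .016 / (1 - .016). The original chunk is not shown, so this is a sketch of the kind of call that reproduces it:

library(pwr)

# post-hoc power for the simple linear regression: 1 predictor, 519 residual df
r2 <- 0.016               # R-squared from the regression table above
f2 <- r2 / (1 - r2)       # Cohen's f2 = 0.01626016
pwr.f2.test(u = 1, v = 519, f2 = f2, sig.level = 0.05)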
Based on the regression results, the NYSP simple linear regression model had the following power:
Power of the NYSP Voucher Simple Linear Regression Test

| Predictors (u) | Residual df (v) | f2 | sig | power |
|---|---|---|---|---|
| 1 | 519 | 0.01626016 | 0.05 | 0.8277319 |
Interpretation
Because no covariates were included, the power here is essentially identical to the t-test above.
Power for NYSP Multiple Regression
(Strategy 3, Table 4.1, pg. 49)
mm4_model2 <- lm(post_ach ~ voucher + pre_ach, data = nysp_vouchers)
apa.reg.table(mm4_model2)[[3]] %>% apa()
| Predictor | b | b 95% CI | beta | beta 95% CI | sr2 | sr2 95% CI | r | Fit |
|---|---|---|---|---|---|---|---|---|
| (Intercept) | 7.72** | [5.43, 10.00] | | | | | | |
| voucher | 4.10** | [1.61, 6.59] | 0.11 | [0.04, 0.17] | .01 | [-.00, .02] | .13** | |
| pre_ach | 0.69** | [0.62, 0.76] | 0.65 | [0.59, 0.72] | .43 | [.36, .49] | .66** | |
| | | | | | | | | R2 = .442** |
| | | | | | | | | 95% CI [.38, .49] |
| Predictor | SS | df | MS | F | p | partial_eta2 | 90% CI partial_eta2 |
|---|---|---|---|---|---|---|---|
| (Intercept) | 9100.16 | 1 | 9100.16 | 44.05 | .000 | | |
| voucher | 2154.80 | 1 | 2154.80 | 10.43 | .001 | .02 | [.00, .04] |
| pre_ach | 81780.28 | 1 | 81780.28 | 395.88 | .000 | .43 | [.38, .48] |
| Error | 107007.31 | 518 | 206.58 | | | | |
Based on the regression results, the NYSP multiple regression model had the following power:
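Again, the original call is not shown; a sketch along the same lines, where $f^2$ = .442 / (1 - .442) ≈ 0.79 (the value that appears in the table below):

library(pwr)

# post-hoc power for the multiple regression: 2 predictors, 518 residual df
r2 <- 0.442               # R-squared from the regression table above
f2 <- r2 / (1 - r2)       # Cohen's f2, about 0.79
pwr.f2.test(u = 2, v = 518, f2 = f2, sig.level = 0.05)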
Power of the NYSP Voucher Multiple Regression Test

| Predictors (u) | Residual df (v) | f2 | sig | power |
|---|---|---|---|---|
| 2 | 518 | 0.7921147 | 0.05 | 1 |
Interpretation
The post-hoc power analysis indicated that, given the large sample size and large $R^2$, the model had a power of 1, or essentially a 100% chance of detecting an effect if one existed.
Effect Size Calculator
Here is a quick interactive calculator I made. It’s very basic.
Accuracy in Parameter Estimation (AIPE)
AIPE is another method for estimating required sample size. Rather than targeting power, it targets precision: you specify how narrow you want the confidence interval around the effect size of interest to be, and AIPE returns the sample size needed to achieve that width. Here is an example based on the NYSP multiple regression, using the MBESS package:
library(MBESS)
# sample size needed to estimate R2 = .442 with a 95% CI of width .10, given 2 fixed predictors
ss.aipe.R2(Population.R2 = .442, conf.level = .95, width = .10, p = 2, Random.Predictors = FALSE)
## [1] "The approximate sample size is given below; you should consider using the additional"
## [1] "argument 'verify.ss=TRUE' to ensure the exact sample size value is obtained."
## $Required.Sample.Size
## [1] 661
To estimate an $R^2$ of .442 with a confidence interval that narrow, you would need the sample size indicated above (661). The actual sample behind the $R^2$ of .442 was 521, which produced a 95% CI of [.38, .49], a width of .11, so the achieved precision came very close to the .10 target.
References
Kabacoff, R. I. (2017). Power analysis. Quick-R. https://www.statmethods.net/stats/power.html
Murnane, R. J., & Willett, J. B. (2010). Methods matter: Improving causal inference in educational and social science research. Oxford University Press.