2. Statistical Power and Sample Size (Week 3)
# load packages
library(tidyverse)
library(haven)
library(apaTables)
source("data/apafunction.R") #for APA-style tables
#load data
nysp_vouchers <- read_dta("data/methods_matter/ch4_nyvoucher.dta")
Effect Size for T-Test
T-Test Results
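The table below summarizes a two-sample t-test of post-test achievement by voucher status. A minimal sketch of how results like these can be produced, assuming a pooled-variance t-test tidied with broom (the exact call is not shown in the original):

library(broom)
# two-sample t-test of post-test achievement by voucher status (assumes equal variances)
t.test(post_ach ~ voucher, data = nysp_vouchers, var.equal = TRUE) %>%
  tidy() %>% # one-row summary of the htest object
  select(estimate1, estimate2, statistic, p.value, conf.low, conf.high) %>%
  apa("T-Test Results") # APA-style table via the sourced helper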
T-Test Results

| no voucher | voucher | statistic | p.value | conf.low | conf.high | std. err |
|---|---|---|---|---|---|---|
| 21.13 | 26.029 | -2.911 | 0.004 | -8.205 | -1.593 | 1.682719 |
Cohen’s d effect size for t-test
Using the effectsize package:
library(effectsize)
# use cohens_d() from effectsize
cohens_d(nysp_vouchers$post_ach, as.factor(nysp_vouchers$voucher)) %>%
# format into an APA table
mutate(across(where(is.numeric), ~ round(.x, 3))) %>% # round numeric columns
mutate("95% CI" = paste0("[", CI_low, ", ", CI_high, "]")) %>% # combine
select(Cohens_d, `95% CI`) %>% # drop columns
rename("Cohen's d"= Cohens_d) %>% # rename
apa("Effect size for t-test")
Effect size for t-test

| Cohen's d | 95% CI |
|---|---|
| -0.257 | [-0.43, -0.083] |
Interpretation
Students who had the opportunity to receive a voucher scored about 0.26 standard deviations higher than students who did not receive that opportunity (the negative sign of d simply reflects the order of the groups in the comparison).
Calculating Power
Power calculations are based on the pwr package and follow the examples at https://www.statmethods.net/stats/power.html (Kabacoff, 2017).
New York Scholarship Program (NYSP) Power Analysis
T-Tests
The following calculates power for the NYSP t-test example (Strategy 1, Table 4.1, p. 49).
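A minimal sketch of the calculation with pwr::pwr.t2n.test, plugging in the group sizes and effect size shown in the table below (the original call is not reproduced here, so treat this as illustrative):

library(pwr)
# post-hoc power for two independent groups of unequal size
pwr.t2n.test(n1 = 230, n2 = 291, d = 0.257, sig.level = 0.05)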
Power of the NYSP Voucher T-Test

| n1 | n2 | d | sig | power |
|---|---|---|---|---|
| 230 | 291 | 0.257 | 0.05 | 0.8283176 |
Interpretation
This is a post-hoc power analysis. The study above had a power of about .83. That is, it had an 83% chance to detect an effect if there was one, and a 17% chance of making a Type II error (failing to reject the null hypothesis when there is a true effect).
Simple Linear Regression
Recall Strategy 2 (Table 4.1, p. 49):
Simple Linear Regression

| Predictor | b | b_95%_CI | beta | beta_95%_CI | sr2 | sr2_95%_CI | r | Fit |
|---|---|---|---|---|---|---|---|---|
| (Intercept) | 21.13** | [18.66, 23.60] | | | | | | |
| voucher | 4.90** | [1.59, 8.20] | 0.13 | [0.04, 0.21] | .02 | [.00, .04] | .13** | |
| | | | | | | | | R2 = .016** |
| | | | | | | | | 95% CI [.00, .04] |
ANOVA Table for Simple Linear Regression

| Predictor | SS | df | MS | F | p | partial_eta2 | CI_90_partial_eta2 |
|---|---|---|---|---|---|---|---|
| (Intercept) | 102693.91 | 1 | 102693.91 | 282.32 | .000 | | |
| voucher | 3082.89 | 1 | 3082.89 | 8.48 | .004 | .02 | [.00, .04] |
| Error | 188787.59 | 519 | 363.75 | | | | |
Power for NYSP Simple Linear Regression
Use pwr.f2.test(u = , v = , f2 = , sig.level = , power = ), where:
- u = the numerator degrees of freedom, i.e. the number of predictors (counting each dummy variable separately)
- v = the denominator degrees of freedom, i.e. the residual df
- f2 = Cohen’s \(f^2\), which is equal to \(\frac{R^2}{1-R^2}\)
Based on the regression results, the NYSP simple linear regression model had the following power:
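A sketch of the underlying call, assuming Cohen's \(f^2\) is computed from the rounded \(R^2\) of .016 reported above (the f2 column in the table below holds this converted value, not \(R^2\) itself):

library(pwr)
# Cohen's f2 from the reported R-squared of the simple regression (R2 = .016)
f2 <- .016 / (1 - .016)
# u = 1 predictor, v = 519 residual degrees of freedom
pwr.f2.test(u = 1, v = 519, f2 = f2, sig.level = 0.05)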
Power of the NYSP Voucher Simple Linear Regression Test

| Predictors | Residual df | f2 | sig | power |
|---|---|---|---|---|
| 1 | 519 | 0.01626016 | 0.05 | 0.8277319 |
Interpretation
Because no covariates were included, the power here is essentially the same as for the t-test above.
Power for NYSP Multiple Regression
(Strategy 3, Table 4.1, p. 49)
# fit the multiple regression, adding baseline achievement as a covariate
mm4_model2 <- lm(post_ach ~ voucher + pre_ach, data = nysp_vouchers)
# extract the APA-style regression table and format it
apa.reg.table(mm4_model2)[[3]] %>% apa()
| Predictor | b | b_95%_CI | beta | beta_95%_CI | sr2 | sr2_95%_CI | r | Fit |
|---|---|---|---|---|---|---|---|---|
| (Intercept) | 7.72** | [5.43, 10.00] | | | | | | |
| voucher | 4.10** | [1.61, 6.59] | 0.11 | [0.04, 0.17] | .01 | [-.00, .02] | .13** | |
| pre_ach | 0.69** | [0.62, 0.76] | 0.65 | [0.59, 0.72] | .43 | [.36, .49] | .66** | |
| | | | | | | | | R2 = .442** |
| | | | | | | | | 95% CI [.38, .49] |
| Predictor | SS | df | MS | F | p | partial_eta2 | CI_90_partial_eta2 |
|---|---|---|---|---|---|---|---|
| (Intercept) | 9100.16 | 1 | 9100.16 | 44.05 | .000 | | |
| voucher | 2154.80 | 1 | 2154.80 | 10.43 | .001 | .02 | [.00, .04] |
| pre_ach | 81780.28 | 1 | 81780.28 | 395.88 | .000 | .43 | [.38, .48] |
| Error | 107007.31 | 518 | 206.58 | | | | |
Based on the regression results, the NYSP multiple regression model had the following power:
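A sketch of the underlying call, again assuming Cohen's \(f^2\) is computed from the reported \(R^2\) of .442 (the f2 column in the table below holds this converted value):

library(pwr)
# Cohen's f2 from the multiple regression R-squared (R2 = .442)
f2 <- .442 / (1 - .442)
# u = 2 predictors (voucher, pre_ach), v = 518 residual degrees of freedom
pwr.f2.test(u = 2, v = 518, f2 = f2, sig.level = 0.05)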
Power of the NYSP Voucher Multiple Regression Test

| Predictors | Residual df | f2 | sig | power |
|---|---|---|---|---|
| 2 | 518 | 0.7921147 | 0.05 | 1 |
Interpretation
The post-hoc power analysis indicated that, with this large sample size and large \(R^2\), the model had a power of 1, that is, essentially a 100% chance of detecting an effect if there was one.
Effect Size Calculator
Here is a quick interactive calculator I made. It’s very basic.
Accuracy in Parameter Estimation (AIPE)
AIPE is another method that can be used to estimate a required sample size. Rather than targeting a desired power, it targets precision: you specify how narrow you want the confidence interval around the effect size of interest to be. Here is an example based on the NYSP multiple regression using the MBESS package:
library(MBESS)
# sample size needed for a 95% CI of width .10 around a population R2 of .442, with 2 fixed predictors
ss.aipe.R2(Population.R2 = .442, conf.level = .95, width = .10, p = 2, Random.Predictors = FALSE)
## [1] "The approximate sample size is given below; you should consider using the additional"
## [1] "argument 'verify.ss=TRUE' to ensure the exact sample size value is obtained."
## $Required.Sample.Size
## [1] 661
To estimate an \(R^2\) of .442 with a 95% confidence interval of width .10, you would need the sample size indicated above (661). The actual sample that produced the \(R^2\) of .442 was 521, which yielded a slightly wider interval ([.38, .49]); the precision achieved was not exactly the target, but it was very close.
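As the printed message suggests, the exact sample size can be confirmed by re-running the call with verify.ss = TRUE (a sketch using the same inputs):

# re-run with verification to obtain the exact sample size, as the message above suggests
ss.aipe.R2(Population.R2 = .442, conf.level = .95, width = .10, p = 2,
           Random.Predictors = FALSE, verify.ss = TRUE)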
References
Kabacoff, R. I. (2017). Power analysis. Quick-R. https://www.statmethods.net/stats/power.html
Murnane, R. J., & Willett, J. B. (2010). Methods matter: Improving causal inference in educational and social science research. Oxford University Press.