9.2 P-Curve

In the last chapter, we showed you how to apply Egger’s test of the intercept and Duval & Tweedie’s trim-and-fill procedure, and how to inspect funnel plots in R.

As we have mentioned before, recent research has shown that the assumptions of the small-study effect methods may be inaccurate in many cases. The Duval & Tweedie trim-and-fill procedure in particular has been shown to be prone to providing inaccurate effect size estimates (Simonsohn, Nelson, and Simmons 2014a).

P-curve Analysis has been proposed as an alternative way to assess publication bias and estimate the true effect behind our collected data.

P-Curve assumes that publication bias is not primarily generated because researchers do not publish non-significant results, but because they “play” around with their data (e.g., selectively removing outliers, choosing different outcomes, controlling for different variables) until a non-significant finding becomes significant. This (bad) practice is called p-hacking, and has been shown to be extremely frequent among researchers (Head et al. 2015).

The idea behind P-Curve

9.2.1 Preparing RStudio and our data

To use P-curve, we need our data in the same format as the one we used for the metacont function in Chapter 3.1.1. We need to specify Me, Se, Ne, Mc, Sc, and Nc. My metacont data already has this format.

Author    Ne  Me      Se     Nc  Mc      Sc
Cavanagh  50  4.5     2.7    50  5.6     2.6
Day       64  18.3    6.4    65  20.2    7.6
Frazier   90  12.5    3.2    95  15.5    4.4
Gaffney   30  2.34    0.87   30  3.13    1.234
Greer     77  15.212  5.35   69  20.13   7.43
Harrer    60  3.153   1.256  60  3.4213  1.878
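If you do not have such a dataset at hand, the table above can be typed into R directly. The following is a minimal sketch that rebuilds the example data as a data frame; the object name metacont simply follows the name used in this chapter.

```r
# Rebuild the example data shown above as a data frame.
# Columns: mean (M), standard deviation (S) and sample size (N) of the
# experimental (e) and control (c) group.
metacont <- data.frame(
  Author = c("Cavanagh", "Day", "Frazier", "Gaffney", "Greer", "Harrer"),
  Ne = c(50, 64, 90, 30, 77, 60),
  Me = c(4.5, 18.3, 12.5, 2.34, 15.212, 3.153),
  Se = c(2.7, 6.4, 3.2, 0.87, 5.35, 1.256),
  Nc = c(50, 65, 95, 30, 69, 60),
  Mc = c(5.6, 20.2, 15.5, 3.13, 20.13, 3.4213),
  Sc = c(2.6, 7.6, 4.4, 1.234, 7.43, 1.878),
  stringsAsFactors = FALSE
)

# Check that all six studies and seven columns are there
str(metacont)
```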

To use p-curve, we also have to install and load four packages: the esc package (Lüdecke 2018), the compute.es package (Del Re 2013), the stringr package, and the poibin package (Hong 2011).

library(compute.es)
library(esc)
library(stringr)
library(poibin)

To prepare our data which we stored in the format described above, we need to use the pcurve_dataprep function, which we prepared for you.

Again, R doesn’t know this function yet, so we have to let R learn it by copying and pasting the code underneath in its entirety into the console on the bottom left pane of RStudio, and then hit Enter ⏎.

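pcurve_dataprep<-function(data){
# Read means, standard deviations and sample sizes as numbers
Me<-as.numeric(data$Me)
Se<-as.numeric(data$Se)
Ne<-as.numeric(data$Ne)
Mc<-as.numeric(data$Mc)
Sc<-as.numeric(data$Sc)
Nc<-as.numeric(data$Nc)

# Calculate the standardized mean difference (d) of each study
esc<-esc_mean_sd(grp1m=Me, 
                 grp1sd=Se, 
                 grp1n=Ne, 
                 grp2m=Mc, 
                 grp2sd=Sc, 
                 grp2n=Nc, 
                 es.type = "d")

# Convert d into a correlation r; only the absolute value of r is kept
output<-des(d=esc$es,n.1=Ne,n.2=Nc, verbose = FALSE)
output$r<-abs(output$r)
# Build one "r(df)=value" string per study (df = N - 2) and remove blanks
tot<-data.frame(paste("r(",output$N.total-2,")=",output$r))
colnames(tot)<-c("output")
tot$output<-gsub(" ", "", tot$output, fixed = TRUE)
totoutput<-as.character(tot$output)
# Print the strings and save them as input.txt in the working directory
print(tot, row.names = FALSE)
write(totoutput,ncolumns=1, file="input.txt")
}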

To use the pcurve_dataprep function, we simply have to specify our dataset. In my case, this is metacont.

pcurve_dataprep(data=metacont)
##       output
##    r(98)=0.2
##  r(127)=0.13
##  r(183)=0.36
##   r(58)=0.35
##  r(144)=0.36
##  r(118)=0.08

The function gives us our effect sizes in the format we need to conduct the p-curve analysis: each study is expressed as a correlation r, with its degrees of freedom in parentheses.
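To see where these strings come from, the conversion can be retraced by hand for a single study. The sketch below uses only base R and the values of the first study (Cavanagh); it assumes that d is based on the pooled standard deviation and that d is converted to r with the usual formula for two groups, which reproduces the value printed above. It is meant as an illustration, not as a replacement for pcurve_dataprep.

```r
# Values of the first study (Cavanagh) from the table above
Ne <- 50; Me <- 4.5; Se <- 2.7
Nc <- 50; Mc <- 5.6; Sc <- 2.6

# Standardized mean difference based on the pooled standard deviation
s_pooled <- sqrt(((Ne - 1) * Se^2 + (Nc - 1) * Sc^2) / (Ne + Nc - 2))
d <- (Me - Mc) / s_pooled

# Convert d to r; the correction term a accounts for unequal group sizes
a <- (Ne + Nc)^2 / (Ne * Nc)
r <- abs(d) / sqrt(d^2 + a)

# Degrees of freedom are the total sample size minus 2
result <- paste0("r(", Ne + Nc - 2, ")=", round(r, 2))
print(result)
# → "r(98)=0.2"
```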

The function also automatically stores a .txt file (input.txt) in your working directory folder on your computer.

If you forgot where your working directory is, use the getwd function.

getwd()
## [1] "/Users/Mathias2/Documents/R/WORKING_DIRECTORY/Windows/R/WORKING_DIRECTORY/Meta-Analyse Buch/bookdown-demo-master"

This tells you the path to your working directory, where the input.txt file should be stored.
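A quick way to confirm that the file really has been written is to look for it from within R. This is a small sketch using base R functions only; it assumes the file name input.txt used above.

```r
# Build the full path to input.txt inside the current working directory
input_path <- file.path(getwd(), "input.txt")

# Show the stored lines if the file exists, otherwise print a hint
if (file.exists(input_path)) {
  print(readLines(input_path))
} else {
  message("input.txt was not found in ", getwd())
}
```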

9.2.2 Creating P-Curves

To create p-curves, we have to use the pcurve_app function. The code for this function is very long, so it is not displayed here. The code for the function has been made publicly available online by Uri Simonsohn, Leif Nelson, and Joseph Simmons and can be found here.

Again, R doesn’t know this function yet, so we have to let R learn it by copying and pasting the linked code in its entirety into the console on the bottom left pane of RStudio, and then hit Enter ⏎.

To use the pcurve_app function, we have to specify that the function should use the input.txt in which the data we created using the pcurve_dataprep function is stored. We also have to provide the function with the path to our folder/working directory where the file is stored.

Please note that while the standard way that Windows provides you with paths is using backslashes, we need to use normal slashes (/) for our designated path.
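If you copied the path from the Windows Explorer, the backslashes can also be swapped in R instead of by hand. The following sketch uses base R only; the path in win_path is a made-up example, so replace it with your own.

```r
# Hypothetical Windows path; inside an R string, "\\" stands for one backslash
win_path <- "C:\\Users\\Admin\\Documents\\R"

# Replace every backslash with a normal slash
r_path <- gsub("\\", "/", win_path, fixed = TRUE)
print(r_path)
# → "C:/Users/Admin/Documents/R"

# Alternatively, ask R for the working directory with normal slashes
wd_path <- normalizePath(getwd(), winslash = "/")
```

The resulting string can then be used as the second argument in the call below.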

pcurve_app("input.txt","C://Users/Admin/Documents/R/WORKING_DIRECTORY/Meta-Analyse Buch/bookdown-demo-master")

The function automatically creates and stores several files and figures into our folder/working directory. Here are the most important ones:



Figure 1: “input.png”

This figure shows you the p-curve for your results (in blue). On the bottom, you can also find the number of effect sizes with \(p<0.05\) which were included in the analysis. There are two tests displayed in the plot.
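To get a feeling for which studies enter the analysis, the p-value implied by each r(df) string can be recomputed in base R. The sketch below uses the rounded values printed by pcurve_dataprep earlier in this section, so the p-values are approximate and need not match the internal calculations of pcurve_app exactly.

```r
# Rounded correlations and degrees of freedom from the pcurve_dataprep output
r  <- c(0.20, 0.13, 0.36, 0.35, 0.36, 0.08)
df <- c(98, 127, 183, 58, 144, 118)

# A correlation can be tested with a t-statistic on df degrees of freedom
t_value <- r * sqrt(df / (1 - r^2))
p_value <- 2 * pt(-abs(t_value), df)

# Only results with p < 0.05 are used in a p-curve
significant <- p_value < 0.05
print(data.frame(r, df, p = round(p_value, 4), significant))
```

With these rounded inputs, four of the six studies fall below \(p<0.05\).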

The test for right-skewness

If there is evidential value behind our data, the p-curve should be right-skewed. Through eyeballing, we see that this is pretty much the case here, and the tests for the half and full p-curve are also both significant (\(p_{Full}<0.001, p_{Half}<0.001\)). This means that the p-curve is heavily right-skewed, indicating that evidential value is present in our data.

The test for flatness

If there is evidential value behind our data, the p-curve should also not be flat. Through eyeballing, we see that this is pretty much the case here. The tests for flatness are both not significant (\(p_{Full}=0.9887, p_{Binomial}=0.7459\)).
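Why right-skewed and flat curves carry this meaning can be illustrated with a small simulation. The sketch below is not part of the pcurve_app function; it simply simulates many two-group studies, once without a true effect and once with an assumed true effect of \(d=0.5\), and then looks at how the significant p-values are distributed. The group size, number of repetitions, and effect size are arbitrary choices for illustration.

```r
set.seed(123)

# Simulate p-values of two-group t-tests with n participants per group
simulate_p <- function(d, n = 50, reps = 2000) {
  replicate(reps, t.test(rnorm(n, mean = d), rnorm(n))$p.value)
}

# Share of the significant p-values falling into each 0.01-wide bin
pcurve_bins <- function(p) {
  sig  <- p[p < 0.05]
  bins <- cut(sig, breaks = seq(0, 0.05, by = 0.01))
  round(prop.table(table(bins)), 2)
}

null_curve   <- pcurve_bins(simulate_p(d = 0))    # no true effect
effect_curve <- pcurve_bins(simulate_p(d = 0.5))  # true effect present

print(null_curve)
print(effect_curve)
```

Without a true effect, the significant p-values are spread roughly evenly across the bins (a flat curve). With a true effect, most of them pile up in the lowest bin, which is the right-skewed shape described above.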



Figure 2: “input_fit.png”

This plot estimates the power behind our data, that is, whether we have sufficient studies with sufficient participants to find a true effect if it exists. A conventional threshold for adequate power is 80%, but P-curve can assess evidential value even if studies are underpowered. In our case, the power estimate is 90%.
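To make the 80% threshold more tangible, the power of a single two-group study can be calculated with the power.t.test function from base R. This is only an illustration of what power means; it is not how p-curve derives its power estimate. The effect size of \(d=0.5\) and the group size of 50 are assumptions chosen for the example.

```r
# Power of a two-sided, two-sample t-test with 50 participants per group,
# assuming a true effect of d = 0.5 (delta divided by sd)
power_50 <- power.t.test(n = 50, delta = 0.5, sd = 1,
                         sig.level = 0.05)$power
print(round(power_50, 2))

# Participants per group needed to reach 80% power for the same effect
n_needed <- power.t.test(delta = 0.5, sd = 1, sig.level = 0.05,
                         power = 0.80)$n
print(ceiling(n_needed))
# → 64
```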



Figure 3: “input_cumulative.png”

This plot provides sensitivity analyses in which the highest and lowest p-values are dropped.

9.2.3 Estimating the “true” effect of our data

To estimate the true effect of our data with p-curve (much like the Duval & Tweedie trim-and-fill procedure), we can use the plotloss function described and made openly available by Simonsohn et al. (Simonsohn, Nelson, and Simmons 2014a).

The code for this function is quite long, so it is not displayed here. It can be accessed using this link.

Again, R doesn’t know this function yet, so we have to let R learn it by copying and pasting the linked code in its entirety into the console on the bottom left pane of RStudio, and then hit Enter ⏎.

For the plotloss function, we only have to provide the data to be used to find the “true” effect size underlying the data, and a range of effect sizes in which the function should search for the true effect, delineated by dmin and dmax.

I will use my metacont data again here, and will search for the true effect between \(d=0\) and \(d=1.0\).

plotloss(data=metacont,dmin=0,dmax=1)

## NULL

The function provides an effect estimate of \(d=0.64\).
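The logic behind this estimate can be sketched in a few lines. For every candidate effect size, one asks how surprising each significant result would be if that effect were true; for the correct effect size, these conditional probabilities (so-called pp-values) should be uniformly distributed. The code below is a simplified illustration along the lines of the approach described by Simonsohn et al. (2014a), applied to simulated data. It is not the plotloss code itself; the function name pcurve_loss_sketch and all simulation settings are assumptions made for this example.

```r
set.seed(42)

# Hypothetical setup: 5000 two-group studies with 20 participants per group
# and a true effect of d = 0.5
n_per_group <- 20
true_d <- 0.5
df <- 2 * n_per_group - 2
t_all <- rt(5000, df = df, ncp = true_d * sqrt(n_per_group / 2))

# Keep only significant results in the expected direction
t_sig <- t_all[t_all > qt(0.975, df)]

# Distance between the pp-values under a candidate d and a uniform distribution
pcurve_loss_sketch <- function(d_est, t_sig, df) {
  ncp      <- sqrt((df + 2) / 4) * d_est         # noncentrality implied by d_est
  t_crit   <- qt(0.975, df)
  power    <- 1 - pt(t_crit, df, ncp = ncp)      # chance of a significant result
  p_larger <- pt(t_sig, df, ncp = ncp)           # P(T <= observed t) under d_est
  pp       <- (p_larger - (1 - power)) / power   # conditional on significance
  unname(ks.test(pp, "punif")$statistic)
}

# Search for the d with the smallest distance between d = 0 and d = 2
d_hat <- optimize(pcurve_loss_sketch, interval = c(0, 2),
                  t_sig = t_sig, df = df)$minimum
print(round(d_hat, 2))
```

Although only the significant studies are used, the estimate should land close to the simulated true effect of 0.5. The search interval plays the same role as dmin and dmax above.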

It should be noted that this chapter is only meant as an introduction to P-Curve and is by no means comprehensive.

Simonsohn et al. (Simonsohn, Simmons, and Nelson 2015) also stress that P-Curve should only be used for outcome data which was actually of interest for the authors of the specific article, because those are the ones most likely to get p-hacked. They also ask meta-researchers to provide a detailed table in which the reported results of each outcome used in the P-Curve are documented (a guide can be found here).

It has also been shown that P-Curve’s effect estimates are not robust when the heterogeneity of a meta-analysis is high. Van Aert et al. (Aert, Wicherts, and Assen 2016) therefore propose not to determine the “true” effect using P-Curve when heterogeneity is high (defined as \(I^2\) > 50%).
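Before relying on the effect estimate, it can therefore be worthwhile to check the heterogeneity first. The sketch below calculates Cochran's Q and the resulting heterogeneity statistic by hand in base R for the example data, using a simple fixed-effect weighting of Cohen's d. This is an assumption-laden shortcut; the value reported by your meta-analysis function may differ somewhat, for example because it uses Hedges' g.

```r
# Example data from the beginning of this chapter
Ne <- c(50, 64, 90, 30, 77, 60)
Me <- c(4.5, 18.3, 12.5, 2.34, 15.212, 3.153)
Se <- c(2.7, 6.4, 3.2, 0.87, 5.35, 1.256)
Nc <- c(50, 65, 95, 30, 69, 60)
Mc <- c(5.6, 20.2, 15.5, 3.13, 20.13, 3.4213)
Sc <- c(2.6, 7.6, 4.4, 1.234, 7.43, 1.878)

# Cohen's d and its approximate sampling variance for each study
s_pooled <- sqrt(((Ne - 1) * Se^2 + (Nc - 1) * Sc^2) / (Ne + Nc - 2))
d <- (Me - Mc) / s_pooled
v <- (Ne + Nc) / (Ne * Nc) + d^2 / (2 * (Ne + Nc))

# Inverse-variance weights, pooled effect and Cochran's Q
w <- 1 / v
d_pooled <- sum(w * d) / sum(w)
Q <- sum(w * (d - d_pooled)^2)

# Share of variability due to heterogeneity, bounded at 0
I2 <- max(0, (Q - (length(d) - 1)) / Q)
print(round(100 * I2, 1))

# Flag whether the 50% threshold mentioned above is exceeded
high_heterogeneity <- I2 > 0.5
```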

A possible solution for this problem might be to reduce the overall heterogeneity using outlier removal, or to run P-Curve on more homogeneous subgroups.

References

Simonsohn, Uri, Leif D Nelson, and Joseph P Simmons. 2014a. “P-Curve and Effect Size: Correcting for Publication Bias Using Only Significant Results.” Perspectives on Psychological Science 9 (6): 666–81.

Head, Megan L, Luke Holman, Rob Lanfear, Andrew T Kahn, and Michael D Jennions. 2015. “The Extent and Consequences of P-Hacking in Science.” PLoS Biology 13 (3). Public Library of Science: e1002106.

Lüdecke, Daniel. 2018. Effect Size Computation for Meta Analysis. https://CRAN.R-project.org/package=esc.

Del Re, AC. 2013. “compute.es: Compute Effect Sizes.” R package version 0.2-2.

Hong, Y. 2011. “Poibin: The Poisson Binomial Distribution.” R package version.

Simonsohn, Uri, Joseph P Simmons, and Leif D Nelson. 2015. “Better P-Curves: Making P-Curve Analysis More Robust to Errors, Fraud, and Ambitious P-Hacking, a Reply to Ulrich and Miller (2015).” American Psychological Association.

Aert, Robbie CM van, Jelte M Wicherts, and Marcel ALM van Assen. 2016. “Conducting Meta-Analyses Based on P Values: Reservations and Recommendations for Applying P-Uniform and P-Curve.” Perspectives on Psychological Science 11 (5): 713–29.
