2.1 Good practices in data analysis (X)

2.1.1 Why reproducibility?

Terminology: “Replication, Replication” (King 1995)
- But replication vs. reproduction (terminology!)
Errors
- A crisis… (e.g. Open Science Collaboration 2015) that should be avoided (e.g. Psychoticism)
- Manual steps (e.g. manual copy/paste) introduce errors
- Reproducible documents allow for automation (counterargument?)
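To illustrate the copy/paste point: a minimal sketch in which the reported numbers are computed and inserted into the sentence by code, so the text cannot drift from the data. The course tools are R and R Markdown; Python is used here only as a language-neutral illustration. The data values and the function name `summary_sentence` are hypothetical.

```python
from statistics import mean, stdev

# Hypothetical example data (made up for illustration).
ages = [23, 35, 41, 29, 52, 38]


def summary_sentence(values):
    """Build the reporting sentence directly from the data."""
    return (
        f"The sample contains {len(values)} respondents "
        f"(mean age = {mean(values):.1f}, SD = {stdev(values):.1f})."
    )


# If the data change, rerunning the script updates every reported number.
print(summary_sentence(ages))
```

In R Markdown the same idea is usually expressed with inline code, so that a number in the text is evaluated when the document is rendered rather than typed by hand.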
Access
- Taxpayers (= researchers) pay for research → should have access
- Better: all humans → human progress! (Sci-Hub controversy)
- Implies relying on open-source software
- Access in 100 years… will Stata still exist?
Memory
- You will forget what you did… think of others, too
- A reproducible document helps you retrace your steps
- Ideally it covers all stages of the workflow
Efficiency
- Automation → paper revisions are much faster

2.1.2 Reproducibility: My current approach

Every researcher has their own optimized setup…
Mine is summarized in this template: Writing a reproducible paper with R Markdown and Pagedown
- Please use this for your term paper and follow the corresponding recommendations
- See also (P. Bauer 2018)
Tools: R, R Markdown and Pagedown
Final product (e.g. scientific article, statistical report) is produced by a single .Rmd file
- Potentially reproducible in 100 years! (open-source)
- Ideally encompasses all stages of the workflow (not always possible)
- Cache estimation results (some involve randomness)!
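A minimal sketch of the caching point, under stated assumptions: an estimate that involves randomness is made repeatable with a fixed seed, and its result is stored on disk so that later runs reuse it instead of recomputing. Python is used only for illustration; in R the corresponding tools are `set.seed()` and the knitr chunk option `cache = TRUE`. The helper names `bootstrap_mean` and `cached`, the seed value, and the data are hypothetical.

```python
import json
import random
import tempfile
from pathlib import Path


def bootstrap_mean(values, n_draws=1000, seed=42):
    """Bootstrap estimate of the mean; the fixed seed makes it repeatable."""
    rng = random.Random(seed)  # local generator, leaves global state untouched
    means = []
    for _ in range(n_draws):
        resample = [rng.choice(values) for _ in values]
        means.append(sum(resample) / len(resample))
    return sum(means) / len(means)


def cached(path, compute):
    """Return the stored result if the cache file exists, else compute and store it."""
    path = Path(path)
    if path.exists():
        return json.loads(path.read_text())
    result = compute()
    path.write_text(json.dumps(result))
    return result


data = [2.1, 3.4, 1.9, 4.2, 2.8]  # hypothetical data

with tempfile.TemporaryDirectory() as tmp:
    cache_file = Path(tmp) / "bootstrap_mean.json"
    first = cached(cache_file, lambda: bootstrap_mean(data))   # computed, then stored
    second = cached(cache_file, lambda: bootstrap_mean(data))  # read back from the cache
    print(first, second)
```

Note that this sketch never invalidates the cache: if the data or the estimation code change, the cache file has to be deleted by hand. A real setup should handle that case explicitly.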
The criteria of “good” evidence change³
Initiatives such as…
- ROpenSci
- Center for Open Science + Open Science Framework
- Pre-registration (Pros & Cons)
- Harvard dataverse

References

Acharya, Avidit, Matthew Blackwell, and Maya Sen. 2016. “Explaining Causal Findings Without Bias: Detecting and Assessing Direct Effects.” Am. Polit. Sci. Rev. 110 (3): 512–29.

Bauer, Paul. 2018. “Writing a Reproducible Paper in R Markdown,” May.

Gill, Jeff. 1999. “The Insignificance of Null Hypothesis Significance Testing.” Polit. Res. Q. 52 (3): 647–74.

King, Gary. 1995. “Replication, Replication.” PS, Political Science & Politics 28 (3): 444–52.

———. 2010. “A Hard Unsolved Problem? Post-Treatment Bias in Big Social Science Questions.” In “Hard Problems in Social Science” Symposium, Harvard University. scholar.harvard.edu.

Open Science Collaboration. 2015. “Estimating the Reproducibility of Psychological Science.” Science 349 (6251): aac4716.

³ See for instance the discussion surrounding the use of p-values/statistical significance (e.g. Gill 1999) and the current discussion about post-treatment bias (e.g. King 2010; Acharya, Blackwell, and Sen 2016).