43.1 The Replication Standard
Replicability in research ensures:
- Credibility – Reinforces trust in empirical studies by allowing independent verification.
- Continuity – Enables future research to build upon prior findings, promoting cumulative knowledge.
- Visibility – Increases readership and citations, benefiting both individual researchers and the broader academic community.
For research to be replicable, adhering to the replication standard is essential. This standard requires researchers to provide all necessary information—data, code, and methodological details—so that third parties can independently reproduce the study’s findings. While quantitative research often allows for clearer replication, qualitative studies pose challenges due to their depth, contextual nature, and reliance on subjective interpretation.
43.1.1 Solutions for Empirical Replication
Several approaches help address replication challenges in empirical research:
- Role of Individual Authors
- Researchers must commit to transparency and provide well-documented data and code.
- Repositories such as the Inter-University Consortium for Political and Social Research (ICPSR) offer secure, long-term storage for replication datasets.
- Creation of a Replication Data Set
- A dedicated replication dataset should include original data, relevant supplementary data, and the exact procedures used for analysis.
- Metadata and documentation should be provided to ensure clarity.
- Professional Data Archives
- Organizations like ICPSR, Dataverse, and Zenodo facilitate open access to datasets while maintaining proper governance over sensitive information.
- These archives help address data accessibility and preservation issues.
- Educational Implications
- Teaching replication strengthens students’ understanding of empirical methods and reproducibility.
- Many graduate programs now incorporate replication studies into coursework, emphasizing their importance in methodological rigor.
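A replication dataset of the kind described above is typically packaged as a self-contained directory of data, code, and documentation. One common safeguard is a checksum manifest so third parties can confirm the package they downloaded matches what the authors deposited. The sketch below is illustrative, not a standard tool: the layout (`data/`, `code/`, `README.md`, `MANIFEST.json`) and function names are assumptions.

```python
import hashlib
import json
import tempfile
from pathlib import Path


def build_manifest(package_dir: Path) -> dict:
    """Record a SHA-256 checksum for every file in a replication package."""
    manifest = {}
    for path in sorted(package_dir.rglob("*")):
        if path.is_file() and path.name != "MANIFEST.json":
            rel = path.relative_to(package_dir).as_posix()
            manifest[rel] = hashlib.sha256(path.read_bytes()).hexdigest()
    return manifest


def verify_manifest(package_dir: Path) -> bool:
    """Check that the package still matches its recorded checksums."""
    recorded = json.loads((package_dir / "MANIFEST.json").read_text())
    return recorded == build_manifest(package_dir)


if __name__ == "__main__":
    # Hypothetical package layout: original data, analysis code, documentation.
    root = Path(tempfile.mkdtemp())
    (root / "data").mkdir()
    (root / "code").mkdir()
    (root / "data" / "survey.csv").write_text("id,outcome\n1,0.42\n")
    (root / "code" / "analysis.py").write_text("# exact analysis procedure\n")
    (root / "README.md").write_text("Replication package: data, code, docs\n")
    (root / "MANIFEST.json").write_text(json.dumps(build_manifest(root), indent=2))
    print(verify_manifest(root))  # True while the package is unmodified
```

Recomputing the manifest on arrival catches silent corruption or accidental edits before a replicator wastes time debugging results that never could have matched.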
43.1.2 Free Data Repositories
- Zenodo: Hosted by CERN, it provides a general-purpose place for researchers to deposit datasets from any discipline.
- figshare: Allows researchers to upload, share, and cite their datasets.
- Dryad: Primarily for datasets associated with published articles in the biological and medical sciences.
- OpenICPSR: A public-facing branch of the Inter-University Consortium for Political and Social Research (ICPSR) where researchers can deposit data at no cost.
- Harvard Dataverse: A repository hosted by Harvard University and built on the open-source Dataverse software, dedicated to archiving, sharing, and citing research data.
- Mendeley Data: A multidisciplinary, free-to-use open-access data repository where researchers can upload and share their datasets.
- Open Science Framework (OSF): Offers both a platform for conducting research and a place to deposit datasets.
- PubMed Central: An open-access archive for the life sciences, hosting journal articles, preprints, and associated data.
- Registry of Research Data Repositories (re3data): Not a repository itself, but a global registry of research data repositories across academic disciplines.
- SocArXiv: An open archive for the social sciences.
- EarthArXiv: A preprint archive for the earth sciences.
- Protein Data Bank (PDB): For 3D structures of large biological molecules.
- Gene Expression Omnibus (GEO): A public functional genomics data repository.
- The Language Archive (TLA): Dedicated to data on languages worldwide, especially endangered languages.
- B2SHARE: A platform for storing and sharing research datasets across disciplines, especially from European research projects.
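Whichever repository is chosen, a deposit is only as findable as its metadata. The sketch below checks a minimal descriptive record before upload; the required field names are illustrative (loosely modeled on common deposit forms such as Zenodo's), and each repository defines its own actual schema.

```python
# Minimal pre-deposit metadata check. Field names are assumptions for
# illustration; consult the target repository's schema for the real list.
REQUIRED = ("title", "creators", "description", "license")


def validate_metadata(record: dict) -> list[str]:
    """Return a list of problems; an empty list means the record looks complete."""
    problems = [f"missing: {field}" for field in REQUIRED if not record.get(field)]
    for creator in record.get("creators", []):
        if "name" not in creator:
            problems.append("creator entry lacks a name")
    return problems


# Hypothetical record for a replication deposit.
record = {
    "title": "Replication data for: Voter Turnout and Rainfall",
    "creators": [{"name": "Doe, Jane", "affiliation": "Example University"}],
    "description": "Data and code reproducing all tables and figures.",
    "license": "CC-BY-4.0",
}
print(validate_metadata(record))  # [] -> ready to deposit
```

Running a check like this locally is cheaper than having a curator bounce the deposit back weeks later for a missing license or unattributed author.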
43.1.3 Exceptions to Replication
While the replication standard is fundamental to scientific integrity, certain constraints may prevent full adherence. Some common exceptions include:
- Confidentiality
- Some datasets contain highly sensitive information (e.g., medical records, personal financial data) that cannot be disclosed, even in a fragmented form.
- Anonymization techniques and data aggregation can sometimes mitigate these concerns, but privacy regulations (e.g., GDPR, HIPAA) impose strict limitations.
- Proprietary Data
- Datasets owned by corporations, governments, or third-party vendors often have restricted access due to intellectual property concerns.
- In many cases, researchers can share summary statistics, derived variables, or synthetic versions of the data while respecting proprietary restrictions.
- Rights of First Publication
- Some studies involve data embargoes, where researchers must delay public release until initial publications are completed.
- Despite embargoes, the essential data and methodology should eventually be accessible to ensure transparency.
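The anonymization and aggregation techniques mentioned above can be sketched concretely: direct identifiers are replaced with salted one-way hashes, and only group-level statistics are released, with small cells suppressed because they could re-identify individuals. This is a minimal illustration, not a compliance tool; the function names, the 12-character hash truncation, and the cell-size threshold of 5 are all assumptions.

```python
import hashlib
from collections import defaultdict


def pseudonymize(identifier: str, salt: str) -> str:
    """Replace a direct identifier (name, patient ID, ...) with a salted one-way hash."""
    return hashlib.sha256((salt + identifier).encode("utf-8")).hexdigest()[:12]


def aggregate(records: list[dict], group_key: str, value_key: str,
              min_cell: int = 5) -> dict:
    """Release only group-level means, suppressing cells smaller than min_cell."""
    groups = defaultdict(list)
    for row in records:
        groups[row[group_key]].append(row[value_key])
    return {
        group: round(sum(values) / len(values), 2)
        for group, values in groups.items()
        if len(values) >= min_cell  # small cells risk re-identification
    }


# Hypothetical microdata: five respondents in region A, two in region B.
records = [{"region": "A", "income": 10}] * 5 + [{"region": "B", "income": 99}] * 2
print(aggregate(records, "region", "income"))  # {'A': 10.0}; region B (n=2) suppressed
```

Note that techniques like these reduce, but do not eliminate, disclosure risk; whether they satisfy GDPR, HIPAA, or an institutional review board is a separate legal and ethical question.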
43.1.4 Replication Landscape
Brodeur et al. (2025) find that while AI-assisted teams improve upon AI-led approaches, human-only teams remain the most effective at detecting major errors and ensuring reproducibility in quantitative social science research:
- Reproducibility: Human-only and AI-assisted teams achieved similar reproducibility success rates, both significantly outperforming AI-led teams; human-only teams were 57 percentage points more successful than AI-led teams (p < 0.001).
- Error detection: Human-only teams identified significantly more major errors than AI-assisted teams (0.7 more errors per team, p = 0.017) and AI-led teams (1.1 more errors per team, p < 0.001); AI-assisted teams detected 0.4 more errors per team than AI-led teams (p = 0.029), but still fewer than human-only teams.
- Robustness checks: Both human-only and AI-assisted teams were significantly better than AI-led teams at proposing (25 percentage points, p = 0.017) and implementing (33 percentage points, p = 0.005) comprehensive robustness checks.