Replicating a challenging study: it's all about sharing the details.

June 27th, 2017, Megan Showalter



Working on the replication attempt of "The common feature of leukemia-associated IDH1 and IDH2 mutations is a neomorphic enzyme activity converting alpha-ketoglutarate to 2-hydroxyglutarate" by Ward et al. was a challenging and rewarding experience. It forced us as a group to think about the differences between replication and validation, and about how methods sections often lack information that could affect replication. The replication study is published in eLife.

The most challenging section to replicate was the metabolomics section, due to the inherent complexity of metabolomics studies. Methodology, data processing and normalization are often specific to each lab. In our case the authors cited additional methods papers, a practice we found helpful and often use ourselves. This way the methods section doesn't need to include every detail, since the published methods paper includes all the information necessary for replication. Another helpful practice is to include detailed methods in supplementary files. These workarounds are needed only because authors often feel that word limits prevent them from including everything in the main text.

For our replication attempt we deposited all data into the NIH Metabolomics Workbench, a public metabolomics data repository that also includes detailed study design and methods sections. We hope that this, in addition to the Registered Report and Replication Study, will provide all details needed for others to build upon our work.

The field of metabolomics presents unique challenges for reproducibility. It is a young field and its reporting standards are still evolving. The community has published guidelines for reporting metabolomic findings [1], but unfortunately many of its members have still not adopted them.

Metabolomics reproducibility has four main challenges: compound identification, methodology, data processing and statistical analysis. The first three are unique to the field of metabolomics. Compound identification is the paramount issue for reproducibility. Many labs report compound identifications that have not been rigorously confirmed. Each lab is ultimately responsible for its own annotations, but lab size and experience with metabolomics can affect the quality of those annotations. Large labs have extensive inventories of chemical standards and access to in-house MS/MS fragmentation spectra, retention times and m/z values that are often not available to smaller labs. To help bridge the gap, smaller labs can use publicly available MS/MS spectral databases as well as the numerous published m/z and retention time methods and libraries.

Yet even with these tools available, many published reports lack basic information about the annotations that are central to their findings. Studies should be required to provide, at a minimum, MS/MS spectra of their experimental compound compared to a chemical standard, or at least to publicly available spectra, to confirm the annotation. Instead, many studies rely on accurate mass annotations alone, often choosing the compound they prefer over others with the same accurate mass, and reporting experimental findings as that compound.
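To make the accurate mass problem concrete, here is a minimal Python sketch. It is not part of the replication workflow: the small candidate library, the 5 ppm tolerance and the function names are assumptions chosen for illustration. It computes the [M-H]- m/z for each candidate formula and lists every compound that matches an observed m/z.

```python
import re

# Monoisotopic masses (Da) of the most abundant isotope of each element.
MONOISOTOPIC = {"C": 12.0, "H": 1.00782503207, "O": 15.99491461956}
PROTON = 1.007276466  # mass of a proton (Da)

def monoisotopic_mass(formula):
    """Sum isotope masses for a simple formula such as 'C5H8O5'."""
    total = 0.0
    for element, count in re.findall(r"([A-Z][a-z]?)(\d*)", formula):
        total += MONOISOTOPIC[element] * (int(count) if count else 1)
    return total

def candidates_for(observed_mz, library, ppm=5.0):
    """Return every library compound whose [M-H]- m/z is within `ppm`."""
    hits = []
    for name, formula in library.items():
        mz = monoisotopic_mass(formula) - PROTON  # negative-mode [M-H]- ion
        if abs(observed_mz - mz) / mz * 1e6 <= ppm:
            hits.append(name)
    return sorted(hits)

# Hypothetical, deliberately tiny candidate library (illustration only).
LIBRARY = {
    "2-hydroxyglutaric acid": "C5H8O5",
    "3-hydroxyglutaric acid": "C5H8O5",
    "citramalic acid": "C5H8O5",
    "alpha-ketoglutaric acid": "C5H6O5",
}

hits = candidates_for(147.0299, LIBRARY)
print(hits)
```

In this toy library, all three C5H8O5 isomers match the same observed m/z, so accurate mass alone cannot say which one was measured. Separating them takes additional evidence such as retention time or MS/MS spectra matched against a standard.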

Metabolomics data acquisition and processing methods also present challenges to reproducibility because they are inherently complex and vary by lab. In addition, authors are not required to deposit metabolomics data in public repositories, or are unsure where to do so, even though deposition is standard practice in other omics fields. Without these data, it is nearly impossible for others to replicate the original authors' findings.

While many of the issues around the reproducibility of these methods can be solved by detailed methods sections, this variability does raise interesting questions. Are the biological phenotypes of interest strong enough to be reproducible regardless of the mass spectrometry method or data processing used? Should they be? For this replication attempt we used the same methods as the authors, but the numerous papers published on 2HG make it clear that the biological phenotype has been reproduced using many different methods. This shows that, when done correctly, metabolomic findings are reproducible across numerous labs using original or adapted methods.


1. Sumner, L. W. et al. Proposed minimum reporting standards for chemical analysis: Chemical Analysis Working Group (CAWG) Metabolomics Standards Initiative (MSI). Metabolomics 3, 211-221, doi:10.1007/s11306-007-0082-2 (2007).

