MSnLib: efficient generation of open multi-stage fragmentation mass spectral libraries

  • Nat Methods. 2025 Oct;22(10):2028-2031. doi: 10.1038/s41592-025-02813-0.
Corinna Brungs # 1,2; Robin Schmid # 3,4; Steffen Heuckeroth 5; Aninda Mazumdar 6; Matúš Drexler 1; Pavel Šácha 1; Pieter C Dorrestein 7; Daniel Petras 8; Louis-Felix Nothias 9,10; Václav Veverka 1,11; Radim Nencka 1; Zdeněk Kameník 6; Tomáš Pluskal 12
Affiliations

  • 1 Institute of Organic Chemistry and Biochemistry of the Czech Academy of Sciences, Prague, Czechia.
  • 2 Department of Pharmaceutical Sciences, Faculty of Life Sciences, University of Vienna, Vienna, Austria.
  • 3 Institute of Organic Chemistry and Biochemistry of the Czech Academy of Sciences, Prague, Czechia. rschmid1789@gmail.com.
  • 4 Skaggs School of Pharmacy and Pharmaceutical Sciences, University of California San Diego, San Diego, CA, USA. rschmid1789@gmail.com.
  • 5 Institute of Inorganic and Analytical Chemistry, University of Münster, Münster, Germany.
  • 6 Institute of Microbiology of the Czech Academy of Sciences, Prague, Czechia.
  • 7 Skaggs School of Pharmacy and Pharmaceutical Sciences, University of California San Diego, San Diego, CA, USA.
  • 8 Department of Biochemistry, University of California Riverside, Riverside, CA, USA.
  • 9 Université Côte d'Azur, CNRS, ICN, Nice, France.
  • 10 Interdisciplinary Institute for Artificial Intelligence (3iA) Côte d'Azur, Sophia Antipolis, Valbonne, France.
  • 11 Department of Cell Biology, Faculty of Science, Charles University, Prague, Czechia.
  • 12 Institute of Organic Chemistry and Biochemistry of the Czech Academy of Sciences, Prague, Czechia. tomas.pluskal@uochb.cas.cz.
  • # Contributed equally.
Abstract

Untargeted high-resolution mass spectrometry is a key tool in clinical metabolomics, natural product discovery and exposomics, with compound identification remaining the major bottleneck. Currently, the standard workflow applies spectral library matching against tandem mass spectrometry (MS2) fragmentation data. Multi-stage fragmentation (MSn) yields more profound insights into substructures, enabling validation of fragmentation pathways; however, the community lacks open MSn reference data of diverse natural products and other chemicals. Here we describe MSnLib, a machine learning-ready open resource of >2 million spectra in MSn trees of 30,008 unique small molecules, built with a high-throughput data acquisition and processing pipeline in the open-source software mzmine.
