Closing the loop: Teaching single-cell foundation models to learn from perturbations

  • bioRxiv. 2025 Jul 12:2025.07.08.663754. doi: 10.1101/2025.07.08.663754.
Yash Pershad 1,2; Tarak N Nandi 3,4; Joseph C Van Amburg 1,2; Alyssa C Parker 1,2; Luiza Ostrowski 5; Hannah K Giannini 1,2; David Ong 1; J Brett Heimlich 1; Esther A Obeng 6; Katrin Ericson 7; Anupriya Agarwal 5; Ravi K Madduri 3,4; Alexander G Bick 1,2
Affiliations

  • 1 Department of Medicine, Vanderbilt University Medical Center, Nashville, TN, USA.
  • 2 Vanderbilt Genetics Institute, Vanderbilt University, Nashville, TN, USA.
  • 3 Data Science and Learning, Argonne National Laboratory, Lemont, IL, USA.
  • 4 Consortium for Advanced Science and Engineering, University of Chicago, Chicago, IL, USA.
  • 5 Knight Cancer Institute, Oregon Health & Science University, Portland, OR, USA.
  • 6 Department of Oncology, St. Jude Children's Research Hospital, Memphis, TN, USA.
  • 7 RUNX1 Research Program, Santa Barbara, California, USA.
Abstract

The application of transfer learning models to large-scale single-cell datasets has enabled the development of single-cell foundation models (scFMs) that can predict cellular responses to perturbations in silico. Although these predictions can be experimentally tested, current scFMs are unable to "close the loop" and learn from these experiments to make better predictions. Here, we introduce a "closed-loop" framework that extends the scFM by incorporating perturbation data during model fine-tuning. Our closed-loop model improves prediction accuracy, increasing positive predictive value three-fold in the setting of T-cell activation. We applied this model to RUNX1-familial platelet disorder, a rare pediatric blood disorder, and identified two therapeutic targets (mTOR and the CD74-MIF signaling axis) and two novel pathways (protein kinase C and phosphoinositide 3-kinase). This work establishes that iterative incorporation of experimental data into foundation models enhances biological predictions, representing a crucial step toward realizing the promise of "virtual cell" models for biomedical discovery.
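
For readers unfamiliar with the metric, positive predictive value (PPV) is the fraction of predicted hits that are confirmed experimentally, PPV = TP / (TP + FP). The sketch below shows how a fold improvement in PPV could be computed. The function name and all counts are hypothetical and chosen purely for illustration; they are not taken from the preprint.

```python
def positive_predictive_value(true_positives: int, false_positives: int) -> float:
    """Return PPV = TP / (TP + FP), the fraction of predicted hits that validate."""
    predicted = true_positives + false_positives
    if predicted == 0:
        raise ValueError("PPV is undefined when no positives are predicted")
    return true_positives / predicted


# Hypothetical counts for illustration only (NOT the paper's data):
# each model nominates 20 perturbations, which are then tested experimentally.
baseline_ppv = positive_predictive_value(true_positives=5, false_positives=15)
closed_loop_ppv = positive_predictive_value(true_positives=15, false_positives=5)

# Fold improvement is the ratio of the two PPVs.
fold_improvement = closed_loop_ppv / baseline_ppv

print(f"baseline PPV={baseline_ppv:.2f}, "
      f"closed-loop PPV={closed_loop_ppv:.2f}, "
      f"fold={fold_improvement:.1f}")
```

With these made-up counts the ratio is three, which mirrors the scale of improvement the abstract reports but says nothing about the actual numbers behind it.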
