Poster

LEOPARD: Linking Experimental Observations with Predictive Algorithms to reveal Response Dynamics for the single cell

Abstract

Background: New omics technologies (e.g. scRNA-seq) have evolved along with AI methods [1,2], resulting in Biomedical Foundation Models (BMFM) [3]. BMFM learn universal representations of omics data and, via transfer learning, provide tremendous opportunities [4] for accelerating scientific discovery. BMFM for RNA [5] is the computational core of our work, presented as a repeatable framework, LEOPARD.

Methods: Three “methodical agents” (comparator, predictor and discoverer) derived from BMFM are used to demonstrate LEOPARD results. The comparator agent quantifies the proportionality and similarity of cell types between in vitro and in vivo studies; the predictor quantifies gene expression levels (in silico) upon gene perturbation; and the discoverer identifies key genes (aka attributions) [6]. As preliminaries, we use “labeled” data from an in vitro system (hALI gut epithelial system [7], day 7 and day 2) and a human (in vivo) study [8]. We compare goblet cell (GC) types in vivo and in vitro, predict the cellular milieu on knock-out of a key transcription factor (Atoh1), and attribute important genes for GC development. Translational validity [7] is explored via multiple inference runs and random data splits (for confidence scores).

Results: We observed high concordance for GCs at day 7 (0.93±0.017 for hALI to biopsy and 0.904±0.0057 for the reciprocal comparison); an expected milieu of differentiating cells (i.e. predominantly progenitor cells) on knock-out at day 7; and CADPS, PTPRN2, SYTL2, H3F3A and SOX4 as important genes in GC development at day 2.

Conclusions: The LEOPARD framework opens doors for many biological investigations. More specifically, it can help funnel the large number of genetic/molecular target candidates to optimize expensive in vitro investigations.
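The Methods mention confidence scores obtained from random data splits, and the Results report concordance as mean ± spread. The abstract does not define the concordance metric or the splitting scheme, so the following is only a minimal sketch under stated assumptions: concordance is taken here to be the fraction of cells of a given type whose predicted label matches the reference label, and each "split" is a random subsample of the cells. The function names (`concordance`, `split_scores`, `summarize`) and all parameters are hypothetical and are not part of LEOPARD or BMFM.

```python
import random
import statistics


def concordance(true_labels, predicted_labels, cell_type):
    """Fraction of cells of `cell_type` whose predicted label matches.

    Assumed metric for illustration only; the abstract does not define it.
    """
    pairs = [(t, p) for t, p in zip(true_labels, predicted_labels) if t == cell_type]
    if not pairs:
        raise ValueError(f"no cells of type {cell_type!r} in this split")
    return sum(t == p for t, p in pairs) / len(pairs)


def split_scores(true_labels, predicted_labels, cell_type,
                 n_splits=10, fraction=0.8, seed=0):
    """Score `n_splits` random subsamples, each holding `fraction` of the cells."""
    rng = random.Random(seed)  # fixed seed keeps the splits reproducible
    indices = list(range(len(true_labels)))
    size = max(1, int(fraction * len(indices)))
    scores = []
    for _ in range(n_splits):
        subset = rng.sample(indices, size)  # sample without replacement
        scores.append(concordance(
            [true_labels[i] for i in subset],
            [predicted_labels[i] for i in subset],
            cell_type,
        ))
    return scores


def summarize(scores):
    """Return (mean, sample standard deviation) across splits."""
    return statistics.mean(scores), statistics.stdev(scores)
```

Given reference labels and model-assigned labels for the same cells, `summarize(split_scores(true_labels, predicted_labels, "Goblet"))` would yield a pair that can be reported in the "mean ± spread" form used in the Results. Whether the reported spread is a standard deviation, a standard error, or something else is not stated in the abstract.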
