Inferring clonal composition from multiple tumor biopsies
- Matteo Manica
- Hyunjae Ryan Kim
- et al.
- 2020
- npj Systems Biology and Applications
With advances in high-throughput experimental techniques, biomedical research is turning into an information science. This requires the use of machine-learning and deep-learning approaches, statistics and mathematical modelling. Individual cellular processes that involve the interplay of several molecular players, such as cell signaling, can now be quantitatively characterized to allow a systematic view of biological processes. A better understanding of these processes is crucial for building robust predictive models that improve disease prognoses and treatment strategies. Our group is exploiting a large variety of data (multi-omics datasets, single-cell proteomics and mass spectrometry-based quantitative proteomics) to dissect the molecular mechanisms of cancer. Our goal is to develop predictive models for precision medicine.
At IBM Research in Zurich, we develop novel approaches to analyze different molecular levels of high-throughput data. From single-cell to cell population-averaged data (proteomics, transcriptomics), we aim to integrate multiple layers of genome-scale information. This, in combination with clinical information and prior knowledge through literature mining, enables us to understand molecular mechanisms and explore applications to personalised medicine.
Our main research projects include, but are not limited to, studying cell-to-cell heterogeneity, integrative multi-omics analysis, dynamic network inference and robust biomarker discovery, most of which are applied to cancer. Recently, we have focused on anticancer drug modelling, specifically on incorporating biomarker information into generative models for de novo drug design, in an attempt to bridge systems biology and anticancer drug discovery.
We gratefully acknowledge our numerous collaborations with university hospitals, research institutes and universities that work alongside our team in many of our projects.
Understanding real-world datasets (e.g. electronic health records or omics data) is often challenging because of their size and complexity, and because knowledge of the underlying problem is limited.
To achieve high accuracy on important tasks, equally complex machine-learning and deep-learning models are usually used. In many situations, the decisions made by such automated systems can have significant, and potentially deleterious, consequences.
In biology and healthcare, interpretability becomes important for three main reasons.
For example, doctors and patients need to be confident about the decision made by a deployed model. Providing the rationale behind a decision could make a model more trustworthy.
A model could return unexpected predictions, possibly indicating poor performance. Interpretability could help by shedding light on the causes behind poor performance, such as unfair dataset bias or poor model training.
Surprising results do not always have a negative connotation. Rather, they might be due to the trained model leveraging a true pattern in the data that is unknown even to field experts, such as an unknown protein–protein interaction. Interpretable methods can potentially uncover these patterns, which can then be used as the basis for novel biological hypotheses.
Tumor cells exhibit a high degree of variability in terms of morphology, phenotype, metastatic potential and underlying molecular profile. This heterogeneity is present not only across different patients (inter-tumor heterogeneity) but also within the same tumor (intra-tumor heterogeneity) and has emerged as an inherent property of cancer.
Identifying the sources of heterogeneity and its implications for clinical outcomes, such as response to therapy or ability to metastasize, has become a cornerstone for the development of effective disease management strategies.
Developing predictive computational technology to exploit and integrate multiple types of molecular and clinical data.
Anticancer drug modelling for precision medicine.
Automatic text mining and analysis.
Pathway-induced multiple kernel learning.
A novel computational method to quantify cell cycle and cell volume variability.
Estimating the frequency of genetic alterations.
Consensus inference of molecular networks.
We gratefully acknowledge generous funding from SystemsX.ch, SNF and the European Union.