Computational Systems Biology

Developing predictive models for precision medicine.

Archived

Overview

With the advances of high-throughput experimental techniques, biomedical research is turning into information science. This requires the use of machine and deep-learning approaches, statistics and mathematical modelling. Individual cellular processes that comprise the interplay of several molecular players, such as cell signaling, can now be quantitatively characterized to allow a systematic view of biological processes. A better understanding of biological processes is crucial in order to provide robust predictive models that improve disease prognoses and treatment strategies. Our group is exploiting a large variety of data — multi-omics datasets, single-cell proteomics and mass spectrometry-based quantitative proteomics — to dissect the molecular mechanisms of cancer. Our goal is to develop predictive models for precision medicine.

Data

Omics: RNAseq, CNV, SNP, miRNA, SWATH–MS
Single cell: Mass cytometry
Clinical data: Survival outcome, treatment
Literature: Publications
Compound structure: SMILES, graph representation of molecules
Networks: protein–protein interactions, pathways

Methods

Machine learning: Deep learning, dimensionality reduction, clustering, classification, generative models
Statistical inference: Probabilistic models, network inference
Mathematical modeling: Stochastic hybrid models, Boolean networks

Research goals

Tumor heterogeneity
Leading drivers of cancer
Molecular mechanisms

Personalized medicine

Patient stratification
Early diagnosis
Targeted treatment

Research

At IBM Research in Zurich, we develop novel approaches to analyze different molecular levels of high-throughput data. From single-cell to cell population-averaged data (proteomics, transcriptomics), we aim to integrate multiple layers of genome-scale information. This, in combination with clinical information and prior knowledge through literature mining, enables us to understand molecular mechanisms and explore applications to personalised medicine.

Our main research projects include, but not are limited to, studying cell-to-cell heterogeneity, integrative multi-omics analysis, dynamic network inference and robust biomarker discovery, most of which are applied in the case of cancer. Recently, we focused on anticancer drug modelling, specifically on leveraging biomarker information into generative models for de-novo drug design, attempting to bridge systems biology and anticancer drug discovery.

We gratefully acknowledge our numerous collaborations with university hospitals, research institutes and universities that work alongside our team in many of our projects.

Research topics

Interpretability for machine learning and computational biology

Understanding real-world datasets is often challenging due to their size, complexity and/or poor knowledge about the problem to be tackled (i.e. electronic health records, OMICS data, etc.).

To achieve high accuracy for important tasks, equally complex machine/deep-learning models are usually used. In many situations, the decisions achieved by such automated systems can have significant—and potentially deleterious—consequences.

In biology and healthcare, interpretability becomes important for three main reasons.

1. Trust

For example, doctors and patient need to be confident about the decision achieved by a deployed model. By providing the rationale behind a decision could make a model more trustable.

2. Debugging

A model could return unexpected predictions, possibly indicating poor performance. Interpretability could help by shedding light on the causes behind poor performance, such as unfair dataset bias or poor model training.

3. Generating biological hypotheses

Surprising results do not always have a negative connotation. Rather, they might be due to the trained model leveraging a true pattern in the data that is unknown even to field experts, such as an unknown protein–protein interaction. Interpretable methods can potentially uncover these patterns, which can then be used as the basis for novel biological hypotheses.

Tumor heterogeneity

Tumor cells exhibit a high degree of variability in terms of morphology, phenotype, metastatic potential and underlying molecular profile. This heterogeneity is present not only across different patients (inter-tumor heterogeneity) but also within the same tumor (intra-tumor heterogeneity) and has emerged as an inherent property of cancer.

Identifying the sources of heterogeneity and its implications in clinical outcomes, such as response to therapy or ability to metastasize, has become a cornerstone for the development of effective disease management strategies.

Multimodal data integration

Developing a predictive computational technology to exploit and integrate multiple molecular and clinical data.

Read more about multimodal data integration.

Research assets

PaccMann

Anticancer drug modelling for precision medicine.

INtERAcT

Automatic text mining and analysis.

PIMKL

Pathway-induced multiple kernel learning.

CellCycleTRACER

A novel computational method to quantify cell cycle and cell volume variability.

Chimaera

Estimating the frequency of genetic alterations.

COSIFER

Consensus inference of molecular networks.

Technical resources

Funding

We gratefully acknowledge generous funding from SystemsX.ch, SNF and the European Union.

Publications

Context-specific interaction networks from vector representation of words
- - Matteo Manica
  - Roland Mathis
  - et al.
- 2019
- Nature Machine Intelligence
A probabilistic model of the germinal center reaction
- - Marcel Jan Thomas
  - Ulf Klein
  - et al.
- 2019
- Frontiers in Immunology
PIMKL: Pathway-Induced Multiple Kernel Learning
- - Matteo Manica
  - Joris Cadow
  - et al.
- 2019
- npj Systems Biology and Applications
PaccMann: Prediction of anticancer compound sensitivity with multi-modal attention-based neural networks
- - Ali Kazemi Oskooei
  - Jannis Born
  - et al.
- 2018
- NeurIPS 2018
Mixed-precision in-memory computing
- - Manuel Le Gallo
  - Abu Sebastian
  - et al.
- 2018
- Nature Electronics
CellCycleTRACER accounts for cell cycle and volume in mass cytometry data
- - Maria Anna Rapsomaniki
  - Xiao Kang Lun
  - et al.
- 2018
- Nature Communications

Resources

Blog Post

Novel AI tools to accelerate cancer research

IBM Research22 Jul 2019

Blog Post

Deciphering Breast Cancer Heterogeneity Using Machine Learning

IBM Research16 May 2019