PaccMannRL: De novo generation of hit-like anticancer molecules from transcriptomic data via reinforcement learning
- Jannis Born
- Matteo Manica
- et al.
- 2021
- iScience
PaccMannRL is an interdisciplinary framework that attempts to accelerate de novo drug discovery. Our current work is focused on targeted cancer therapies and applies a precision medicine perspective. We are developing a conditional deep generative model that integrates biomolecular information of a tumor directly into the design process. This framework systematically bridges systems biology and drug discovery as it tailors novel anticancer drugs directly to the cell profiles of interest.
Our framework contains two separate generative models, one for molecules and one for transcriptomics. These models are pretrained separately and then fused to form the conditional drug generator. The compounds proposed by the generator are assessed through virtual drug screening assays. This screening is performed by PaccMann, an interpretable deep learning model for predicting cancer drug sensitivity that was previously developed in our lab.
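The generate-then-screen idea can be illustrated with a toy sketch. Everything here is an assumption made for illustration: `propose_molecules` and `predict_ic50` are hypothetical stand-ins for the conditional generator and the sensitivity predictor, the scoring formula is arbitrary, and none of it reflects the actual PaccMannRL code.

```python
import random

# Fixed pool of SMILES strings used by the toy generator (illustrative only).
POOL = ["CCO", "c1ccccc1", "CC(=O)O", "CCN", "C1CCCCC1"]

def propose_molecules(profile, n, seed=0):
    """Hypothetical stand-in for the conditional generator.

    A real generator would condition on `profile`; this toy version
    ignores it and samples SMILES strings from a fixed pool.
    """
    rng = random.Random(seed)
    return [rng.choice(POOL) for _ in range(n)]

def predict_ic50(smiles, profile):
    """Hypothetical stand-in for the sensitivity predictor (arbitrary toy formula)."""
    return 0.1 * len(smiles) + sum(profile) / len(profile)

def screen(profile, n=4, seed=0):
    """Propose n candidates and rank them by predicted IC50, lowest first."""
    candidates = propose_molecules(profile, n, seed)
    scored = [(predict_ic50(s, profile), s) for s in candidates]
    return sorted(scored)  # a lower IC50 means a more potent compound

profile = [0.2, 1.5, 0.7]  # toy gene expression profile
ranked = screen(profile)
```

In this sketch the predicted IC50 only ranks the candidates; in a reinforcement learning setting such a score would instead feed into the reward used to update the generator.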
PaccMann integrates three key pillars of drug sensitivity:
- The molecular structure of compounds
- Transcriptomic profiles of cancer cells
- Prior knowledge about interactions among proteins within cells
PaccMann is a novel approach to predict anticancer compound sensitivity by means of multi-modal attention-based neural networks.
Our model ingests a drug-cell pair consisting of the SMILES encoding of a compound and the gene expression profile of a cancer cell, and predicts an IC50 sensitivity value. Gene expression profiles are encoded using an attention-based encoding mechanism that assigns high weights to the most informative genes.
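As a rough illustration of how an attention mechanism can assign per-gene weights, the following is a minimal NumPy sketch. It assumes a single dense layer followed by a softmax, with random, untrained parameters `W` and `b`; the function name `gene_attention` and the layer shape are assumptions for illustration and are not taken from the PaccMann implementation.

```python
import numpy as np

def softmax(x):
    """Numerically stable softmax over a 1-D vector."""
    e = np.exp(x - np.max(x))
    return e / e.sum()

def gene_attention(expression, W, b):
    """Compute one attention weight per gene and reweight the profile.

    expression: (n_genes,) gene expression vector
    W, b: parameters of an assumed dense scoring layer
    Returns (weights, attended), both of shape (n_genes,).
    """
    scores = W @ expression + b
    weights = softmax(scores)  # non-negative, sums to 1
    return weights, weights * expression

rng = np.random.default_rng(0)
n_genes = 5
expression = rng.normal(size=n_genes)     # toy expression profile
W = rng.normal(size=(n_genes, n_genes))   # untrained, random weights
b = np.zeros(n_genes)

weights, attended = gene_attention(expression, W, b)
top_gene = int(np.argmax(weights))        # index of the most heavily weighted gene
```

Because the weights are non-negative and sum to one, they can be read directly as a ranking of genes, which is what makes this kind of encoder inspectable.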
SMILES strings are encoded using an attention-based encoder that highlights the most relevant structural features of the compound. Thanks to these encoders, PaccMann outperforms deep-learning models that use engineered fingerprints. Furthermore, the adoption of attention-based encoders enhances interpretability and enables us to identify the genes, bonds and atoms that the network uses to make a prediction, providing useful insights into both drug discovery and precision medicine settings.