J. Chem. Inf. Model.

Combining docking pose rank and structure with deep learning improves protein−ligand binding mode prediction over a baseline docking approach


We present a simple, modular graph-based convolutional neural network that takes structural information from protein−ligand complexes as input to generate models for activity and binding mode prediction. Complex structures are generated by a standard docking procedure and fed into a dual-graph architecture that includes separate subnetworks for the ligand bonded topology and the protein−ligand contact map. Recent work indicates that data set bias drives many previously reported promising results from combining deep learning and docking. Our dual-graph network distinguishes the contributions of ligand identity, which give rise to such biases, from the effects of protein−ligand interactions on classification. We show that our neural network is capable of learning from protein structural information when, as in the case of binding mode prediction, an unbiased data set is constructed. We next develop a deep learning model for binding mode prediction that uses docking ranking as input in combination with docking structures. This strategy mirrors past consensus models and outperforms a baseline docking program (AutoDock Vina) in a variety of tests, including on cross-docking data sets that mimic real-world docking use cases. Furthermore, the magnitudes of network predictions serve as reliable measures of model confidence.
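To make the dual-graph idea concrete, the following is a minimal pure-Python sketch, not the authors' implementation. It assumes one sum-aggregation graph-convolution layer per subnetwork, mean pooling, two-dimensional toy node features, an inverse-rank encoding of the docking rank, and random untrained weights. All function names, dimensions, and the rank encoding are hypothetical; the abstract does not specify them.

```python
import math
import random

def graph_conv(features, adjacency, weights):
    """One graph-convolution layer (assumed form): sum each node's features
    with those of its neighbors, then apply a linear map followed by ReLU."""
    out = []
    for i, own in enumerate(features):
        agg = list(own)
        for j, neighbor in enumerate(features):
            if adjacency[i][j]:
                agg = [a + b for a, b in zip(agg, neighbor)]
        out.append([max(0.0, sum(w * a for w, a in zip(row, agg)))
                    for row in weights])
    return out

def mean_pool(features):
    """Average node features into one fixed-length graph descriptor."""
    n = len(features)
    return [sum(column) / n for column in zip(*features)]

def init_weights(rng, n_out, n_in):
    """Random, untrained weights; a real model would learn these."""
    return [[rng.uniform(-0.5, 0.5) for _ in range(n_in)] for _ in range(n_out)]

def score_pose(ligand_feats, bond_adj, complex_feats, contact_adj, rank, params):
    """Combine both subnetworks with the docking rank into a pose score in (0, 1)."""
    # Subnetwork 1: ligand bonded topology only (captures ligand identity).
    ligand_vec = mean_pool(graph_conv(ligand_feats, bond_adj, params["w_ligand"]))
    # Subnetwork 2: protein-ligand contact map (captures interactions).
    contact_vec = mean_pool(graph_conv(complex_feats, contact_adj, params["w_contact"]))
    # Hypothetical rank encoding: the top-ranked pose (rank 1) maps to 1.0.
    rank_feature = 1.0 / rank
    x = ligand_vec + contact_vec + [rank_feature]
    logit = sum(w * v for w, v in zip(params["w_out"], x)) + params["b_out"]
    return 1.0 / (1.0 + math.exp(-logit))

rng = random.Random(0)
hidden = 4
params = {
    "w_ligand": init_weights(rng, hidden, 2),
    "w_contact": init_weights(rng, hidden, 2),
    "w_out": [rng.uniform(-0.5, 0.5) for _ in range(2 * hidden + 1)],
    "b_out": 0.0,
}

# Toy ligand: three atoms in a chain, two made-up features per atom.
ligand_feats = [[1.0, 0.0], [0.0, 1.0], [1.0, 0.0]]
bond_adj = [[0, 1, 0], [1, 0, 1], [0, 1, 0]]

# Toy complex: the three ligand atoms plus two protein atoms (indices 3, 4).
complex_feats = ligand_feats + [[0.0, 1.0], [1.0, 1.0]]
contact_adj = [
    [0, 0, 0, 1, 0],
    [0, 0, 0, 0, 0],
    [0, 0, 0, 0, 1],
    [1, 0, 0, 0, 0],
    [0, 0, 1, 0, 0],
]

confidence = score_pose(ligand_feats, bond_adj, complex_feats, contact_adj, 1, params)
```

Keeping the two subnetworks separate means either branch can be dropped on its own, which is one way to check whether a model relies on ligand identity rather than on protein−ligand interactions.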


20 Feb 2020

