A Multi-View Mixture-of-Experts Based on Language and Graphs for Molecular Property Prediction
Abstract
Recent progress in chemical-based machine learning relies on a two-step process, pre-training on unlabeled corpora followed by fine-tuning on specific tasks, to enhance model capacity. Given the growing need for training efficiency, the Mixture-of-Experts (MoE) architecture scales model capacity efficiently, which is particularly important for large-scale models. In an MoE architecture, expert sub-networks are selectively activated through a gating network, optimizing overall model performance. Extending this idea, a Multi-View Mixture-of-Experts improves robustness and accuracy by fusing embeddings of different natures. Here, we introduce Mol-MVMoE, a novel approach for small molecules that fuses the latent spaces of diverse chemical-based models. By using a gating network to define and assign weights to the different perspectives, Mol-MVMoE provides a robust framework for small-molecule analysis. We evaluated Mol-MVMoE on 11 benchmark datasets from MoleculeNet, where it outperformed its competitors on 9 of them. We also provide an in-depth analysis of the results obtained on the QM9 dataset, where Mol-MVMoE consistently performed better than its state-of-the-art competitors. Our study highlights the potential of latent-space fusion and the integration of different perspectives for advancing molecular property prediction. This not only represents a current advance but also promises future refinements with the inclusion of large-scale models.
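To make the gated multi-view fusion described above concrete, the sketch below illustrates how a gating network could assign weights to embeddings produced by different pre-trained chemical models (e.g., a SMILES language model and a molecular graph encoder) and fuse them for property prediction. This is a minimal illustration under assumed dimensions and module names, not the authors' implementation.

```python
# Hypothetical sketch of gated multi-view fusion over pre-computed molecular embeddings.
# View dimensions, hidden size, and class/variable names are illustrative assumptions.
import torch
import torch.nn as nn


class GatedMultiViewFusion(nn.Module):
    def __init__(self, view_dims, hidden_dim, num_tasks):
        super().__init__()
        # Project each view (e.g., language-model and graph-model embeddings)
        # into a shared hidden space.
        self.projections = nn.ModuleList(
            [nn.Linear(d, hidden_dim) for d in view_dims]
        )
        # Gating network: produces one weight per view from the concatenated embeddings.
        self.gate = nn.Sequential(
            nn.Linear(sum(view_dims), len(view_dims)),
            nn.Softmax(dim=-1),
        )
        # Task head for property prediction (regression values or classification logits).
        self.head = nn.Linear(hidden_dim, num_tasks)

    def forward(self, views):
        # views: list of tensors, one per view, each of shape (batch, view_dim_i)
        weights = self.gate(torch.cat(views, dim=-1))           # (batch, num_views)
        projected = torch.stack(
            [proj(v) for proj, v in zip(self.projections, views)], dim=1
        )                                                        # (batch, num_views, hidden_dim)
        fused = (weights.unsqueeze(-1) * projected).sum(dim=1)  # (batch, hidden_dim)
        return self.head(fused)


# Usage example: fuse embeddings from three frozen models with different embedding sizes.
model = GatedMultiViewFusion(view_dims=[768, 512, 300], hidden_dim=256, num_tasks=1)
dummy_views = [torch.randn(4, d) for d in (768, 512, 300)]
print(model(dummy_views).shape)  # torch.Size([4, 1])
```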