Uncertainty Analysis in Predicting Molecular Properties Using Chemical Foundation Models
Abstract
Large pre-trained foundation models are becoming prevalent in the physical sciences, where their predictions can have high-risk impact. Uncertainty analysis of prediction results can help engender trust in model outcomes and indicate their reliability to decision makers. In this paper, we introduce a method for uncertainty quantification and characterization tailored to chemical foundation models, with a focus on predicting molecular properties. Our approach is tested on a variety of datasets, including the widely used QM9 dataset, ESOL, FreeSolv, Lipophilicity, and $LD_{50}$. We apply our method to a SMILES-based foundation model, comparing the uncertainty profiles of the fine-tuned and frozen versions of the model. We also compare our method against a conformal prediction method, the normalized conformal regressor. Results demonstrate the effectiveness of our approach in identifying and quantifying uncertainties, offering insights into model reliability and the impact of model fine-tuning on prediction results, together with a comparison to a well-known method.
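For readers unfamiliar with the conformal baseline named above, the following is a minimal sketch of a split (inductive) normalized conformal regressor in its common textbook form. It is not taken from this paper: the function name `normalized_conformal_intervals`, the difficulty estimates `sigma_*`, the smoothing constant `beta`, and the error rate `epsilon` are illustrative assumptions, and the paper's own configuration may differ.

```python
import math

import numpy as np


def normalized_conformal_intervals(y_cal, pred_cal, sigma_cal,
                                   pred_test, sigma_test,
                                   epsilon=0.1, beta=1e-8):
    """Hypothetical helper: split normalized conformal regression.

    y_cal, pred_cal : true values and point predictions on a held-out calibration set
    sigma_cal       : per-sample difficulty estimates on the calibration set (assumed given)
    pred_test       : point predictions for the test molecules
    sigma_test      : per-sample difficulty estimates for the test molecules
    epsilon         : target error rate (intervals aim for 1 - epsilon coverage)
    beta            : small constant that keeps the normalization away from zero
    """
    y_cal = np.asarray(y_cal, dtype=float)
    pred_cal = np.asarray(pred_cal, dtype=float)
    sigma_cal = np.asarray(sigma_cal, dtype=float)
    pred_test = np.asarray(pred_test, dtype=float)
    sigma_test = np.asarray(sigma_test, dtype=float)

    # Nonconformity score: absolute residual scaled by the difficulty estimate.
    scores = np.abs(y_cal - pred_cal) / (sigma_cal + beta)

    # Rank of the calibration score that yields finite-sample 1 - epsilon coverage.
    n = scores.size
    k = math.ceil((n + 1) * (1.0 - epsilon))
    if k > n:
        raise ValueError("calibration set too small for the requested error rate")
    q = np.sort(scores)[k - 1]

    # Interval width adapts per molecule: harder samples get wider intervals.
    half_width = q * (sigma_test + beta)
    return pred_test - half_width, pred_test + half_width
```

In this sketch the difficulty estimates could come from any auxiliary signal (for example, a model trained on calibration residuals); that choice is left open here because the abstract does not specify it.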