Workshop paper

Enabling Accurate and Interpretable Property Prediction with TDiMS in Large Molecules

Abstract

In materials discovery, descriptors that are both accurate and interpretable are essential for predicting molecular properties. However, existing descriptors, including neural network-based approaches, often struggle to capture long-range interactions between substructures. We analyze the previously proposed descriptor TDiMS, which models nonlocal structural relationships via the average topological distances between pairs of substructures. While TDiMS has shown strong performance, its dependence on molecular size had not been systematically assessed. Our analysis reveals that TDiMS is particularly effective for larger molecules, where long-range interactions are critical and conventional descriptors underperform. SHAP-based analysis indicates that its predictive power derives largely from features describing distant substructure pairs. In addition to improved accuracy, TDiMS offers interpretable features that provide chemical insight, potentially accelerating molecular design and discovery.
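
To make the notion of an average topological distance between a pair of substructures concrete, the following minimal sketch computes one such feature on a toy molecular graph. It is an illustration under stated assumptions, not the TDiMS implementation: the molecule is given as a plain adjacency dictionary, each substructure occurrence is represented by a single anchor atom, and the names `shortest_path_lengths` and `mean_pair_distance` are hypothetical.

```python
from collections import deque


def shortest_path_lengths(adjacency, source):
    """Breadth-first search on an unweighted graph.

    Returns a dict mapping each reachable atom index to its
    topological distance (number of bonds) from `source`.
    """
    dist = {source: 0}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        for nbr in adjacency[node]:
            if nbr not in dist:
                dist[nbr] = dist[node] + 1
                queue.append(nbr)
    return dist


def mean_pair_distance(adjacency, atoms_a, atoms_b):
    """Average topological distance over all (a, b) anchor-atom pairs.

    Assumption: each occurrence of a substructure is reduced to one
    anchor atom. Pairs with a == b or with no connecting path are
    skipped. Returns None if no valid pair exists.
    """
    total, count = 0, 0
    for a in atoms_a:
        dist = shortest_path_lengths(adjacency, a)
        for b in atoms_b:
            if a != b and b in dist:
                total += dist[b]
                count += 1
    return total / count if count else None


# Toy example: a linear chain of six atoms, 0-1-2-3-4-5.
chain = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2, 4], 4: [3, 5], 5: [4]}

# Hypothetical substructure A occurs at atom 0; B occurs at atoms 3 and 5.
feature = mean_pair_distance(chain, atoms_a=[0], atoms_b=[3, 5])
print(feature)  # 4.0, the mean of distances 3 and 5
```

In a descriptor of this kind, one such value would be computed for each substructure pair found in the molecule, which is why individual features remain directly readable as "how far apart, on average, are these two fragments".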