J. Chem. Inf. Model.

Formulation Graphs for Mapping Structure-Composition of Battery Electrolytes to Device Performance


Advanced computational methods are actively sought to address the challenges of discovering and developing new combinatorial materials such as formulations. A widely adopted approach is domain-informed high-throughput screening of the individual components that can be combined into a formulation. This accelerates the discovery of new compounds for a target application, but identifying the right formulation within the shortlisted chemical space remains largely driven by laboratory experiments. We report a deep learning model, the Formulation Graph Convolution Network (F-GCN), that maps the structure-composition relationship of the formulation constituents to the properties of the liquid formulation as a whole. Multiple GCNs are assembled in parallel to featurize the formulation constituents on the fly in a domain-intuitive manner. The resulting molecular descriptors are scaled by each constituent's molar percentage in the formulation and then integrated into a combined formulation descriptor that represents the complete formulation to an external learning architecture. We demonstrate the model on battery electrolytes by training and testing it on two exemplary datasets that relate electrolyte formulation to battery performance: one sourced from the literature on Li/Cu half-cells, the other obtained from laboratory experiments on lithium-iodide full-cell chemistry. The model is shown to predict performance metrics such as Coulombic efficiency (CE) and specific capacity of new electrolyte formulations with the lowest reported errors. The best-performing F-GCN model uses molecular descriptors derived from molecular graphs that are informed with HOMO-LUMO and electric moment properties of the molecules through a knowledge transfer technique.
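
The composition-weighting step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function name `formulation_descriptor` is hypothetical, the per-constituent descriptors are assumed to be fixed-length vectors already produced by some featurizer (the GCNs themselves are not modelled here), and the "integration" step is assumed to be an element-wise sum, which the abstract does not specify.

```python
from typing import List, Sequence


def formulation_descriptor(
    descriptors: Sequence[Sequence[float]],
    molar_percents: Sequence[float],
) -> List[float]:
    """Scale each constituent descriptor by its mole fraction and sum them.

    Assumption: integration is an element-wise sum; the source only says
    the scaled descriptors are "integrated" into one formulation descriptor.
    """
    if not descriptors:
        raise ValueError("at least one constituent is required")
    if len(descriptors) != len(molar_percents):
        raise ValueError("one molar percentage is needed per constituent")

    dim = len(descriptors[0])
    if any(len(d) != dim for d in descriptors):
        raise ValueError("all descriptors must have the same length")

    total = sum(molar_percents)
    if total <= 0:
        raise ValueError("molar percentages must sum to a positive value")

    combined = [0.0] * dim
    for desc, pct in zip(descriptors, molar_percents):
        weight = pct / total  # normalise so the weights sum to 1
        for i, value in enumerate(desc):
            combined[i] += weight * value
    return combined
```

For example, two constituents with descriptors `[1.0, 0.0]` and `[0.0, 2.0]` at 75 and 25 mol% give the combined descriptor `[0.75, 0.5]`. Normalising by the total makes the result insensitive to whether compositions are supplied as percentages or fractions.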