Prediction of Phase Diagrams and Associated Phase Structural Properties
Phase diagrams provide useful information for understanding materials and help in designing new materials with better properties. In this work, we used commercially available databases of binary and ternary diagrams to train an AI system that predicts the phases resulting from a mixture of compounds at given temperature and pressure conditions. The first step was to extract the data from the NIST database of ceramic materials. We processed the NIST 31 database, which consists of 29,480 phase diagrams. Each diagram is a collection of points, lines, arrows, and labels, which we used to construct a graphical representation of the diagram. The process consisted of parsing these geometrical elements to identify the regions of each diagram and their corresponding labels. We extracted data from 8,649 binary and 5,815 ternary diagrams, for a total of 14,464 diagrams. The text labels in each region of a phase diagram combine phase descriptors (solid, liquid, glass/amorphous, vapor), crystal structure descriptors, and chemical formulas. We trained different models to predict these phase descriptors. In addition, we used the ICSD database to train further models that predict the Bravais lattice, the type of structure, and the local atomic environment. The pipeline in Fig. 1 illustrates how the ICSD-trained models can predict structural properties from the chemical formulas of solid systems, thereby extending the predictions of the models trained on NIST data. Alternatively, the models trained on ICSD can directly take a user-given input consisting of a composition and, optionally, a temperature and pressure. The computationally efficient data-driven methods presented in Ref. 1 show how these models can help materials scientists estimate the structure of a mixture of elements in any proportion over a wide temperature range.
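The region-labeling step described above can be illustrated with a minimal sketch. It assumes, purely for illustration, that each region has already been reduced to a closed polygon and that each text label carries an anchor point; the function names, the data layout, and the toy diagram are hypothetical and are not taken from the NIST data format or from the authors' implementation. A standard ray-casting test decides which region contains each label.

```python
from typing import Dict, List, Optional, Tuple

Point = Tuple[float, float]  # (composition, temperature), hypothetical layout


def point_in_polygon(point: Point, polygon: List[Point]) -> bool:
    """Ray-casting test: count edge crossings of a ray going right from the point."""
    x, y = point
    inside = False
    n = len(polygon)
    for i in range(n):
        x1, y1 = polygon[i]
        x2, y2 = polygon[(i + 1) % n]
        # Only edges that straddle the horizontal line through y can be crossed.
        if (y1 > y) != (y2 > y):
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside


def assign_labels(
    regions: Dict[str, List[Point]],
    labels: List[Tuple[str, Point]],
) -> Dict[str, Optional[str]]:
    """Attach each label to the first region whose polygon contains its anchor."""
    assigned: Dict[str, Optional[str]] = {name: None for name in regions}
    for text, anchor in labels:
        for name, polygon in regions.items():
            if point_in_polygon(anchor, polygon):
                assigned[name] = text
                break
    return assigned


# Toy binary diagram (invented values): two stacked rectangular regions.
regions = {
    "lower": [(0.0, 0.0), (1.0, 0.0), (1.0, 1000.0), (0.0, 1000.0)],
    "upper": [(0.0, 1000.0), (1.0, 1000.0), (1.0, 2000.0), (0.0, 2000.0)],
}
labels = [("Liquid", (0.5, 1500.0)), ("A + B", (0.5, 500.0))]
result = assign_labels(regions, labels)
```

In a real diagram the regions would first have to be reconstructed from the parsed lines and points, and labels attached by arrows would need their arrow tips used as anchors; neither step is covered by this sketch.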
The encoder-decoder models described in this work can be trained and run for inference via the granular framework of GT4SD (Ref. 2), an open-source library to accelerate hypothesis generation in scientific discovery.