ACS Fall 2023

Lab automation, data modeling, and AI agents for accelerated catalyst discovery


Advances in artificial intelligence (AI), machine learning (ML), and automated experimentation, combined with large-scale public materials data repositories, are poised to vastly accelerate discovery and development in polymer science. Despite this potential, the problem of representing experimental data, along with the inherent architectural complexity and stochasticity of polymeric materials, presents significant challenges to the development of foundational AI models for polymers. In addition to continuous and discontinuous experimentation platforms, we developed an extensible domain-specific language, termed Chemical Markdown Language (CMDL), as a foundational layer for data science analyses. CMDL offers a highly flexible data model for organizing experimental data and polymeric materials, allowing widely different experiment and data types to be represented in a consistent manner. This consistent representation enables facile ingestion of experimental data into AI/ML pipelines and supports the development of polymer informatics workflows. Here, we demonstrate the utility of this approach through the AI-enabled generation of new ring-opening polymerization catalysts as well as block and statistical copolymer structures. In both cases, experimental validation confirmed the viability of the predicted candidates. Overall, these results show how flexible and extensible data representation systems can facilitate the translation of historical experimental data into meaningful predictive models.