Workshop paper

Accelerating Material Discovery for Metal Organic Frameworks using Large Language Models

Abstract

Recent advances in Machine Learning (ML) have substantially accelerated material discovery, yet the use of Large Language Models (LLMs) in Metal-Organic Framework (MOF) research has received limited attention. This work leverages LLMs to build a new set of models that accelerate MOF material discovery. Our strategy relies on pre-training the Granite model on a single H100 GPU, using a combination of selected chemical journals and structural data from the PubChem database. Our evaluation demonstrates that this pre-training strategy significantly enhances the performance of LLMs in predicting MOF properties, especially in limited-resource task scenarios. We hope this work motivates future research into the potential of LLMs for material discovery and for building robust and efficient MOF models.