Structured data describing the structures and characteristics of molecules are extracted by IBM Deep Search, complemented by AI-accelerated simulation, and integrated into MolGX to train the generative AI. MolGX supports the simplified molecular-input line-entry system (SMILES), which represents a line drawing of a molecule as a sequence of characters, such as BrCCOC1OCCCC1. Candidate molecules generated by MolGX can be passed to IBM RXN for Chemistry, which predicts chemical reactions or retrosynthesis pathways, and to IBM RoboRXN, which automates chemical synthesis using robots.
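To make the SMILES notation concrete, here is a minimal sketch that splits the example string above into tokens and counts its heavy atoms. It is not part of MolGX and is not a full SMILES parser: the names `TOKEN_RE`, `tokenize`, and `atom_counts` are hypothetical, and the pattern assumes only unbracketed organic-subset atoms, bonds, branches, and ring-closure digits.

```python
import re
from collections import Counter

# Simplified token pattern (an assumption, not the full SMILES grammar):
# two-letter atoms first, then one-letter atoms, aromatic atoms,
# ring-closure labels, and bond/branch symbols. Bracket atoms are not supported.
TOKEN_RE = re.compile(r"Br|Cl|[BCNOPSFI]|[bcnops]|%\d{2}|\d|[()=#\-/\\.:]")


def tokenize(smiles):
    """Split a SMILES string into tokens, rejecting anything unsupported."""
    tokens = TOKEN_RE.findall(smiles)
    if "".join(tokens) != smiles:
        raise ValueError(f"unsupported characters in {smiles!r}")
    return tokens


def atom_counts(smiles):
    """Count heavy atoms per element (hydrogens are implicit in SMILES)."""
    counts = Counter()
    for token in tokenize(smiles):
        if token[0].isalpha():
            # Aromatic atoms are lowercase; normalise them to the element symbol.
            counts[token.capitalize()] += 1
    return counts


if __name__ == "__main__":
    example = "BrCCOC1OCCCC1"
    print(tokenize(example))
    print(dict(atom_counts(example)))
```

The two `1` tokens in the example mark the opening and closing of a single ring, which is how a linear string can encode a cyclic structure.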
The development of new materials follows a number of different pathways, depending on both the nature of the problem being pursued and the means of investigation. Breakthroughs in the discovery of new materials range from pure chance, to trial-and-error approaches, to design by analogy to existing systems. While these methodologies have taken us far, the requirements for new materials have become more complex, and so have the problems those materials are needed to solve. As we face global problems such as pandemics and climate change, it is increasingly urgent to design and develop new medicines and materials at a faster pace, from the molecular scale through to the macroscopic level of a final product.
Our aim in developing MolGX is to accelerate molecular inverse design through state-of-the-art molecular generative AI-model technology. You may have heard of AI engines that can draw realistic images of landscapes or portraits of people that don’t exist. These are called generative models, and rather than use them to create imaginary things, we’ve adapted this technology to automatically generate molecules from desired chemical properties such as “solubility in water” and “heatability” by performing three simple steps: observing and selecting a dataset, training the AI model to predict chemical characteristics within the given parameters, and designing molecular structures based on the model built.
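The three steps above can be illustrated with a toy sketch. This is not the MolGX algorithm: the dataset values are made-up numbers for illustration only, the single feature (fraction of heavy atoms that are oxygen) and the function names are hypothetical, and the final step returns atom compositions rather than full molecular structures, which is the harder part of real inverse design.

```python
# Step 1: select a dataset of (SMILES, property) pairs.
# The property values are invented for illustration, not measurements.
DATASET = [
    ("CCCCCC", 0.0),
    ("CCCCO", 0.4),
    ("OCCO", 1.0),
    ("CCOCC", 0.4),
]


def oxygen_fraction(smiles):
    """Toy feature: share of heavy atoms that are oxygen (C/O-only SMILES)."""
    atoms = [ch for ch in smiles if ch in "CO"]
    return atoms.count("O") / len(atoms)


def fit_line(xs, ys):
    """Step 2: train a one-feature linear model y = a*x + b by least squares."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    a = sxy / sxx
    b = mean_y - a * mean_x
    return a, b


def design(a, b, low, high, max_carbon=4, max_oxygen=2):
    """Step 3: list (carbon, oxygen) counts whose predicted property is in range."""
    hits = []
    for n_c in range(1, max_carbon + 1):
        for n_o in range(0, max_oxygen + 1):
            predicted = a * (n_o / (n_c + n_o)) + b
            if low <= predicted <= high:
                hits.append((n_c, n_o))
    return hits


if __name__ == "__main__":
    xs = [oxygen_fraction(s) for s, _ in DATASET]
    ys = [y for _, y in DATASET]
    a, b = fit_line(xs, ys)
    print(design(a, b, low=0.6, high=0.7))
```

The point of the sketch is the direction of the search: the model is trained to predict a property from a structure, and design then runs the other way, starting from a target property range and looking for candidates that satisfy it.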
In collaboration with the IBM Garage team in Tokyo, including Makoto Kogo, Takumi Hongo, and Kumiko Fujieda, we then made it scalable on the IBM Cloud and accessible through an easy-to-understand user interface designed with general users in mind. Essentially anyone, even those without advanced IT skills, can experience the cutting edge of materials informatics as well as learn the basics of AI.
We also created a version for chemistry professionals and industries, which includes additional functionalities such as exportation of results, customized modelling and data uploading to explore beyond the built-in dataset. MolGX introduces a novel pathway to generating new materials as it reduces the development time and allows for the innovation of new materials that are not bound by fixed ideas.
Our novel AI-driven molecular inverse-design platform is ready to be deployed and tested with companies making materials, including at IBM, where we have applied it towards the development of a new photoacid generator, an important material in electronics manufacturing.
It offers material R&D three main advantages:
Firstly, it features an algorithm-based encoding and structure generation process, as opposed to a data-driven one, so there is no pre-training on large datasets, nor the major training costs that come with that task.
Secondly, the feature space and the structure generation process are fully interpretable for chemists, and therefore easy to customize with chemical insights about molecular structures.
Finally, hierarchical data structures and a clear UI provide a flexible and intuitive workflow.
The MolGX professional version [1], which includes additional functionality, is already being deployed as a service at materials manufacturing company NAGASE & CO., LTD., running on IBM’s and NAGASE’s cloud. Inverse design of sugar and dye molecules was carried out more than 10 times faster than by human chemists. In addition, the diversity of molecular structures was expanded while still satisfying chemical rationality.
Just imagine all the new materials we could discover at this accelerated speed, and all the problems we could solve. One of the ways to realize a sustainable future is to discover new materials that can address global issues. For example, we could create new materials with properties that can rapidly adsorb carbon dioxide or convert sunlight into electricity without losing energy.
The properties of a material are determined by the shape of the molecules that make it up, but the number of possible molecular shapes is almost infinite, which is why it takes an enormous amount of time to develop new materials – time we can’t afford to waste. There is a growing demand for technologies like MolGX that can optimize materials design in a quicker and more cost-efficient manner by utilizing AI and hybrid cloud. Our platform is an example of how AI and cutting-edge data processing technologies can put us on the fast track to the discovery of innovative materials that can make a significant impact on the environment and on our society.
A free trial version of MolGX, which uses a built-in dataset, is available today. IBM Research is also offering, under license, an unlimited version with additional functionality including data upload, results exportation, customized modeling, and more.
[1] Takeda, S. et al. Molecular Inverse-Design Platform for Material Industries. In Proceedings of the 26th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining (ACM, 2020).