How to use AI to discover new drugs and materials with limited data
IBM Research is working on new ways to generate material designs with AI using dozens of training examples, instead of the tens of thousands often required.
Over the last few years, we’ve seen that advances in deep-learning generative AI models can open up new ways to find molecules for drug and materials discovery. These models, when trained effectively, can provide the spark of inspiration for uncovering molecular combinations that scientists may never have considered trying, or ideas that would've taken years to figure out. What in the past may have taken decades to discover can, in some cases, now be achieved in a matter of months, as in our work accelerating the discovery of new antimicrobial peptides.
But generative models like these generally require large amounts of training data to learn. Training takes time and a lot of energy, and in some cases, there just isn’t enough data available to adequately train the models. By treating molecules as graphs, however, and learning the grammar of the graph, we developed a method that can work with tens to hundreds of training examples, whereas deep learning models can require up to nearly 100,000. This lets us generate candidates for testing faster and more flexibly, shortening the pipeline for creating new materials, such as pharmaceuticals.
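To make the idea of "molecules as graphs" with a "grammar" more concrete, here is a minimal Python sketch. It is not the method from the paper: the rules below are hand-written and hypothetical, whereas the paper learns its grammar from data. The class `MolGraph`, the placeholder label `X`, and the `RULES` list are all assumptions made for illustration. The sketch only shows how a small set of production rules can generate many different molecular graphs.

```python
import random
from dataclasses import dataclass, field


@dataclass
class MolGraph:
    """A molecule as a graph: nodes are atom labels, edges are bonds."""
    atoms: list = field(default_factory=list)  # e.g. ["C", "O", "X"]
    bonds: set = field(default_factory=set)    # index pairs (i, j), i < j

    def add_atom(self, label):
        self.atoms.append(label)
        return len(self.atoms) - 1

    def add_bond(self, i, j):
        self.bonds.add((min(i, j), max(i, j)))


# Hypothetical hand-written production rules (NOT learned, NOT from the paper).
# "X" is a nonterminal placeholder; each rule rewrites X into a short chain.
RULES = [
    ["C", "X"],       # extend the backbone by one carbon
    ["C", "O", "X"],  # add a C-O unit and keep growing
    ["C"],            # stop growing with a final carbon
]
TERMINAL_RULE = ["C"]


def apply_rule(graph, idx, rule):
    """Replace the placeholder at idx with the chain described by rule."""
    graph.atoms[idx] = rule[0]  # existing bonds to idx are kept
    prev = idx
    for label in rule[1:]:
        new = graph.add_atom(label)
        graph.add_bond(prev, new)
        prev = new


def generate(seed, max_steps=8):
    """Derive one molecular graph by repeatedly rewriting the placeholder."""
    rng = random.Random(seed)
    graph = MolGraph()
    graph.add_atom("X")
    for _ in range(max_steps):
        if "X" not in graph.atoms:
            break
        apply_rule(graph, graph.atoms.index("X"), rng.choice(RULES))
    if "X" in graph.atoms:  # force termination if the step budget ran out
        apply_rule(graph, graph.atoms.index("X"), TERMINAL_RULE)
    return graph


if __name__ == "__main__":
    for seed in range(3):
        g = generate(seed)
        print(g.atoms, sorted(g.bonds))
```

Because a grammar is a compact set of reusable rules rather than millions of network weights, it is plausible that far fewer examples are needed to fit one, which is the intuition behind the data efficiency described above.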
The work, to be presented at the 2022 International Conference on Learning Representations (ICLR) [1], dives into this method of molecular generation. Jie Chen, a researcher at the MIT-IBM Watson AI Lab, recently sat down with Thomas Asche, R&D Digitalization lead at Evonik Industries, and IBM Research’s Shaheen Parks to discuss how their work can be used to discover new polymers.
If you'd like to read the presentation Jie Chen discussed in the video, click here.
References
[1] Guo, M., Thost, V., Li, B., et al. Data-Efficient Graph Grammar Learning for Molecular Generation. ICLR 2022. 15 Mar 2022.