How to use AI to discover new drugs and materials with limited data

IBM Research is working on new ways to generate material designs with AI using only dozens of training examples, instead of the tens of thousands often required.

Over the last few years, we’ve seen that advances in deep-learning generative AI models (see Note 1) can open up new ways to find molecules for drug and materials discovery. These models, when trained effectively, can provide the spark of inspiration for uncovering molecular combinations that scientists may never have considered trying, or ideas that would've taken years to figure out. What in the past may have taken decades to discover can, in some cases, now be achieved in a matter of months, as in our work on accelerating the discovery of new antimicrobial peptides.

But generative models like these generally require large amounts of training data to learn. Training on that much data takes time and a lot of energy, and in some cases, there just isn’t enough data available to adequately train the models. By treating molecules as graphs and learning the grammar of those graphs, however, we developed a method that needs only tens to hundreds of training examples, whereas deep learning models can require up to nearly 100,000. This lets us generate candidates for testing faster and more flexibly, shortening the pipeline for creating new materials, such as pharmaceuticals.
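
To make the idea of a graph grammar more concrete, here is a minimal Python sketch. It is a toy illustration, not the method from the ICLR paper: the paper learns its grammar from data, while the three production rules below are written by hand. The names `MolGraph`, `Rule`, `expand`, `generate`, and `RULES` are hypothetical and exist only for this example. A molecule is stored as a graph of atoms and bonds, and each rule replaces a placeholder node `"X"` with a small fragment until no placeholders remain.

```python
from dataclasses import dataclass, field


@dataclass
class MolGraph:
    """A molecule as a graph: labelled nodes (atoms) and undirected edges (bonds)."""
    labels: dict = field(default_factory=dict)  # node id -> "C", "O", ... or "X"
    bonds: set = field(default_factory=set)     # one frozenset({u, v}) per bond
    next_id: int = 0

    def add_node(self, label):
        node = self.next_id
        self.labels[node] = label
        self.next_id += 1
        return node

    def add_bond(self, u, v):
        self.bonds.add(frozenset((u, v)))


@dataclass(frozen=True)
class Rule:
    """Hypothetical production rule: replace one "X" node with a fragment."""
    atoms: tuple     # labels of the fragment's nodes; "X" marks a new placeholder
    bonds: tuple     # (i, j) index pairs into `atoms`
    anchor: int = 0  # fragment node that inherits the replaced node's bonds


def expand(g, node, rule):
    """Apply `rule` to the placeholder `node` of graph `g`, in place."""
    assert g.labels[node] == "X", "only placeholder nodes can be expanded"
    new_ids = [g.add_node(label) for label in rule.atoms]
    for i, j in rule.bonds:
        g.add_bond(new_ids[i], new_ids[j])
    # Move every bond of the placeholder onto the fragment's anchor atom.
    for bond in [b for b in g.bonds if node in b]:
        g.bonds.remove(bond)
        (other,) = bond - {node}
        g.add_bond(other, new_ids[rule.anchor])
    del g.labels[node]


def generate(rules, choices):
    """Start from a single placeholder and apply the chosen rules in order."""
    g = MolGraph()
    g.add_node("X")
    for idx in choices:
        placeholders = sorted(n for n, lab in g.labels.items() if lab == "X")
        if not placeholders:
            break
        expand(g, placeholders[0], rules[idx])
    return g


# Hand-written toy grammar (an assumption for illustration, not learned rules).
RULES = [
    Rule(atoms=("C", "X"), bonds=((0, 1),)),  # X -> C-X  (extend the chain)
    Rule(atoms=("O",), bonds=()),             # X -> O    (end with oxygen)
    Rule(atoms=("C",), bonds=()),             # X -> C    (end with carbon)
]

skeleton = generate(RULES, [0, 0, 1])
print(sorted(skeleton.labels.values()))  # ['C', 'C', 'O']
print(len(skeleton.bonds))               # 2
```

Applying the rules in the order 0, 0, 1 yields the heavy-atom skeleton C-C-O. Because every generated structure comes from a small set of rules, a grammar of this kind can in principle be inferred from few examples, which is the intuition behind the data efficiency described above.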

The work, to be presented at the 2022 International Conference on Learning Representations¹ (ICLR), dives into this method of molecular generation. Jie Chen, a researcher at the MIT-IBM Watson AI Lab, and Thomas Asche, R&D Digitalization lead at Evonik Industries, recently sat down with IBM Research’s Shaheen Parks to discuss how their work can be used to discover new polymers.

Watch: Using AI to discover new drugs and materials with limited data

If you'd like to read the presentation Jie Chen discussed in the video, click here.

Notes

Note 1: Learn how, by using generative models to come up with new ideas, we can dramatically accelerate the pace at which we discover new molecules, materials, drugs, and more.

References

1. Guo, M., Thost, V., Li, B., et al. Data-Efficient Graph Grammar Learning for Molecular Generation. ICLR 2022. 15 Mar 2022.
