IBM and University of Liverpool scientists’ algorithm saves 500,000 CPU hours on simulation to quickly identify new materials for gas storage.
Simulation can be a powerful tool for materials discovery, but it is often too computationally intensive to run the simulations you want at the scale you need.
We have addressed this challenge by building an algorithm that intelligently selects which simulations are worth running, and focuses resources on them.
In collaboration with the University of Liverpool, we used this algorithm to save over 500,000 CPU hours (CPUh) on an expensive simulation campaign, allowing them to quickly identify new materials for gas storage. We describe our work in a recent paper, “Accelerating Computational Discovery of Porous Solids Through Improved Navigation of Energy Structure Function Maps,” published in Science Advances.1
Our algorithm belongs to a family of methods called Bayesian optimization, which has a variety of applications in the wider world. In general, Bayesian optimization answers the question:
With what I know now, what should I do next so that I get the best overall result in the future?
Although Bayesian optimization is not as widely used as it perhaps deserves to be, we have been pioneering its application beyond traditional uses such as hyperparameter tuning. We’ve extended it to cover engineering, materials chemistry, drug discovery, and even quantum computing. In the past, we used it to help optimize the signal integrity of part of IBM's PowerPC chip.
We have packed all of our algorithmic advances into a simple service called IBM Bayesian Optimization (IBO), which means that users can benefit from these algorithms without having to become Bayesian optimization experts themselves.
We think that this technology is particularly suited to the search for new materials, such as pharmaceuticals. You can think of its approach as broadly similar to how you might hunt for lost keys. You probably wouldn’t search for them systematically, but would rather gather information and go from place to place based on where the keys are most likely to be.
Our algorithm hunts for promising materials in a similar way: It gathers information, generates updates on where to look, and then decides on the next place to search.
This method is much more efficient than brute force, or simply looking everywhere. It enables users to make dramatic computational savings, allowing them to either do something they previously could not afford to do or to do more with the same budget.
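The gather, update, decide loop described above can be sketched in a few lines. This is a toy illustration of generic Bayesian optimization, not the IBO service or the algorithm from the paper: the function `expensive_simulation`, the one-dimensional candidate grid, the Gaussian-process surrogate, and the upper-confidence-bound rule with its `kappa` parameter are all assumptions chosen to keep the example small.

```python
import math

def rbf(a, b, length=0.15):
    """Squared-exponential kernel between two scalars."""
    return math.exp(-0.5 * ((a - b) / length) ** 2)

def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= f * M[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        x[i] = (M[i][n] - sum(M[i][j] * x[j] for j in range(i + 1, n))) / M[i][i]
    return x

def posterior(xs, ys, x_new, noise=1e-4):
    """Gaussian-process posterior mean and std at x_new (zero prior mean)."""
    K = [[rbf(a, b) + (noise if i == j else 0.0) for j, b in enumerate(xs)]
         for i, a in enumerate(xs)]
    k_star = [rbf(a, x_new) for a in xs]
    alpha = solve(K, ys)
    v = solve(K, k_star)
    mean = sum(k * a for k, a in zip(k_star, alpha))
    var = max(1.0 - sum(k * w for k, w in zip(k_star, v)), 0.0)
    return mean, math.sqrt(var)

def expensive_simulation(x):
    """Hypothetical stand-in for a costly simulation (toy two-peak score)."""
    return (math.exp(-((x - 0.7) / 0.1) ** 2)
            + 0.5 * math.exp(-((x - 0.2) / 0.08) ** 2))

def bayes_opt(candidates, budget, kappa=2.0):
    """Spend `budget` evaluations where the surrogate looks most promising."""
    xs = [candidates[0], candidates[-1]]           # gather initial information
    ys = [expensive_simulation(x) for x in xs]
    while len(xs) < budget:
        remaining = [c for c in candidates if c not in xs]

        def ucb(c):
            mean, std = posterior(xs, ys, c)       # update beliefs
            return mean + kappa * std              # optimism under uncertainty

        nxt = max(remaining, key=ucb)              # decide where to look next
        xs.append(nxt)
        ys.append(expensive_simulation(nxt))
    best = max(range(len(xs)), key=lambda i: ys[i])
    return xs[best], ys[best], len(xs)

if __name__ == "__main__":
    grid = [i / 50 for i in range(51)]
    best_x, best_y, n_evals = bayes_opt(grid, budget=15)
    print(f"best x={best_x:.2f}, score={best_y:.3f}, "
          f"{n_evals} of {len(grid)} candidates simulated")
```

Brute force would simulate all 51 grid points; here the loop stops after 15. The saving in a real campaign depends on the surrogate model and on how the search space is structured.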
The application of Bayesian optimization to materials discovery is rapidly maturing, and indeed forms a key part of IBM’s Accelerated Discovery strategy.
To apply this technique to a real materials discovery problem, we asked our algorithm to simultaneously hunt for materials that had good gas storage properties and that were also crystallized in such a way that they would be easy to observe in the lab.
In our paper, we show how we achieved this by extending an algorithm we previously developed, called Batch Generalized Thompson Sampling. The extension deals with multiple objectives (in this case the gas storage property score and the lattice energy score), and we showed that it can be applied to accelerate a new computational technique known as Energy Structure Function (ESF) Maps.2 These ESF Maps are powerful for computational materials design, but are far too expensive for routine use.
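To give a feel for how batched Thompson sampling can balance two objectives, here is a minimal sketch. It is not the Batch Generalized Thompson Sampling algorithm from the paper: it treats every candidate as an independent arm with a Gaussian belief per objective, combines the two objectives with a random weight per batch slot, and uses made-up scores where higher is better. The names `Belief`, `select_batch`, and `run`, and all numeric settings, are hypothetical.

```python
import random

class Belief:
    """Gaussian belief about one objective of one candidate (known noise)."""

    def __init__(self, mean=0.5, var=0.25):
        self.mean, self.var = mean, var

    def sample(self, rng):
        """Draw one plausible value of the score from the current belief."""
        return rng.gauss(self.mean, self.var ** 0.5)

    def update(self, y, noise_var=0.01):
        """Conjugate normal update for a single noisy observation y."""
        precision = 1.0 / self.var + 1.0 / noise_var
        self.mean = (self.mean / self.var + y / noise_var) / precision
        self.var = 1.0 / precision

def select_batch(beliefs, batch_size, rng):
    """Fill each batch slot with the winner of a fresh posterior draw."""
    batch = []
    for _ in range(batch_size):
        w = rng.random()  # random trade-off between the two objectives
        scores = {
            name: w * storage.sample(rng) + (1.0 - w) * energy.sample(rng)
            for name, (storage, energy) in beliefs.items()
            if name not in batch  # keep candidates distinct within a batch
        }
        batch.append(max(scores, key=scores.get))
    return batch

def run(n_candidates=30, rounds=4, batch_size=3, seed=0):
    """Toy campaign: hidden true scores, noisy evaluations, batched picks."""
    rng = random.Random(seed)
    # Hypothetical ground truth: (storage score, energy score) per candidate.
    truth = {f"c{i}": (rng.random(), rng.random()) for i in range(n_candidates)}
    beliefs = {name: (Belief(), Belief()) for name in truth}
    evaluated = []
    for _ in range(rounds):
        for name in select_batch(beliefs, batch_size, rng):
            for belief, value in zip(beliefs[name], truth[name]):
                belief.update(value + rng.gauss(0.0, 0.1))
            evaluated.append(name)
    return evaluated, beliefs, truth

if __name__ == "__main__":
    picked, _, _ = run()
    print("evaluation order:", picked)
```

Because the beliefs here are independent, evaluating one candidate tells the sketch nothing about its neighbours. A practical surrogate would share information between similar structures, which is where most of the computational saving comes from.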
Because our method reduces the need to perform many expensive simulations, it also cuts the cost of generating these ESF Maps by 500,000 CPUh, equivalent to a small supercomputing grant.
There are, however, still challenges to overcome.
For example, it’s important to figure out how to work in highly complex conditions, such as with multiple constraints and objectives, without passing this complexity to the user. Another challenge is to take information from a variety of sources into account while performing the optimization.
These are our next steps. We are certain to make more progress with our research and to speed up the materials discovery process even further.
Date: 13 Aug 2021
- Note 1: ESF Maps “describe the possible structures and properties that are available to a candidate molecule” (for simulation).
1. Edward O. Pyzer-Knapp, Linjiang Chen, Graeme M. Day, Andrew I. Cooper. Accelerating Computational Discovery of Porous Solids Through Improved Navigation of Energy Structure Function Maps. Science Advances. Vol. 7, no. 33, eabi4763. (2021).
2. Pulido, A., Chen, L., Kaczorowski, T. et al. Functional materials discovery using energy–structure–function maps. Nature. 543, 657–664 (2017).