28 May 2021
Case study

AI for chemistry: IBM and Evonik Industries to boost material design

Evonik is collaborating with IBM Research Europe and the MIT-IBM Watson AI Lab to explore how AI can accelerate the development and optimization of materials.

Ever wondered how it’s been possible to make smartphone screens so much stronger than, say, a wine glass? Well, it took more than a decade to achieve a carefully designed mixture of alkali-aluminosilicate, followed by a manufacturing process that strengthens the glass. Or take lithium-ion batteries. Proposed as a concept in 1973, they were finally brought to market in 1991.

You guessed it: designing materials from scratch and improving existing products have traditionally been time-consuming. Done by trial and error, it typically takes some 10 to 20 years to discover a new material and bring it to market, or to keep improving its performance.

We want to change that.

Our team of scientists at the IBM Research Europe lab in Zurich is making this possible. We discovered a novel way to apply AI to chemistry, and Evonik, a German chemical company that produces and sells materials to companies looking to make and refine products, is one of the first industry partners to collaborate with us in our effort to accelerate materials discovery. With our AI algorithms, companies like Evonik will be able to predict and refine polymer properties, for example, which in turn should cut the time needed to solve a formulation problem to months rather than 10 or more years.

Needle in the haystack

The pace of developing new materials has long lagged far behind market demand. Consumers are looking for materials that perform better and are more sustainable and environmentally friendly. Yet we still rely heavily on serendipity for the material innovations that create the desired market impact.

One may think that with all the industrial digitalization, R&D and manufacturing processes are generating enough data to identify patterns that reduce costs and development time. However, the complexity intrinsic to the data prevents its full exploitation. Even a domain expert finds it challenging to identify useful patterns.

In material manufacturing, the data typically consists of recipes, processes, aging conditions, and material properties such as ductility or malleability. The high-dimensional space of the problem would need to be adequately sampled—and that’s where the difficulties start. It is rare to have sufficient data samples to characterize the entire space of the material manufacturing problem and this limits the possibility of identifying meaningful patterns.

It’s equivalent to spotting a needle in the materials manufacturing haystack.
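To make the sampling problem concrete, here is a rough back-of-the-envelope sketch. It assumes, purely for illustration, that every recipe or process variable is tested at the same number of levels on a full grid. The function name and the numbers are hypothetical and not taken from any real project.

```python
def grid_experiments(levels_per_variable: int, n_variables: int) -> int:
    """Number of experiments needed to cover a full grid of settings."""
    return levels_per_variable ** n_variables


# Hypothetical case: 5 levels per variable, growing number of variables.
for n_variables in (2, 5, 10):
    count = grid_experiments(5, n_variables)
    print(f"{n_variables} variables -> {count} experiments")
```

Even with only five levels per variable, ten variables would already require close to ten million experiments, which is why exhaustive sampling is rarely an option.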

AI image processing for materials

We’ve decided to use AI to capture all the data on a specific material development cycle. Inspired by an AI architecture used for image processing, we adopted deep learning models designed to recognize important features in images and applied them to material manufacturing. The use of these architectures allows us to easily identify the most important correlations and patterns.

Learning from image processing, we found that properly trained AI models could recognize and categorize materials data types, much as an AI model detects objects and patterns in images and videos. The model encodes each data type, apart from material properties, into a dedicated latent space using encoder-decoder neural-network structures (Figure 1). The architecture is general enough to describe a broad variety of scenarios [1].

Figure 1: Illustration of how the recipes, process parameters, and test conditions are encoded to the latent space and then reconstructed by a decoder. From the trained latent space, a property can be predicted by a regression model, for example an artificial neural network or a random forest.

Each encoder converts an input into a fixed-size vector of reduced dimensionality called a latent representation. The decoder then reconstructs the input from the latent representation with minimal reconstruction error. This procedure resembles that of zip files, where files (here, recipes, process parameters, and test conditions) are compressed into a single archive (here, a latent representation).
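The encode-then-reconstruct idea can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the architecture used in the project: the data is synthetic, and a linear encoder-decoder pair computed with a singular value decomposition (equivalent to principal component analysis) stands in for the trained neural networks. All variable names are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic "recipes": 200 samples with 6 ingredient values that are
# driven by only 2 hidden factors, plus a little measurement noise.
hidden_factors = rng.normal(size=(200, 2))
mixing = rng.normal(size=(2, 6))
recipes = hidden_factors @ mixing + 0.01 * rng.normal(size=(200, 6))

# Linear stand-in for a trained autoencoder: keep the top 2 principal
# directions of the centered data.
mean = recipes.mean(axis=0)
_, _, vt = np.linalg.svd(recipes - mean, full_matrices=False)
components = vt[:2]  # shape (2, 6)


def encode(x):
    """Map inputs to a fixed-size latent vector of reduced dimensionality."""
    return (x - mean) @ components.T


def decode(z):
    """Reconstruct inputs from their latent representation."""
    return z @ components + mean


latent = encode(recipes)
reconstruction_error = float(np.mean((decode(latent) - recipes) ** 2))
print(latent.shape, reconstruction_error)
```

Because the synthetic data really has only two underlying factors, a two-dimensional latent vector reconstructs the six inputs almost perfectly. A neural-network autoencoder plays the same role when the relationships are nonlinear.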

The latent representation forces the deep learning model to learn the important features in the data and minimize the noise. A neural network architecture links the features of latent space to the properties of the materials (Figure 2).

Figure 2: It is also possible to search the latent space to optimize the recipes and processes to obtain a desired set of target property values using a Gaussian process as illustrated.
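A latent-space search of this kind could look roughly like the sketch below. The article does not spell out how the Gaussian process is used, so this is only one plausible reading: a zero-mean Gaussian process with a squared-exponential kernel is fitted to latent points with known property values, and its posterior mean is then scanned over a grid of candidate latent points to find the one closest to a target. The toy property function, kernel settings, grid, and target value are all assumptions made for illustration.

```python
import numpy as np


def rbf_kernel(a, b, length_scale=1.0):
    """Squared-exponential kernel between the rows of a and the rows of b."""
    sq_dist = (
        np.sum(a**2, axis=1)[:, None]
        + np.sum(b**2, axis=1)[None, :]
        - 2.0 * a @ b.T
    )
    return np.exp(-0.5 * sq_dist / length_scale**2)


def gp_posterior_mean(train_z, train_y, query_z, noise=1e-4):
    """Posterior mean of a zero-mean Gaussian process at the query points."""
    k_train = rbf_kernel(train_z, train_z) + noise * np.eye(len(train_z))
    k_query = rbf_kernel(query_z, train_z)
    return k_query @ np.linalg.solve(k_train, train_y)


def toy_property(z):
    """Hypothetical material property as a function of a 2-D latent vector."""
    return np.sin(z[:, 0]) + 0.5 * z[:, 1]


# Latent points with "measured" property values (synthetic).
rng = np.random.default_rng(1)
train_z = rng.uniform(-2.0, 2.0, size=(40, 2))
train_y = toy_property(train_z)

# Scan a grid of candidate latent points with the fitted surrogate.
axis = np.linspace(-2.0, 2.0, 41)
candidates = np.array([[u, v] for u in axis for v in axis])
predicted = gp_posterior_mean(train_z, train_y, candidates)

# Pick the candidate whose predicted property is closest to the target.
target = 1.0
best_index = int(np.argmin(np.abs(predicted - target)))
best_latent = candidates[best_index]
best_prediction = float(predicted[best_index])
print(best_latent, best_prediction)
```

In a full pipeline, the selected latent point would then be passed through the decoder to obtain a candidate recipe and process settings to test in the lab.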

Collaborating with industry

While publicly available materials formulation data is nearly non-existent, companies that manufacture materials make such data part of their core digital business.

It is with such companies that we put our technology to the test. In close collaboration with Evonik, we applied our AI solution to the space of high-performance polymers. Together, we customized the entire AI architecture to predict polymer properties, refine existing formulations as well as generate new ones.

The AI models we designed address three types of predictions: material properties, compositions and processing parameters. Given a list of ingredients and quantities, the model can predict selected mechanical, physical and chemical properties of a product at specific test conditions, such as temperature or other parameters.

The model also takes into account selected process parameters that are part of a manufacturing process. The advantage here is that the predictions serve as a virtual experiment. Starting from ingredients and process parameters, the model predicts quantities that can be directly measured in the product at the end of the manufacturing process.

Essentially, the AI model predictions act as a compass, pointing researchers and chemists towards quicker breakthroughs in discovering new formulations for new materials to create innovative products.

In the best cases, our AI models achieved a coefficient of determination (R²) of 0.70 to 0.87 on unseen formulations. R² measures how closely the predicted values match the true values. Values close to one indicate highly accurate predictions.
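For readers unfamiliar with the metric, the short sketch below computes R² from its standard definition: one minus the ratio of the residual sum of squares to the total sum of squares. The measured and predicted values are made up for illustration and are not results from the project.

```python
def r_squared(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_true = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


# Hypothetical measured vs. predicted property values.
measured = [1.0, 2.0, 3.0, 4.0]
predicted = [1.1, 1.9, 3.2, 3.8]
score = r_squared(measured, predicted)
print(round(score, 2))  # prints 0.98
```

A perfect model scores exactly 1, while a model that always predicts the mean of the measured values scores 0.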

The same AI models can also predict recipes—ingredients and quantities—to produce materials with certain specifications. This should reduce the number of lab experiments by providing a set of predictions to start with.

Such AI architectures are general enough for a broad range of industrial problems. In addition to the work done with Evonik on polymeric materials, we’ve applied this algorithm in various material design processes, including the design of metallic alloys in the metallurgical industry and the optimization of epoxy resins. The potential goes beyond R&D, with direct application in the production of materials, provided proper data curation processes for AI are in place.

To explore all the possibilities of AI in depth, Evonik has expanded its cooperation with IBM in the field of digitization, and is the first chemical company to be part of the MIT-IBM Watson AI Lab.


  1. Gaudin, T., Schilter, O., Zipoli, F., et al. Advanced Data-Driven Manufacturing. ERCIM News. 122 (2020).