Publication
ACS Spring 2022
Poster
Molecular transformer-aided biocatalysed synthesis planning
Abstract
Enzyme catalysts are an integral part of green chemistry strategies towards a more sustainable and resource-efficient chemical synthesis. However, the retrosynthesis of given targets with biocatalysed reactions remains a significant challenge: the substrate specificity, the potential to catalyse unreported substrates, and the specific stereo- and regioselectivity properties are domain-specific knowledge factors that hinder the adoption of biocatalysis in day-to-day laboratory work. Here, we use the molecular transformer architecture to capture the latent knowledge about enzymatic activity from a large data set of publicly available enzymatic data, extending forward reaction and retrosynthetic pathway prediction to the domain of biocatalysis. We introduce a class token based on the EC classification scheme that allows the model to capture catalysis patterns shared among different enzymes belonging to the same hierarchical families. The forward prediction model achieves a top-5 accuracy of 62.7%, while the single-step retrosynthetic model shows a top-1 round-trip accuracy of 39.6%. The enzymatic data and the trained models are available through the RXN for Chemistry network (https://rxn.res.ibm.com and https://github.com/rxn4chemistry).
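As a rough illustration of the EC-based class token described above, the following sketch shows one way a truncated EC number could be appended to a tokenized substrate SMILES string before it is passed to a sequence model. The token format (`[EC1:1]`), the `|` separator, the simplified tokenizer, and all function names are assumptions made for this example; the abstract does not specify the actual encoding.

```python
import re

# Simplified SMILES tokenizer (an assumption for this sketch): bracket atoms,
# two-letter halogens, two-digit ring closures, otherwise single characters.
SMILES_TOKEN = re.compile(r"\[[^\]]+\]|Br|Cl|%\d{2}|.")


def tokenize_smiles(smiles: str) -> list[str]:
    """Split a SMILES string into tokens and check that nothing was lost."""
    tokens = SMILES_TOKEN.findall(smiles)
    if "".join(tokens) != smiles:
        raise ValueError(f"could not tokenize: {smiles!r}")
    return tokens


def ec_class_tokens(ec_number: str, levels: int = 3) -> list[str]:
    """Turn an EC number such as '1.1.1.1' into one token per hierarchy level.

    Keeping only the first `levels` levels means that enzymes from the same
    EC family map to the same tokens (hypothetical token format).
    """
    parts = ec_number.split(".")[:levels]
    return [f"[EC{depth}:{value}]" for depth, value in enumerate(parts, start=1)]


def encode_source(substrates: str, ec_number: str, levels: int = 3) -> str:
    """Build a space-separated model input: substrate tokens, '|', EC tokens."""
    tokens = tokenize_smiles(substrates) + ["|"] + ec_class_tokens(ec_number, levels)
    return " ".join(tokens)


# Ethanol with an alcohol dehydrogenase EC number, truncated to three levels.
print(encode_source("CCO", "1.1.1.1"))
```

With three levels retained, EC 1.1.1.1 and EC 1.1.1.2 produce identical class tokens, which is the kind of sharing across a hierarchical family that the class token is meant to expose to the model.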