A substantial increase in the availability of computational resources and open-access chemical data sets has led to growing interest in computer-aided synthesis planning and the construction of biochemical reaction networks. Both problems reduce to the same task: predicting the correct product given substrates and catalysts, and vice versa. Current methods for this task are commonly categorized as either rule-based expert systems or data-driven machine learning approaches. Rule-based expert systems are built from manually created and curated reaction rules and therefore depend on knowledgeable chemists or biochemists to define those rules. Data-driven machine learning approaches, by contrast, are based on complex models that require large amounts of input data. The former are examples of symbolic AI: they are readily interpretable but scale poorly. The latter are commonly subsymbolic approaches that scale well but are hard to interpret. Here, we propose bridging the two categories by inferring reaction rules from a transformer model trained on a combination of organic and biocatalysed reaction data sets. We infer an atom mapping for each reaction from the attention weights of the transformer encoder and convert it into a corresponding BNICE reaction rule. We then derive a small number of generalized consensus rules from this exhaustive set of reaction rules. Finally, we show that these consensus rules can mimic an expert-curated rule-based system without requiring manual curation of rules by experts. This method enables the symbiosis of rule-based expert systems and data-driven machine learning approaches for both computer-aided synthesis planning and the construction of biochemical reaction networks.
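To illustrate the step from an atom mapping to a reaction rule, the following minimal sketch finds the bonds that change between substrates and products once every atom carries a map number. It is an illustration under stated assumptions, not the implementation used in this work: it assumes the atom mapping is already available (the attention-based inference is not shown), it represents molecules as plain lists of `(atom_map_a, atom_map_b, bond_order)` tuples, and the names `bond_table` and `reaction_center` are hypothetical. The resulting set of changed bonds is only the core from which a rule could be written and is not the BNICE rule format itself.

```python
from typing import Dict, FrozenSet, Iterable, Tuple

# Hypothetical representation: a bond is (atom_map_a, atom_map_b, bond_order).
MappedBond = Tuple[int, int, float]


def bond_table(bonds: Iterable[MappedBond]) -> Dict[FrozenSet[int], float]:
    """Index bond orders by the unordered pair of atom-map numbers."""
    return {frozenset((a, b)): order for a, b, order in bonds}


def reaction_center(
    substrate_bonds: Iterable[MappedBond],
    product_bonds: Iterable[MappedBond],
) -> Dict[Tuple[int, ...], Tuple[float, float]]:
    """Return every bond whose order differs between the two sides.

    Keys are sorted atom-map pairs; values are (order_before, order_after),
    where 0 means the bond is absent on that side.
    """
    before = bond_table(substrate_bonds)
    after = bond_table(product_bonds)
    changes = {}
    for bond in before.keys() | after.keys():
        old = before.get(bond, 0)
        new = after.get(bond, 0)
        if old != new:
            changes[tuple(sorted(bond))] = (old, new)
    return changes


# Assumed example: hydrolysis of methyl acetate, heavy atoms only.
# Map numbers: 1 = CH3, 2 = carbonyl C, 3 = carbonyl O,
#              4 = ester O, 5 = O-methyl C, 6 = water O.
substrates = [(1, 2, 1), (2, 3, 2), (2, 4, 1), (4, 5, 1)]
products = [(1, 2, 1), (2, 3, 2), (2, 6, 1), (4, 5, 1)]

changes = reaction_center(substrates, products)
for atoms, (old, new) in sorted(changes.items()):
    print(atoms, old, "->", new)
```

In this example only the C-O bond that breaks (atoms 2 and 4) and the C-O bond that forms (atoms 2 and 6) are reported. Unchanged bonds are dropped, which is what allows many concrete reactions to share one rule.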