One of the main issues with AI-based retrosynthesis planning algorithms is that the proposed disconnection strategies usually lack diversity. When the goal is to find a suitable set of precursors for a given target molecule, the generated precursors typically fall into the same chemical macro class (e.g., protection, deprotection, or the same C-C bond formation with a slightly different set of reagents), and automatic synthesis planning tools can get stuck. Most previous approaches do not allow the machine learning model to explore more broadly and focus on the top single-step predictions, which can be detrimental to the multi-step strategy. To enhance diversity in our approach, we introduce macro-class tokens in the training inputs. The embeddings learned for a given sample then partly encode some characteristics of the reactions belonging to its class. At test time, the macro classes allow us to steer the model towards different kinds of disconnection strategies. In this work, we show with results on a set of patent data that the diversity of the predictions improves consistently. While excessively specific groupings can decrease model performance in terms of valid proposed sets of precursors, using chemically relevant policies to construct smaller macro groups makes it possible to recover the quality of the predictions without losing the gained diversity.
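The class-token idea can be illustrated with a minimal sketch. Everything named here is an assumption made for illustration and is not taken from the work itself: the `<class_k>` token format, the helpers `tag_input` and `diverse_candidates`, and the stand-in `toy_predict`, which takes the place of a trained single-step retrosynthesis model. The sketch only shows the mechanics: a macro-class token is prepended to the target molecule's SMILES string, and at test time the same target is queried once per class to obtain one candidate per disconnection strategy.

```python
from typing import Callable, Dict


def class_token(class_id: int) -> str:
    """Hypothetical token format for a macro class (an assumption)."""
    return f"<class_{class_id}>"


def tag_input(product_smiles: str, class_id: int) -> str:
    """Prepend the macro-class token to the target molecule's SMILES."""
    return f"{class_token(class_id)} {product_smiles}"


def diverse_candidates(
    product_smiles: str,
    num_classes: int,
    predict: Callable[[str], str],
) -> Dict[int, str]:
    """Query the model once per macro class and collect one candidate each."""
    return {
        class_id: predict(tag_input(product_smiles, class_id))
        for class_id in range(num_classes)
    }


def toy_predict(tagged_input: str) -> str:
    """Stand-in for a trained model: echoes the class it was conditioned on."""
    token, smiles = tagged_input.split(" ", 1)
    return f"precursors_for({smiles})_via_{token}"


if __name__ == "__main__":
    for class_id, candidate in diverse_candidates("CCO", 3, toy_predict).items():
        print(class_id, candidate)
```

With a real model in place of `toy_predict`, candidates that are invalid or duplicated across classes would still need to be filtered out before use in a multi-step search.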