Automated Derivation of MDP and Reinforcement Learning Models from Historical Data
While optimization models can provide immense value, creating them requires substantial time and expertise. In this presentation, using inventory replenishment optimization as a running example, we describe how sequential discrete-time optimization models can be generated automatically from historical data through a combination of Markov Decision Process (MDP) and Reinforcement Learning (RL) models. Our goal is to significantly reduce the time and skill needed to create such models, thereby making the benefits of optimization much more widely available. We will also demonstrate the application of our work to the supply chain inventory management problem.
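To make the idea of deriving an MDP from historical data concrete, here is a minimal sketch, not the method presented in the talk. It assumes the history is available as (state, action, reward, next_state) records, for example (inventory level, replenishment decision, profit, next inventory level). It estimates transition probabilities and mean rewards by simple counting and then solves the resulting tabular MDP with value iteration. The function names `estimate_mdp`, `q_value`, and `value_iteration`, the record layout, and the discount factor are all illustrative assumptions.

```python
from collections import defaultdict


def estimate_mdp(history):
    """Estimate a tabular MDP from (state, action, reward, next_state) records.

    Returns P[(s, a)] = {next_state: empirical probability} and
    R[(s, a)] = mean observed reward. Purely count-based; hypothetical helper.
    """
    counts = defaultdict(lambda: defaultdict(int))
    reward_sums = defaultdict(float)
    for s, a, r, s_next in history:
        counts[(s, a)][s_next] += 1
        reward_sums[(s, a)] += r

    P, R = {}, {}
    for sa, next_counts in counts.items():
        n = sum(next_counts.values())
        P[sa] = {s2: c / n for s2, c in next_counts.items()}
        R[sa] = reward_sums[sa] / n
    return P, R


def q_value(P, R, V, sa, gamma):
    """Expected discounted return of the state-action pair `sa` under values V."""
    return R[sa] + gamma * sum(p * V[s2] for s2, p in P[sa].items())


def value_iteration(P, R, gamma=0.9, tol=1e-8, max_iter=10_000):
    """Solve the estimated MDP; returns (state values, greedy policy)."""
    # Every state seen either as a starting point or as an outcome.
    states = {s for s, _ in P} | {s2 for nxt in P.values() for s2 in nxt}
    V = {s: 0.0 for s in states}

    for _ in range(max_iter):
        delta = 0.0
        for s in states:
            qs = [q_value(P, R, V, sa, gamma) for sa in P if sa[0] == s]
            if not qs:
                continue  # no action was ever observed here; value stays 0
            best = max(qs)
            delta = max(delta, abs(best - V[s]))
            V[s] = best
        if delta < tol:
            break

    # Greedy policy over the actions that actually appear in the history.
    policy = {}
    for s in states:
        options = {sa[1]: q_value(P, R, V, sa, gamma) for sa in P if sa[0] == s}
        if options:
            policy[s] = max(options, key=options.get)
    return V, policy
```

A sketch like this only considers state-action pairs that occur in the data, so the resulting policy cannot recommend decisions that were never tried historically. That limitation is one reason an approach of this kind might be paired with reinforcement learning rather than relying on the estimated MDP alone.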