Pavithra Harsha, Ashish Jagmohan, et al.
INFORMS 2021
In this work, we discuss Programmable Actor Reinforcement Learning (PARL), a policy iteration method that uses techniques from integer programming and sample average approximation. We numerically benchmark the algorithm in complex supply chain settings where the optimal solution is intractable and show that it performs comparably to, and sometimes better than, state-of-the-art RL and commonly used inventory management benchmarks.
Kevin Tang, Yihua Li, et al.
INFORMS 2022
Rares Christian, Pavithra Harsha, et al.
INFORMS 2022
Krishnasuri Narayanam, Pankaj Dayama, et al.
ICBC 2022