Publication
MASCOTS 2020
Conference paper
Towards a common environment for learning scheduling algorithms
Abstract
We propose a way to model and integrate HPC scheduling simulators into a popular Reinforcement Learning toolkit. We show experimentally that this approach not only helps researchers iterate faster through software reuse, but also achieves state-of-the-art performance with 10x fewer interactions with the environment. We validate the correctness of the simulation model using unit tests, assertions, and experimental comparisons. We also share an open-source implementation of the model that will benefit researchers working on resource management tasks assisted by Machine Learning.