Publication
MASCOTS 2020
Conference paper

Towards a common environment for learning scheduling algorithms


Abstract

We propose a way to model and integrate HPC scheduling simulators into a popular Reinforcement Learning toolkit. We show experimentally that this approach not only helps researchers iterate faster through software reuse, but also achieves state-of-the-art performance with 10x fewer interactions with the environment. We validate the simulation model's correctness using unit tests, assertions, and experimental comparisons. We also share an open-source implementation of the model that will benefit researchers working on resource management tasks assisted by Machine Learning.
