Dynamic load balancing in GPU-based systems for a MPI program

Alvaro Luiz Fazenda; Celso L. Mendes; Laxmikant V. Kale; Jairo Panetta; Eduardo Rocha Rodrigues

doi:10.1109/HPCSim.2014.6903681

HPCS 2014

Conference paper

18 Sep 2014

Abstract

The dynamic load-balancing framework Charm++/AMPI, developed at the University of Illinois, is based on processor virtualization, which allows thread migration across processors. This framework has been successfully applied to many scientific applications, such as BRAMS, NAMD, and ChaNGa. Most of these applications run only on CPUs and do not use accelerators. However, the use of GPUs to improve computational performance is spreading rapidly in the high-performance computing community. This paper investigates how the same Charm++/AMPI framework can be extended to balance load in a synthetic application inspired by the BRAMS numerical forecast model, running on GPUs instead of CPUs. Several major questions involving the use of GPUs with AMPI were handled in this work, including how to measure the GPU's load, how to use and share GPUs among user-level threads, and what results are obtained when applying the required over-decomposition technique to a GPU-accelerated program.
