Integrating GPU support for OpenMP offloading directives into clang

Carlo Bertolli; Samuel F. Antao; Gheorghe-Teodor Bercea; Arpith Chacko Jacob; Alexandre E. Eichenberger; Tong Chen; Zehra Sura; Hyojin Sung; Georgios Rokos; David Appelhans; Kevin O'Brien

doi:10.1145/2833157.2833161

LLVM-HPC 2015

Conference paper

15 Nov 2015

Abstract

The LLVM community is currently developing OpenMP 4.1 support, consisting of software improvements for Clang and new runtime libraries. OpenMP 4.1 includes offloading constructs that permit execution of user-selected regions on generic devices external to the main host processor. This paper describes our ongoing work towards delivering support for OpenMP offloading constructs for the OpenPower system in the LLVM compiler infrastructure. We previously introduced a design for a control loop scheme necessary to implement the OpenMP generic offloading model on NVIDIA GPUs. In this paper we show how we integrated the control loop into Clang, containing its complexity by limiting its support to OpenMP-related functionality. We also briefly report the results of performance analysis on benchmarks and a complex application kernel. Finally, we show an optimization in the Clang code generation scheme for specific code patterns, used as an alternative to the control loop, which delivers improved performance.
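For readers unfamiliar with the offloading constructs mentioned in the abstract, the following is a minimal sketch of an OpenMP target region in C. It is not taken from the paper: the function name `saxpy_offload` and the kernel itself are hypothetical, chosen only to illustrate how a user-selected region and its data mappings are expressed. Whether this particular loop shape corresponds to the code patterns the paper optimizes is not stated in the abstract.

```c
#include <stddef.h>

/* Hypothetical example kernel (not from the paper): y[i] = a * x[i] + y[i].
 * The target construct marks the region the user selects for offloading.
 * If no device is available, or the file is compiled without OpenMP
 * support, the loop simply runs on the host. */
void saxpy_offload(size_t n, float a, const float *x, float *y)
{
    /* map(to: ...) copies x to the device before the region starts;
     * map(tofrom: ...) copies y to the device and back to the host. */
    #pragma omp target teams distribute parallel for \
        map(to: x[0:n]) map(tofrom: y[0:n])
    for (size_t i = 0; i < n; ++i) {
        y[i] = a * x[i] + y[i];
    }
}
```

Building this with an offloading-capable compiler typically requires `-fopenmp` plus a toolchain-specific device target option; the exact flags depend on the compiler version and are not specified here.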
