Efficient fork-join on GPUs through warp specialization. Arpith Chacko Jacob, Alexandre E. Eichenberger, et al. 2017. HiPC 2017.
Implementing implicit OpenMP data sharing on GPUs. Gheorghe-Teodor Bercea, Carlo Bertolli, et al. 2017. LLVM-HPC/SC 2017.
An open-source solution to performance portability for Summit and Sierra supercomputers. Gheorghe-Teodor Bercea, A. Bataev, et al. 2020. IBM J. Res. Dev.
Hybrid CPU/GPU tasks optimized for concurrency in OpenMP. Alexandre E. Eichenberger, Gheorghe-Teodor Bercea, et al. 2020. IBM J. Res. Dev.