OpenMP to CUDA graphs: a compiler-based transformation to enhance the programmability of NVIDIA devices
Document type: Conference lecture
Publisher: Association for Computing Machinery (ACM)
Rights access: Open Access
European Commission's project: AMPERE - A Model-driven development framework for highly Parallel and EneRgy-Efficient computation supporting multi-criteria optimisation (EC-H2020-871669)
Heterogeneous computing is increasingly used across a wide range of computing systems, from HPC to the real-time embedded domain, to meet performance requirements. Given the variety of accelerators (e.g., FPGAs and GPUs), high-level parallel programming models are desirable for exploiting their performance capabilities while maintaining an adequate level of productivity. In that regard, OpenMP is a well-known high-level programming model that incorporates powerful tasking and accelerator models capable of efficiently exploiting structured and unstructured parallelism in heterogeneous computing. This paper presents a novel compiler transformation technique that automatically transforms OpenMP code into CUDA graphs, combining the programmability of a high-level programming model such as OpenMP with the performance of a low-level programming model such as CUDA. Evaluations have been performed on two NVIDIA GPUs, one from the HPC domain and one from the embedded domain: the V100 and the Jetson AGX, respectively.
Citation: Yu, C.; Royuela Alcázar, S.; Quiñones, E. OpenMP to CUDA graphs: a compiler-based transformation to enhance the programmability of NVIDIA devices. In: SCOPES: Software and Compilers for Embedded Systems Conference. "SCOPES '20: Proceedings of the 23th International Workshop on Software and Compilers for Embedded Systems: May 2020". New York, NY, USA: Association for Computing Machinery (ACM), 2020, p. 42-47.