JACC: An OpenACC runtime framework with kernel-level and multi-GPU parallelization

Cite as:
hdl:2117/364396
Document type: Conference report
Defense date: 2021
Publisher: Institute of Electrical and Electronics Engineers (IEEE)
Rights access: Open Access
All rights reserved. This work is protected by the corresponding intellectual and industrial
property rights. Without prejudice to any existing legal exemptions, reproduction, distribution, public
communication or transformation of this work are prohibited without permission of the copyright holder
Abstract
Rapid developments in computing technology have given directive-based programming models a principal role in maintaining the software portability of performance-critical applications. Such models keep the engineering cost of enabling acceleration on multiple architectures low, since programmers only need to add meta information to sequential code. Obtaining the best possible efficiency, however, is often challenging. Directives inserted by the programmer can have side effects that limit the optimizations available to the compiler, which can degrade performance. The problem is exacerbated on multi-GPU systems, because pragmas do not automatically adapt to such systems and require expensive, time-consuming code adjustment by programmers. This paper introduces JACC, an OpenACC runtime framework that enables the dynamic extension of OpenACC programs by serving as a transparent layer between the program and the compiler. We add a versatile code-translation method for multi-device utilization, by which manually optimized applications can be distributed automatically while keeping the original code structure and parallelism. In some cases, we show nearly linear scaling of kernel execution on NVIDIA V100 GPUs. When multiple GPUs are used adaptively, the resulting performance improvements amortize the latency of GPU-to-GPU communication.
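To make the abstract's premise concrete, here is a minimal sketch in C. It is not taken from the paper and is not JACC's interface. It shows a sequential loop annotated with a single OpenACC directive, followed by the kind of manual index-range split a programmer would otherwise write to spread the same loop over several GPUs. The names `saxpy`, `chunk_bounds`, and `saxpy_split` are hypothetical, and the chunks run one after another on the current device instead of each being bound to its own GPU. A compiler without OpenACC support ignores the pragma and runs the loop sequentially.

```c
#include <assert.h>

/* Single-device version: one directive added to otherwise sequential code. */
void saxpy(int n, float a, const float *x, float *y)
{
    #pragma acc parallel loop copyin(x[0:n]) copy(y[0:n])
    for (int i = 0; i < n; ++i)
        y[i] = a * x[i] + y[i];
}

/* Hypothetical helper: half-open range [*begin, *end) owned by device `dev`
 * when n iterations are split as evenly as possible over `ndev` devices. */
void chunk_bounds(int n, int ndev, int dev, int *begin, int *end)
{
    assert(n >= 0 && ndev > 0 && dev >= 0 && dev < ndev);
    int base = n / ndev;   /* iterations every device receives */
    int rem  = n % ndev;   /* the first `rem` devices receive one extra */
    *begin = dev * base + (dev < rem ? dev : rem);
    *end   = *begin + base + (dev < rem ? 1 : 0);
}

/* Manual split (assumption: illustrative only). A real multi-GPU version
 * would select a different device before each chunk and run the chunks
 * concurrently; here they execute in sequence. */
void saxpy_split(int n, int ndev, float a, const float *x, float *y)
{
    for (int dev = 0; dev < ndev; ++dev) {
        int begin, end;
        chunk_bounds(n, ndev, dev, &begin, &end);
        saxpy(end - begin, a, x + begin, y + begin);
    }
}
```

Even in this simple case, the split adds index arithmetic and per-device bookkeeping that the single-directive version does not need. The abstract describes this kind of manual code adjustment as expensive and time-consuming.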
Citation: Matsumura, K.; García de Gonzalo, S.; Peña, A. JACC: An OpenACC runtime framework with kernel-level and multi-GPU parallelization. In: IEEE International Conference on High Performance Computing, Data, and Analytics. "2021 IEEE 28th International Conference on High Performance Computing, Data, and Analytics, HiPC 2021: 17-18 December 2021, Bangalore, India (virtual event): proceedings". Institute of Electrical and Electronics Engineers (IEEE), 2021, p. 182-191. ISBN 978-1-6654-1016-8. DOI 10.1109/HiPC53243.2021.00032.
ISBN: 978-1-6654-1016-8
Publisher version: https://ieeexplore.ieee.org/document/9680346
Files | Size |
---|---|
camera-ready.pdf | 1,459Mb |