A symbolic emulator for shuffle synthesis on the NVIDIA PTX code

Carregant...
Miniatura
El pots comprar en digital a:
El pots comprar en paper a:

Projectes de recerca

Unitats organitzatives

Número de la revista

Títol de la revista

ISSN de la revista

Títol del volum

Col·laborador

Editor

Tribunal avaluador

Realitzat a/amb

Tipus de document

Text en actes de congrés

Data publicació

Editor

Association for Computing Machinery (ACM)

Condicions d'accés

Accés obert

item.page.rightslicense

Creative Commons
Aquesta obra està protegida pels drets de propietat intel·lectual i industrial corresponents. Llevat que s'hi indiqui el contrari, els seus continguts estan subjectes a la llicència de Creative Commons: Reconeixement 4.0 Internacional

Assignatures relacionades

Assignatures relacionades

Publicacions relacionades

Datasets relacionats

Datasets relacionats

Projecte CCD

Abstract

Various kinds of applications take advantage of GPUs through automation tools that attempt to automatically exploit the available performance of the GPU's parallel architecture. Directive-based programming models, such as OpenACC, are one such method that easily enables parallel computing by just adhering code annotations to code loops. Such abstract models, however, often prevent programmers from making additional low-level optimizations to take advantage of the advanced architectural features of GPUs because the actual generated computation is hidden from the application developer. This paper describes and implements a novel flexible optimization technique that operates by inserting a code emulator phase to the tail-end of the compilation pipeline. Our tool emulates the generated code using symbolic analysis by substituting dynamic information and thus allowing for further low-level code optimizations to be applied. We implement our tool to support both CUDA and OpenACC directives as the frontend of the compilation pipeline, thus enabling low-level GPU optimizations for OpenACC that were not previously possible. We demonstrate the capabilities of our tool by automating warp-level shuffle instructions that are difficult to use by even advanced GPU programmers. Lastly, evaluating our tool with a benchmark suite and complex application code, we provide a detailed study to assess the benefits of shuffle instructions across four generations of GPU architectures.

Descripció

Persones/entitats

Document relacionat

Versió de

Citació

Matsumura, K.; García de Gonzalo, S.; Peña, A. A symbolic emulator for shuffle synthesis on the NVIDIA PTX code. A: International Conference on Compiler Construction. "CC'23: proceedings of the 32nd ACM SIGPLAN International Conference on Compiler Construction: February 25–26, 2023, Montréal, QC, Canada". New York: Association for Computing Machinery (ACM), 2023, p. 110-121. ISBN 979-8-4007-0088-0. DOI 10.1145/3578360.3580253.

Ajut

Forma part

Dipòsit legal

ISBN

979-8-4007-0088-0

ISSN

Altres identificadors

Referències