The secrets of the accelerators unveiled: tracing heterogeneous executions through OMPT

Llort, German; Filgueras Izquierdo, Antonio; Jiménez-González, Daniel; Servat, Harald; Teruel, Xavier; Mercadal, Estanislao; Álvarez, Carlos; Giménez, Judit; Martorell Bofill, Xavier; Ayguadé Parra, Eduard; Labarta Mancho, Jesús José

doi:10.1007/978-3-319-45550-1_16

The secrets of the accelerators unveiled.pdf (1,039Mb) (Restricted access)

Document type: Text in conference proceedings

Publication date: 2016

Publisher: Springer

Access conditions: Restricted access due to publisher policy

All rights reserved. This work is protected by the corresponding intellectual and industrial property rights. Without prejudice to any existing legal exemptions, its reproduction, distribution, public communication or transformation without the authorization of the rights holder is prohibited.

Project: AXIOM - Agile, eXtensible, fast I/O Module for the cyber-physical era (EC-H2020-645496)
COMPUTACION DE ALTAS PRESTACIONES VII (MINECO-TIN2015-65316-P)

Abstract

Heterogeneous systems are an important trend in the future of supercomputers, yet they can be hard to program and developers still lack powerful tools to gain understanding about how well their accelerated codes perform and how to improve them. Having different types of hardware accelerators available, each with their own specific low-level APIs to program them, there is not yet a clear consensus on a standard way to retrieve information about the accelerator’s performance. To improve this scenario, OMPT is a novel performance monitoring interface that is being considered for integration into the OpenMP standard. OMPT allows analysis tools to monitor the execution of parallel OpenMP applications by providing detailed information about the activity of the runtime through a standard API. For accelerated devices, OMPT also facilitates the exchange of performance information between the runtime and the analysis tool. We implement part of the OMPT specification that refers to the use of accelerators both in the Nanos++ parallel runtime system and the Extrae tracing framework, obtaining detailed performance information about the execution of the tasks issued to the accelerated devices to later conduct insightful analysis. Our work extends previous efforts in the field to expose detailed information from the OpenMP and OmpSs runtimes, regarding the activity and performance of task-based parallel applications. In this paper, we focus on the evaluation of FPGA devices studying the performance of two common kernels in scientific algorithms: matrix multiplication and Cholesky decomposition. Furthermore, this development is seamlessly applicable for the analysis of GPGPU accelerators and Intel® Xeon Phi™ co-processors operating under the OmpSs programming model.

Citation: Llort, G., Filgueras, A., Jiménez-González, D., Servat, H., Teruel, X., Mercadal, E., Álvarez, C., Giménez, J., Martorell, X., Ayguade, E., Labarta, J. The secrets of the accelerators unveiled: tracing heterogeneous executions through OMPT. In: International Workshop on OpenMP. "OpenMP: memory, devices, and tasks: 12th International Workshop on OpenMP: IWOMP 2016: Nara, Japan: October 5-7, 2016: proceedings". Nara: Springer, 2016, p. 217-236.

URI: http://hdl.handle.net/2117/91298

DOI: 10.1007/978-3-319-45550-1_16

ISBN: 978-3-319-45549-5

Publisher version: http://link.springer.com/chapter/10.1007%2F978-3-319-45550-1_16
