Framework for a productive performance optimization
DOI: 10.1016/j.parco.2013.05.004
Cite as: hdl:2117/20012
Document type: Article
Publication date: 2013-08
Access conditions: Restricted access per publisher policy
Unless otherwise indicated, the contents of this work are subject to the Creative Commons license: Attribution-NonCommercial-NoDerivs 3.0 Spain
Projects: PRACE-2IP - PRACE (EC-FP7-283493)
HIPEAC - High Performance and Embedded Architecture and Compilation (EC-FP7-287759)
Abstract
Modern supercomputers deliver large computational power, but it is difficult for an application to exploit such power. One factor that limits the application performance is the single node performance. While many performance tools use the microprocessor performance counters to provide insights on serial node performance issues, the complex semantics of these counters pose an obstacle to an inexperienced developer.
We present a framework that allows easy identification and qualification of serial node performance bottlenecks in parallel applications. The output of the framework is precise, and it can correlate performance inefficiencies with small regions of code within the application. The framework not only points to regions of code but also simplifies the semantics of the performance counters into metrics that refer to processor functional units. With this information, the developer can focus on the identified code and improve it, knowing which processor execution unit is degrading performance. To demonstrate the usefulness of the framework, we apply it to three already optimized applications using realistic inputs and modify their source code according to the results. With modifications that require little effort, we increase the applications' performance by 10% to 30%, which shortens the time required to reach the solution and/or allows larger problem sizes to be tackled.
Citation: Servat, H. [et al.]. Framework for a productive performance optimization. "Parallel computing", August 2013, vol. 39, no. 8, p. 336-353.
ISSN: 0167-8191
Publisher version: http://www.sciencedirect.com/science/article/pii/S0167819113000707
Files | Description | Size | Format | View |
---|---|---|---|---|
1-s2.0-S0167819113000707-main.pdf | Framework for a productive performance optimization | 2,156Mb | PDF | Restricted access |