Combining dynamic concurrency throttling with voltage and frequency scaling on task-based programming models
Cite as:
hdl:2117/353698
Document type: Text in conference proceedings
Publication date: 2021
Publisher: Association for Computing Machinery (ACM)
Access conditions: Open access
All rights reserved. This work is protected by the corresponding intellectual and industrial
property rights. Without prejudice to existing legal exemptions, its reproduction, distribution,
public communication, or transformation without the authorization of the rights holder is prohibited.
Project: DEEP-SEA - DEEP – SOFTWARE FOR EXASCALE ARCHITECTURES (EC-H2020-955606)
HPC-EUROPA3 - Transnational Access Programme for a Pan-European Network of HPC Research Infrastructures and Laboratories for scientific computing (EC-H2020-730897)
Abstract
Being on the verge of exascale performance has shifted the prioritization of performance in applications to the inclusion of power-performance efficiency as a primary objective in the High Performance Computing (HPC) community. Simultaneously, this has surfaced hardware and software efforts that employ techniques such as dynamic voltage and frequency scaling (DVFS) for core and uncore units or dynamic concurrency throttling (DCT) to exploit hardware resources efficiently, by saving energy while maintaining performance. These techniques are complementary, so they can be used together. However, employing them is not a straightforward task, as they have to be adjusted based on the workload, and it is even more complex to combine them properly. Thus, these techniques should be applied transparently by a runtime system, without relying on application developers. In this paper, we extend a task-based runtime system with an infrastructure that categorizes workloads based on their computational profile – memory-bounded, compute-bounded, or balanced. This categorization is done in an on-line manner and with a negligible overhead. With this additional information, we enhance the CPU-manager and scheduler of OmpSs-2, a task-based parallel programming model, to automatically combine DVFS and DCT techniques based on workloads. Moreover, we show that our heuristics transparently improve energy efficiency on average by 15% with no significant performance loss and either equal or surpass the energy efficiency of the best static configuration available.
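As a rough illustration of the idea summarized in the abstract, the sketch below classifies a workload into one of the three profiles and then picks a combined DVFS/DCT setting. It is not the OmpSs-2 implementation: the metric (`memory_stall_fraction`), both thresholds, the `Config` type, and the policy in `choose_config` are hypothetical and chosen only to make the example concrete.

```python
from dataclasses import dataclass

# Hypothetical thresholds on the fraction of cycles stalled on memory.
# These values are illustrative assumptions, not taken from the paper.
MEMORY_BOUND_THRESHOLD = 0.6
COMPUTE_BOUND_THRESHOLD = 0.3


@dataclass(frozen=True)
class Config:
    """A combined DVFS + DCT setting (hypothetical representation)."""
    core_freq: str     # "low" or "high"
    uncore_freq: str   # "low" or "high"
    active_cores: int  # number of cores left running tasks (DCT)


def classify(memory_stall_fraction: float) -> str:
    """Map an assumed on-line metric to one of the three profiles."""
    if not 0.0 <= memory_stall_fraction <= 1.0:
        raise ValueError("memory_stall_fraction must be in [0, 1]")
    if memory_stall_fraction >= MEMORY_BOUND_THRESHOLD:
        return "memory-bound"
    if memory_stall_fraction <= COMPUTE_BOUND_THRESHOLD:
        return "compute-bound"
    return "balanced"


def choose_config(category: str, total_cores: int) -> Config:
    """Pick a setting per profile; the policy itself is an assumption."""
    if category == "memory-bound":
        # Cores mostly wait on memory: lower the core frequency and
        # throttle concurrency, but keep the uncore fast.
        return Config("low", "high", max(1, total_cores // 2))
    if category == "compute-bound":
        # Cores are the bottleneck: keep them fast, slow the uncore.
        return Config("high", "low", total_cores)
    # Balanced workloads keep the default, full-speed setting.
    return Config("high", "high", total_cores)


if __name__ == "__main__":
    for fraction in (0.8, 0.45, 0.1):
        category = classify(fraction)
        print(fraction, category, choose_config(category, total_cores=8))
```

A real runtime would derive the metric from hardware counters and re-evaluate it periodically; the fixed thresholds here only stand in for whatever heuristic the runtime actually uses.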
Citation: Navarro, A. [et al.]. Combining dynamic concurrency throttling with voltage and frequency scaling on task-based programming models. In: International Conference on Parallel Processing. "The 50th International Conference on Parallel Processing: August 9-12, 2021, hosted virtually from Chicago, Illinois, USA: main conference proceedings". New York: Association for Computing Machinery (ACM), 2021, p. 1-11. ISBN 978-1-4503-9068-2. DOI 10.1145/3472456.3472471.
ISBN: 978-1-4503-9068-2
Publisher version: https://dl.acm.org/doi/10.1145/3472456.3472471
Collections
- Doctoral Programme in Computer Architecture - Conference papers/presentations [282]
- Computer Sciences - Conference papers/presentations [560]
- CAP - High Performance Computing Group - Conference papers/presentations [784]
- Department of Computer Architecture - Conference papers/presentations [1,945]
Files | Description | Size | Format |
---|---|---|---|
2021_ICPP_Toni_NOACM.pdf | | 826.2 KB | PDF |