Asynchronous runtime with distributed manager for task-based programming models
10.1016/j.parco.2020.102664
Cite as:
hdl:2117/330058
Document type: Article
Publication date: 2020-09
Access conditions: Open access
Unless otherwise stated, the contents of this work are licensed under the Creative Commons license: Attribution-NonCommercial-NoDerivatives 4.0 International
Project: COMPUTACION DE ALTAS PRESTACIONES VII (MINECO-TIN2015-65316-P)
EPEEC - European joint Effort toward a Highly Productive Programming Environment for Heterogeneous Exascale Computing (EPEEC) (EC-H2020-801051)
EuroEXA - Co-designed Innovation and System for Resilient Exascale Computing in Europe: From Applications to Silicon (EC-H2020-754337)
LEGaTO - Low Energy Toolset for Heterogeneous Computing (EC-H2020-780681)
BARCELONA SUPERCOMPUTING CENTER - CENTRO. NACIONAL DE SUPERCOMPUTACION (MINECO-SEV-2015-0493)
Abstract
Parallel task-based programming models, like OpenMP, allow application developers to easily create a parallel version of their sequential codes. The OpenMP 4.0 standard introduced the possibility of describing a set of data dependences per task, which the runtime uses to order task execution. This order is calculated using shared graphs, which all threads update under exclusive access, using synchronization mechanisms (locks) to ensure correct dependence management. Contention in accessing these structures becomes critical in many-core systems because several threads may waste computational resources waiting for their turn. This paper proposes asynchronous management of runtime structures, such as task dependence graphs, suitable for task-based programming model runtimes. In this organization, threads request actions from the runtime instead of performing them directly. The requests are then handled by a distributed runtime manager (DDAST), which does not require dedicated resources. Instead, the manager uses idle threads to modify the runtime structures. The paper also presents an implementation, analysis, and performance evaluation of this runtime organization. The performance results show that the proposed asynchronous organization achieves higher speedups than the original runtime across different benchmarks and many-core architectures.
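The organization summarized in the abstract can be pictured with a small Python sketch. This is not the paper's DDAST implementation: the names `Task`, `AsyncRuntime`, and `demo` are hypothetical, and the sketch assumes a single manager role guarded by a non-blocking try-lock, whereas the paper describes a distributed manager. It only illustrates the two ideas stated above: workers post requests instead of modifying the dependence graph themselves, and whichever worker is idle drains those requests.

```python
import queue
import threading
import time
from collections import defaultdict, namedtuple

# Hypothetical task record: id, callable, and the data names it reads/writes.
Task = namedtuple("Task", "id fn ins outs")


class AsyncRuntime:
    """Toy runtime: workers post requests; idle workers act as the manager."""

    def __init__(self, num_workers=4):
        self.num_workers = num_workers
        self.requests = queue.SimpleQueue()    # actions requested by threads
        self.ready = queue.SimpleQueue()       # tasks with no pending predecessors
        self.manager_token = threading.Lock()  # only try-locked, never waited on
        # Dependence-graph state: modified only by the thread holding the token.
        self.tasks = {}
        self.last_writer = {}
        self.readers = defaultdict(list)
        self.successors = defaultdict(list)
        self.pending = {}
        self.finished = set()
        self.total = 0

    def _on_submit(self, task):
        preds = set()
        for name in task.ins:                  # read-after-write
            preds.add(self.last_writer.get(name))
        for name in task.outs:                 # write-after-write, write-after-read
            preds.add(self.last_writer.get(name))
            preds.update(self.readers[name])
        preds -= self.finished | {None}
        for name in task.ins:
            self.readers[name].append(task.id)
        for name in task.outs:
            self.last_writer[name] = task.id
            self.readers[name] = []
        self.tasks[task.id] = task
        self.pending[task.id] = len(preds)
        for pred in preds:
            self.successors[pred].append(task.id)
        if not preds:
            self.ready.put(task)

    def _on_done(self, task_id):
        self.finished.add(task_id)
        for succ in self.successors.pop(task_id, []):
            self.pending[succ] -= 1
            if self.pending[succ] == 0:
                self.ready.put(self.tasks[succ])

    def _manage(self):
        """Drain pending requests unless another thread is already managing."""
        if not self.manager_token.acquire(blocking=False):
            return
        try:
            while True:
                try:
                    kind, payload = self.requests.get_nowait()
                except queue.Empty:
                    return
                if kind == "submit":
                    self._on_submit(payload)
                else:
                    self._on_done(payload)
        finally:
            self.manager_token.release()

    def _worker(self):
        while len(self.finished) < self.total:
            try:
                task = self.ready.get_nowait()
            except queue.Empty:
                self._manage()                 # idle: take the manager role
                time.sleep(0)                  # yield to other threads
                continue
            task.fn()
            self.requests.put(("done", task.id))  # request, do not touch the graph

    def run(self, tasks):
        self.total = len(tasks)
        for task in tasks:                     # submission is also just a request
            self.requests.put(("submit", task))
        threads = [threading.Thread(target=self._worker)
                   for _ in range(self.num_workers)]
        for thread in threads:
            thread.start()
        for thread in threads:
            thread.join()


def demo():
    data, order = {}, []

    def step(label, update):
        def fn():
            update()
            order.append(label)
        return fn

    tasks = [
        Task(0, step("a", lambda: data.update(a=1)), ins=(), outs=("a",)),
        Task(1, step("b", lambda: data.update(b=data["a"] + 1)), ins=("a",), outs=("b",)),
        Task(2, step("c", lambda: data.update(c=data["a"] * 10)), ins=("a",), outs=("c",)),
        Task(3, step("d", lambda: data.update(d=data["b"] + data["c"])),
             ins=("b", "c"), outs=("d",)),
    ]
    AsyncRuntime(num_workers=4).run(tasks)
    return data, order
```

Calling `demo()` runs four tasks whose `ins`/`outs` form a diamond: task `a` runs first, `b` and `c` may run in either order, and `d` runs last. No worker ever blocks waiting for the graph; a thread that fails to take the manager token simply goes back to looking for ready tasks.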
Citation: Bosch, J. [et al.]. Asynchronous runtime with distributed manager for task-based programming models. "Parallel computing", September 2020, vol. 97, article 102664, p. 1-35.
ISSN: 0167-8191
Publisher version: https://www.sciencedirect.com/science/article/pii/S0167819120300570
Other identifiers: https://arxiv.org/abs/2009.03066
Files | Description | Size | Format | View |
---|---|---|---|---|
Bosch.pdf | | 1,062Mb | PDF | View/Open |