Asynchronous runtime with distributed manager for task-based programming models

Cite as: hdl:2117/330058
Document type: Article
Defense date: 2020-09
Rights access: Open Access
This work is protected by the corresponding intellectual and industrial property rights.
Except where otherwise noted, its contents are licensed under a Creative Commons license: Attribution-NonCommercial-NoDerivs 4.0 International
Project: COMPUTACION DE ALTAS PRESTACIONES VII (MINECO-TIN2015-65316-P)
EPEEC - European joint Effort toward a Highly Productive Programming Environment for Heterogeneous Exascale Computing (EPEEC) (EC-H2020-801051)
EuroEXA - Co-designed Innovation and System for Resilient Exascale Computing in Europe: From Applications to Silicon (EC-H2020-754337)
LEGaTO - Low Energy Toolset for Heterogeneous Computing (EC-H2020-780681)
BARCELONA SUPERCOMPUTING CENTER - CENTRO. NACIONAL DE SUPERCOMPUTACION (MINECO-SEV-2015-0493)
Abstract
Parallel task-based programming models, such as OpenMP, allow application developers to easily create parallel versions of their sequential codes. The OpenMP 4.0 standard introduced the possibility of describing a set of data dependences per task, which the runtime uses to order task execution. This order is calculated using shared graphs, which all threads update under exclusive access, using synchronization mechanisms (locks) to keep dependence management correct. Contention in the access to these structures becomes critical in many-core systems, because several threads may waste computing resources while waiting for their turn. This paper proposes asynchronous management of the runtime structures, such as task dependence graphs, suitable for the runtimes of task-based programming models. In this organization, threads request actions from the runtime instead of performing them directly. The requests are then handled by a distributed runtime manager (DDAST), which does not require dedicated resources; instead, the manager uses idle threads to modify the runtime structures. The paper also presents an implementation, analysis, and performance evaluation of this runtime organization. The performance results show that the proposed asynchronous organization achieves higher speedups than the original runtime for different benchmarks and different many-core architectures.
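The organization described in the abstract can be illustrated with a small sketch. This is not the paper's DDAST implementation: the class `AsyncRuntime`, its methods, the single shared request queue, and the single manager role are all assumptions made for illustration. It shows only the core idea: threads post requests ("submit", "done") instead of locking the dependence graph, and whichever thread finds itself idle takes the manager role and applies the pending requests.

```python
import queue
import threading
import time


class AsyncRuntime:
    """Toy runtime (hypothetical): requests are queued, idle threads apply them."""

    def __init__(self, num_threads=4):
        self.num_threads = num_threads
        self.requests = queue.Queue()         # messages addressed to the manager role
        self.ready = queue.Queue()            # tasks with all dependences satisfied
        self.manager_role = threading.Lock()  # taken only by a thread with nothing to run
        self.pending = {}                     # task name -> unfinished predecessors
        self.successors = {}                  # task name -> [(dependent name, fn)]
        self.finished = set()
        self.submitted = 0
        self.all_done = threading.Event()
        self.order = []                       # execution order, kept for inspection
        self.order_lock = threading.Lock()

    def submit(self, name, deps, fn):
        # Call from the main thread before run(); the graph is not touched here.
        self.submitted += 1
        self.requests.put(("submit", name, list(deps), fn))

    def _apply(self, request):
        # Runs only while holding manager_role, so the graph needs no other lock.
        if request[0] == "submit":
            _, name, deps, fn = request
            unmet = [d for d in deps if d not in self.finished]
            for dep in unmet:
                self.successors.setdefault(dep, []).append((name, fn))
            if unmet:
                self.pending[name] = len(unmet)
            else:
                self.ready.put((name, fn))
        else:  # ("done", name)
            _, name = request
            self.finished.add(name)
            for succ, fn in self.successors.pop(name, []):
                self.pending[succ] -= 1
                if self.pending[succ] == 0:
                    del self.pending[succ]
                    self.ready.put((succ, fn))
            if len(self.finished) == self.submitted:
                self.all_done.set()

    def _try_manage(self):
        # Non-blocking: a thread never waits for the manager role.
        if not self.manager_role.acquire(blocking=False):
            return False
        try:
            handled = False
            while True:
                try:
                    request = self.requests.get_nowait()
                except queue.Empty:
                    return handled
                self._apply(request)
                handled = True
        finally:
            self.manager_role.release()

    def _worker(self):
        while not self.all_done.is_set():
            try:
                name, fn = self.ready.get_nowait()
            except queue.Empty:
                # Idle thread: act as manager, or back off briefly.
                if not self._try_manage():
                    time.sleep(0.001)
                continue
            fn()
            with self.order_lock:
                self.order.append(name)
            self.requests.put(("done", name))  # request the update, do not perform it

    def run(self):
        if self.submitted == 0:
            return []
        threads = [threading.Thread(target=self._worker)
                   for _ in range(self.num_threads)]
        for t in threads:
            t.start()
        for t in threads:
            t.join()
        return self.order


# Diamond-shaped dependence graph: load -> (left, right) -> merge.
rt = AsyncRuntime(num_threads=4)
rt.submit("load", [], lambda: None)
rt.submit("left", ["load"], lambda: None)
rt.submit("right", ["load"], lambda: None)
rt.submit("merge", ["left", "right"], lambda: None)
order = rt.run()
print(order)
```

In this sketch a busy thread only pays for a queue insertion when a task finishes, while the graph update itself is deferred to a thread that has no task to run. The relative order of `left` and `right` may vary between runs; only the dependence order is guaranteed.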
Citation: Bosch, J. [et al.]. Asynchronous runtime with distributed manager for task-based programming models. "Parallel computing", September 2020, vol. 97, article 102664, p. 1-35.
ISSN: 0167-8191
Publisher version: https://www.sciencedirect.com/science/article/pii/S0167819120300570
Other identifiers: https://arxiv.org/abs/2009.03066