AutoParallel: Automatic parallelisation and distributed execution of affine loop nests in Python
Cite as:
hdl:2117/328829
Document type: Article
Defense date: 2020
Publisher: Sage
Rights access: Open Access
All rights reserved. This work is protected by the corresponding intellectual and industrial
property rights. Without prejudice to any existing legal exemptions, reproduction, distribution, public
communication or transformation of this work are prohibited without permission of the copyright holder.
Project: COMPUTACION DE ALTAS PRESTACIONES VII (MINECO-TIN2015-65316-P)
BARCELONA SUPERCOMPUTING CENTER - CENTRO. NACIONAL DE SUPERCOMPUTACION (MINECO-SEV-2015-0493)
Abstract
Recent improvements in programming languages and models have focused on simplicity and abstraction, leading Python to the top of the list of programming languages. However, there is still room for improvement in shielding users from dealing directly with distributed and parallel computing issues. This paper proposes and evaluates AutoParallel, a Python module that automatically finds an appropriate task-based parallelisation of affine loop nests and executes them in parallel on a distributed computing infrastructure. It is based on sequential programming and requires a single annotation (in the form of a Python decorator), so that anyone with intermediate-level programming skills can scale up an application to hundreds of cores. The evaluation demonstrates that AutoParallel goes one step further in easing the development of distributed applications. On the one hand, the programmability evaluation highlights the benefits of using a single Python decorator instead of manually annotating each task and its parameters or, even worse, having to develop the parallel code explicitly (e.g., using OpenMP or MPI). On the other hand, the performance evaluation demonstrates that AutoParallel is capable of automatically generating task-based workflows from sequential Python code while achieving the same performance as manually taskified versions of established state-of-the-art algorithms (i.e., Cholesky, LU, and QR decompositions). Finally, AutoParallel is also capable of automatically building data blocks to increase the tasks’ granularity, freeing the user from creating the data chunks and redesigning the algorithm. For advanced users, we believe that this feature can be useful as a baseline to design blocked algorithms.
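To illustrate what the abstract means by a sequential affine loop nest carrying a single decorator, here is a minimal sketch. It does not use AutoParallel itself: `parallel_stub` is a hypothetical stand-in that simply calls the function unchanged, and the names `matmul` and `run_example` are chosen only for this illustration. The point is the shape of the user code, in which every loop bound and array subscript is an affine function of the loop indices and the size `n`.

```python
import functools


def parallel_stub(func):
    """Hypothetical placeholder for a parallelising decorator.

    Assumption: this is NOT the AutoParallel API. It performs no analysis
    and no parallelisation; it only marks where a single annotation would go.
    """
    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        return func(*args, **kwargs)
    return wrapper


@parallel_stub
def matmul(a, b, c, n):
    # Affine loop nest: the bounds (0..n) and the subscripts
    # (i, j, k) are affine expressions of the loop indices.
    for i in range(n):
        for j in range(n):
            for k in range(n):
                c[i][j] += a[i][k] * b[k][j]
    return c


def run_example():
    n = 2
    a = [[1.0, 2.0], [3.0, 4.0]]
    b = [[5.0, 6.0], [7.0, 8.0]]
    c = [[0.0] * n for _ in range(n)]
    return matmul(a, b, c, n)
```

The function body stays plain sequential Python; only the decorator line would differ between a sequential run and a tool-assisted parallel run.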
Citation: Ramón-Cortés, C. [et al.]. AutoParallel: Automatic parallelisation and distributed execution of affine loop nests in Python. "The International Journal of High Performance Computing Applications (IJHPCA)", 2020, vol. 34, no. 6, p. 659-675.
ISSN: 1741-2846
Publisher version: https://doi.org/10.1177/1094342020937050
Collections
- Computer Sciences - Articles de revista [341]
- Departament d'Arquitectura de Computadors - Articles de revista [1.098]
- Doctorat en Bioinformàtica - Articles de revista [12]
- CAP - Grup de Computació d'Altes Prestacions - Articles de revista [382]
- Doctorat en Arquitectura de Computadors - Articles de revista [181]
Files | Description | Size | Format |
---|---|---|---|
AutoParallel_IJHPCA-1.pdf | | 927,5Kb | PDF |