Task-based programming in COMPSs to converge from HPC to big data

Conejero, Javier; Corella, Sandra; Badia Sala, Rosa Maria; Labarta Mancho, Jesús José

doi:10.1177/1094342017701278


Task-based programming in COMPSs to converge from HPC to Big Data.pdf (709.8 KB)


Document type: Article

Publication date: 2017-04-06

Access conditions: Open access

All rights reserved. This work is protected by the corresponding intellectual and industrial property rights. Without prejudice to existing legal exemptions, its reproduction, distribution, public communication, or transformation is prohibited without the authorization of the rights holder.

Projects:
FJCI-2015-24651 (MINECO-FJCI-2015-24651)
COMPUTACION DE ALTAS PRESTACIONES VII (MINECO-TIN2015-65316-P)
HBP - The Human Brain Project (EC-FP7-604102)

Abstract

Task-based programming has proven to be a suitable model for high-performance computing (HPC) applications. Different implementations have demonstrated this and have promoted the acceptance of task-based programming in the OpenMP standard. In recent years, Apache Spark has also gained wide popularity in business and research environments as a programming model for addressing emerging big data problems. COMP Superscalar (COMPSs) is a task-based environment that targets distributed computing (including Clouds) and is a good alternative as a task-based programming model for big data applications. This article explains why we consider task-based programming models a good approach for big data applications. It compares Spark and COMPSs in terms of architecture, programming model, and performance, focusing on the structural differences between the two frameworks, on their programmability interfaces, and on their efficiency as measured with three widely known benchmarking kernels: Wordcount, Kmeans, and Terasort. These kernels make it possible to evaluate the most important functionalities of both programming models and to analyze different workflows and conditions. The main results of this comparison are that (1) COMPSs is able to extract the inherent parallelism from the user code with minimal coding effort, whereas Spark requires existing algorithms to be adapted and rewritten explicitly using its predefined functions; (2) COMPSs improves on Spark in terms of performance; and (3) COMPSs has been shown to scale better than Spark in most cases. Finally, we discuss the advantages and disadvantages of both frameworks, highlighting the differences that make them unique, to help in choosing the right framework for each particular objective.
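As a rough illustration of the task-based style the abstract refers to, the sketch below decomposes the Wordcount kernel into two plain functions, one that counts words in a block and one that merges two partial counts. It is not taken from the article and uses neither COMPSs nor Spark: the names `count_words`, `merge_counts`, and `wordcount` are hypothetical, and a standard-library thread pool stands in for a task runtime, with the dependencies between tasks handled by hand.

```python
from collections import Counter
from concurrent.futures import ThreadPoolExecutor


def count_words(block):
    """Task: count the words in one block of text."""
    return Counter(block.split())


def merge_counts(left, right):
    """Task: merge two partial word counts into one."""
    return left + right


def wordcount(blocks, max_workers=4):
    """Count words across blocks using independent tasks and a pairwise merge.

    Assumption: a thread pool replaces a real task runtime here, so the
    ordering between counting and merging is enforced manually.
    """
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # One independent counting task per input block.
        futures = [pool.submit(count_words, block) for block in blocks]
        results = [future.result() for future in futures]

        # Pairwise (tree-shaped) reduction: each level merges neighbours.
        while len(results) > 1:
            merges = [
                pool.submit(merge_counts, results[i], results[i + 1])
                for i in range(0, len(results) - 1, 2)
            ]
            merged = [future.result() for future in merges]
            if len(results) % 2:
                # An unpaired last element is carried to the next level.
                merged.append(results[-1])
            results = merged

    return results[0] if results else Counter()


if __name__ == "__main__":
    print(wordcount(["a b a", "b c", "a"]))
```

The user-level functions stay ordinary sequential code, which is the property the abstract attributes to task-based models; in a task-based runtime the scheduling written explicitly in `wordcount` would be the runtime's responsibility.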

Citation: Conejero, J., Corella, S., Badia, R. M., Labarta, J. Task-based programming in COMPSs to converge from HPC to big data. "International journal of high performance computing applications", 1 January 2018, vol. 32, no. 1, p. 45-60.

URI: http://hdl.handle.net/2117/104954

DOI: 10.1177/1094342017701278

ISSN: 1094-3420

Publisher version: http://journals.sagepub.com/doi/pdf/10.1177/1094342017701278


