Leveraging task-parallelism in message-passing dense matrix factorizations using SMPSs

Martín Huertas, Alberto Francisco; Reyes, Ruyman; Badia Sala, Rosa Maria; Quintana Ortí, Enrique Salvador

doi:10.1016/j.parco.2014.04.001

dc.contributor.author	Martín Huertas, Alberto Francisco
dc.contributor.author	Reyes, Ruyman
dc.contributor.author	Badia Sala, Rosa Maria
dc.contributor.author	Quintana Ortí, Enrique Salvador
dc.contributor.other	Universitat Politècnica de Catalunya. Departament de Resistència de Materials i Estructures a l'Enginyeria
dc.contributor.other	Universitat Politècnica de Catalunya. Departament d'Arquitectura de Computadors
dc.contributor.other	Barcelona Supercomputing Center
dc.date.accessioned	2014-07-10T09:11:26Z
dc.date.available	2014-07-10T09:11:26Z
dc.date.created	2014-05
dc.date.issued	2014-05
dc.identifier.citation	Martín, A. F. [et al.]. Leveraging task-parallelism in message-passing dense matrix factorizations using SMPSs. "Parallel computing", May 2014, vol. 40, no. 5-6, p. 113-128.
dc.identifier.issn	0167-8191
dc.identifier.uri	http://hdl.handle.net/2117/23465
dc.description.abstract	In this paper, we investigate how to exploit task-parallelism during the execution of the Cholesky factorization on clusters of multicore processors with the SMPSs programming model. Our analysis reveals that the major difficulties in adapting the code for this operation in ScaLAPACK to SMPSs lie in algorithmic restrictions and the semantics of the SMPSs programming model, but also that they both can be overcome with a limited programming effort. The experimental results report considerable gains in performance and scalability of the routine parallelized with SMPSs when compared with conventional approaches to execute the original ScaLAPACK implementation in parallel as well as two recent message-passing routines for this operation. In summary, our study opens the door to the possibility of reusing message-passing legacy codes/libraries for linear algebra, by introducing up-to-date techniques like dynamic out-of-order scheduling that significantly upgrade their performance, while avoiding a costly rewrite/reimplementation.
dc.description.sponsorship	This research was supported by Project EU INFRA-2010-1.2.2 "TEXT: Towards EXaflop applicaTions". The researcher at BSC-CNS was supported by the HiPEAC-2 Network of Excellence (FP7/ICT 217068), the Spanish Ministry of Education (CICYT TIN2011-23283, TIN2007-60625 and CSD2007-00050), and the Generalitat de Catalunya (2009-SGR-980). The researcher at CIMNE was partially funded by the UPC postdoctoral grants under the programme "BKC5-Atracció i Fidelització de talent al BKC". The researcher at UJI was supported by project CICYT TIN2008-06570-C04-01 and FEDER. We thank Jesus Labarta, from BSC-CNS, for helpful discussions on SMPSs and his help with the performance analysis of the codes with Paraver. We thank Vladimir Marjanovic, also from BSC-CNS, for his help in the set-up and tuning of the MPI/SMPSs tools on JuRoPa. Finally, we thank Rafael Mayo, from UJI, for his support in the preliminary stages of this work. The authors gratefully acknowledge the computing time granted on the supercomputer JuRoPa at Jülich Supercomputing Centre.
dc.format.extent	16 p.
dc.language.iso	eng
dc.rights	Attribution-NonCommercial-NoDerivs 4.0 International License
dc.rights.uri	https://creativecommons.org/licenses/by-nc-nd/4.0/
dc.subject	Àrees temàtiques de la UPC::Informàtica::Arquitectura de computadors::Arquitectures paral·leles
dc.subject.lcsh	Parallel computation
dc.subject.other	Clusters of multi-core processors
dc.subject.other	Linear algebra
dc.subject.other	Message-passing numerical libraries
dc.subject.other	Task parallelism
dc.title	Leveraging task-parallelism in message-passing dense matrix factorizations using SMPSs
dc.type	Article
dc.subject.lemac	Computació paralel.la
dc.contributor.group	Universitat Politècnica de Catalunya. CAP - Grup de Computació d'Altes Prestacions
dc.identifier.doi	10.1016/j.parco.2014.04.001
dc.description.peerreviewed	Peer Reviewed
dc.relation.publisherversion	http://www.sciencedirect.com/science/article/pii/S0167819114000441
dc.rights.access	Open Access
local.identifier.drac	14920725
dc.description.version	Preprint
dc.relation.projectid	info:eu-repo/grantAgreement/MINECO/6PN/TIN2011-23283
dc.relation.projectid	info:eu-repo/grantAgreement/MEC//TIN2007-60625/ES/COMPUTACION DE ALTAS PRESTACIONES V/
dc.relation.projectid	info:eu-repo/grantAgreement/EC/FP7/217068/EU/High Performance and Embedded Architecture and Compilation/HIPEAC
local.citation.author	Martín, A. F.; Reyes, R.; Badia, R.M.; Quintana, E.
local.citation.publicationName	Parallel computing
local.citation.volume	40
local.citation.number	5-6
local.citation.startingPage	113
local.citation.endingPage	128

Files in this item

Name: mpi_smpss_scalapack[1].pdf
Size: 606.2 KB
Format: PDF
Description: Pre-print


UPCommons. UPC's open knowledge portal
