dc.contributor.author: Gómez Crespo, Constantino
dc.contributor.author: Mantovani, Filippo
dc.contributor.author: Focht, Erich
dc.contributor.author: Casas Guix, Marc
dc.contributor.other: Universitat Politècnica de Catalunya. Doctorat en Arquitectura de Computadors
dc.contributor.other: Barcelona Supercomputing Center
dc.identifier.citation: Gómez, C. [et al.]. Efficiently running SpMV on long vector architectures. A: ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming. "PPoPP'21: proceedings of the 2021 26th ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming: February 27-March 3, 2021, Virtual Event, Republic of Korea". New York: Association for Computing Machinery (ACM), 2021, p. 292-303. ISBN 978-1-4503-8294-6. DOI 10.1145/3437801.3441592.
dc.description.abstract: Sparse Matrix-Vector multiplication (SpMV) is an essential kernel for parallel numerical applications. SpMV displays sparse and irregular data accesses, which complicate its vectorization. Because of these difficulties, SpMV frequently performs suboptimally on long vector ISAs that exploit SIMD parallelism. In this context, developing new optimizations is fundamental to enabling high-performance SpMV on emerging long vector architectures. In this paper, we improve the state-of-the-art SELL-C-σ sparse matrix format by proposing several new optimizations for SpMV. We target aggressive long vector architectures like the NEC Vector Engine. By combining several optimizations, we obtain an average 12% improvement over SELL-C-σ across a heterogeneous set of 24 matrices. Our optimizations boost performance on long vector architectures since they expose a high degree of SIMD parallelism.
dc.description.sponsorship: The authors would like to acknowledge the support of NEC Corporation. This work is partially supported by the Spanish Ministry of Science and Technology through the PID2019-107255GB project and by the Generalitat de Catalunya (contract 2017-SGR-1414). Marc Casas has been partially supported by the Spanish Ministry of Economy, Industry and Competitiveness under Ramón y Cajal fellowship number RYC-2017-23269.
dc.format.extent: 12 p.
dc.publisher: Association for Computing Machinery (ACM)
dc.subject: UPC subject areas::Computer science::Computer architecture::Parallel architectures
dc.subject.lcsh: Parallel processing (Electronic computers)
dc.subject.other: NEC vector engine
dc.subject.other: Long-vector architectures
dc.subject.other: Performance optimization
dc.title: Efficiently running SpMV on long vector architectures
dc.type: Conference report
dc.subject.lemac: Parallel processing (Computers)
dc.description.peerreviewed: Peer Reviewed
dc.rights.access: Restricted access - publisher's policy
dc.description.version: Postprint (author's final draft)
dc.relation.projectid: info:eu-repo/grantAgreement/AGAUR/2017 SGR 1414
local.citation.author: Gómez, C.; Mantovani, F.; Focht, E.; Casas, M.
local.citation.contributor: ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming
local.citation.pubplace: New York
local.citation.publicationName: PPoPP'21: proceedings of the 2021 26th ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming: February 27-March 3, 2021, Virtual Event, Republic of Korea

All rights reserved. This work is protected by the corresponding intellectual and industrial property rights. Without prejudice to any existing legal exemptions, reproduction, distribution, public communication or transformation of this work is prohibited without permission of the copyright holder.