Optimizing the SpMV kernel on long-vector accelerators

Document type: Conference report
Defense date: 2021-05
Publisher: Barcelona Supercomputing Center
Rights access: Open Access
All rights reserved. This work is protected by the corresponding intellectual and industrial
property rights. Without prejudice to any existing legal exemptions, reproduction, distribution, public
communication or transformation of this work is prohibited without permission of the copyright holder.
Abstract
Sparse matrix-vector multiplication (SpMV) is an essential
kernel for parallel numerical applications. SpMV exhibits
sparse and irregular data accesses, which complicate its vectorization.
As a result, SpMV frequently delivers suboptimal performance
on long-vector ISAs that exploit SIMD parallelism. New optimizations
are therefore fundamental to enabling high-performance
SpMV on emerging long-vector architectures. In our
work, we improve the state-of-the-art SELL-C-σ sparse matrix
format by proposing several new optimizations for SpMV.
We target aggressive long-vector architectures such as the NEC
Vector Engine. By combining several optimizations, we obtain
an average 12% improvement over SELL-C-σ on a
heterogeneous set of 24 matrices. Our optimizations boost
performance on long-vector architectures because they expose a
high degree of SIMD parallelism.
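
For readers unfamiliar with the baseline format, the following is a minimal pure-Python sketch of SpMV over a SELL-C-σ layout. It is not the authors' implementation and includes none of the optimizations proposed in this work. It only illustrates the general idea: rows are sorted by length inside windows of σ rows, grouped into chunks of C rows, padded to the longest row of each chunk, and stored column-major so that the C rows of a chunk can be processed in lockstep. The function names (`build_sell_c_sigma`, `spmv_sell`), the input representation (a list of `(column, value)` pairs per row), and the choice of column 0 with value 0.0 as padding are assumptions made for this example.

```python
def build_sell_c_sigma(rows, C, sigma):
    """Build an illustrative SELL-C-sigma layout.

    rows: list with one entry per matrix row, each a list of (col, val) pairs.
    Returns (perm, chunk_ptr, chunk_len, cols, vals).
    """
    n_rows = len(rows)
    # Sort rows by descending length inside each window of sigma rows,
    # so rows of similar length end up in the same chunk (less padding).
    perm = []
    for start in range(0, n_rows, sigma):
        window = list(range(start, min(start + sigma, n_rows)))
        window.sort(key=lambda r: len(rows[r]), reverse=True)
        perm.extend(window)

    chunk_ptr, chunk_len, cols, vals = [0], [], [], []
    for start in range(0, n_rows, C):
        chunk_rows = perm[start:start + C]
        width = max(len(rows[r]) for r in chunk_rows)
        chunk_len.append(width)
        # Column-major storage: entry j of all C lanes is contiguous.
        for j in range(width):
            for lane in range(C):
                if lane < len(chunk_rows) and j < len(rows[chunk_rows[lane]]):
                    c, v = rows[chunk_rows[lane]][j]
                else:
                    c, v = 0, 0.0  # padding (assumed convention: col 0, value 0)
                cols.append(c)
                vals.append(v)
        chunk_ptr.append(len(vals))
    return perm, chunk_ptr, chunk_len, cols, vals


def spmv_sell(perm, chunk_ptr, chunk_len, cols, vals, x, C):
    """Compute y = A @ x for a matrix stored by build_sell_c_sigma."""
    n_rows = len(perm)
    y = [0.0] * n_rows
    for k, width in enumerate(chunk_len):
        acc = [0.0] * C
        base = chunk_ptr[k]
        for j in range(width):
            # On a vector machine this inner loop is one vector operation:
            # contiguous loads of vals/cols plus a gather from x.
            for lane in range(C):
                idx = base + j * C + lane
                acc[lane] += vals[idx] * x[cols[idx]]
        # Scatter results back to the original row order.
        for lane in range(C):
            pos = k * C + lane
            if pos < n_rows:
                y[perm[pos]] = acc[lane]
    return y


# Small 4x4 example matrix with rows of length 1, 3, 0 and 2.
example_rows = [
    [(0, 1.0)],
    [(0, 2.0), (1, 3.0), (3, 4.0)],
    [],
    [(2, 5.0), (3, 6.0)],
]
example_x = [1.0, 2.0, 3.0, 4.0]
layout = build_sell_c_sigma(example_rows, C=2, sigma=4)
example_y = spmv_sell(*layout, example_x, C=2)
```

In this sketch, the gather `x[cols[idx]]` is the irregular access the abstract refers to, and the padding entries are wasted work. A larger σ reduces padding by grouping rows of similar length, at the cost of a wider row permutation.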
Citation: Gómez Crespo, C. [et al.]. Optimizing the SpMV kernel on long-vector accelerators. Barcelona Supercomputing Center, 2021, p. 30-31.
Files | Size |
---|---|
BSC_DS-2021-06_Optimizing the SpMV kernel.pdf | 789.8 KB |