Using Arm’s scalable vector extension on stencil codes

Armejach Sanosa, Adrià; Caminal Pallarés, Helena; Cebrián González, Juan Manuel; Langarita, Rubén; González-Alberquilla, Rekai; Adeniyi-Jones, Chris; Valero Cortés, Mateo; Casas, Marc; Moretó Planas, Miquel

doi:10.1007/s11227-019-02842-5

dc.contributor.author	Armejach Sanosa, Adrià
dc.contributor.author	Caminal Pallarés, Helena
dc.contributor.author	Cebrián González, Juan Manuel
dc.contributor.author	Langarita, Rubén
dc.contributor.author	González-Alberquilla, Rekai
dc.contributor.author	Adeniyi-Jones, Chris
dc.contributor.author	Valero Cortés, Mateo
dc.contributor.author	Casas, Marc
dc.contributor.author	Moretó Planas, Miquel
dc.contributor.other	Universitat Politècnica de Catalunya. Departament d'Arquitectura de Computadors
dc.date.accessioned	2019-10-08T07:00:16Z
dc.date.available	2020-04-08T00:25:58Z
dc.date.issued	2020-03
dc.identifier.citation	Armejach, A. [et al.]. Using Arm’s scalable vector extension on stencil codes. "Journal of supercomputing", vol. 76, March 2020, p. 2039-2062.
dc.identifier.issn	0920-8542
dc.identifier.uri	http://hdl.handle.net/2117/169340
dc.description.abstract	Data-level parallelism is frequently ignored or underutilized. Achieved through vector/SIMD capabilities, it can provide substantial performance improvements on top of widely used techniques such as thread-level parallelism. However, manual vectorization is a tedious and costly process that needs to be repeated for each specific instruction set or register size. In addition, automatic compiler vectorization is susceptible to code complexity, and usually limited due to data and control dependencies. To address some of these issues, Arm recently released a new vector ISA, the scalable vector extension (SVE), which is vector-length agnostic (VLA). VLA enables the generation of binary files that run regardless of the physical vector register length. In this paper, we leverage the main characteristics of SVE to implement and optimize stencil computations, ubiquitous in scientific computing. We show that SVE enables easy deployment of textbook optimizations like loop unrolling, loop fusion, load trading or data reuse. Our detailed simulations using vector lengths ranging from 128 to 2048 bits show that these optimizations can lead to performance improvements over straightforward vectorized code of up to 1.57×. In addition, we show that certain optimizations can hurt performance due to reduced arithmetic intensity and instruction overheads, and provide insight useful for compiler optimizers.
dc.format.extent	24 p.
dc.language.iso	eng
dc.subject	Àrees temàtiques de la UPC::Informàtica::Arquitectura de computadors
dc.subject.lcsh	Ubiquitous computing
dc.subject.lcsh	Compilers (Computer programs)
dc.subject.other	Data-level parallelism
dc.subject.other	Scalable vector extension
dc.subject.other	Vector-length agnostic
dc.subject.other	Stencil computations
dc.title	Using Arm’s scalable vector extension on stencil codes
dc.type	Article
dc.subject.lemac	Informàtica ubiqua
dc.subject.lemac	Compiladors (Programes d'ordinador)
dc.contributor.group	Universitat Politècnica de Catalunya. CAP - Grup de Computació d'Altes Prestacions
dc.identifier.doi	10.1007/s11227-019-02842-5
dc.description.peerreviewed	Peer Reviewed
dc.relation.publisherversion	https://link.springer.com/article/10.1007/s11227-019-02842-5
dc.rights.access	Open Access
local.identifier.drac	25169572
dc.description.version	Postprint (author's final draft)
dc.relation.projectid	info:eu-repo/grantAgreement/AEI/RYC-2016-21104
dc.relation.projectid	info:eu-repo/grantAgreement/AGAUR/2017 SGR 1414
dc.relation.projectid	info:eu-repo/grantAgreement/MINECO//TIN2015-65316-P/ES/COMPUTACION DE ALTAS PRESTACIONES VII/
local.citation.author	Armejach, A.; Caminal, H.; Cebrián, J. M.; Langarita, R.; González-Alberquilla, R.; Adeniyi-Jones, C.; Valero, M.; Casas, M.; Moreto, M.
local.citation.publicationName	Journal of supercomputing
local.citation.volume	76
local.citation.startingPage	2039
local.citation.endingPage	2062

Files in this item

Name: sve-jos-postprint.pdf
Size: 777.4 KB
Format: PDF

This item appears in the following collections

Journal articles [1,049]
Journal articles [382]

UPCommons. Portal del coneixement obert de la UPC
