Communication-aware sparse patterns for the factorized approximate inverse preconditioner

Laut Turón, Sergi; Casas, Marc; Borrell Pol, Ricard

doi:10.1145/3502181.3531472

Document type: Conference paper

Publication date: 2022

Publisher: Association for Computing Machinery (ACM)

Access conditions: Open access

All rights reserved. This work is protected by the corresponding intellectual and industrial property rights. Without prejudice to existing legal exemptions, its reproduction, distribution, public communication or transformation is prohibited without the authorization of the rights holder.

Project: DEEP-SEA - DEEP – SOFTWARE FOR EXASCALE ARCHITECTURES (EC-H2020-955606)
EXCELLERAT - The European Centre of Excellence for Engineering Applications (EC-H2020-823691)
MODELOS NUMERICOS Y ALGORITMOS PARA LA SIMULACION DE ALTAS PRESTACIONES EN ANALISIS ESTRUCTURAL (AEI-PID2020-117001GB-I00)

Abstract

The Conjugate Gradient (CG) method is an iterative solver targeting linear systems of equations Ax=b where A is a symmetric and positive definite matrix. CG convergence properties improve when preconditioning is applied to reduce the condition number of matrix A. While many different options can be found in the literature, the Factorized Sparse Approximate Inverse (FSAI) preconditioner constitutes a highly parallel option based on approximating A^{-1}. This paper proposes the Communication-aware Factorized Sparse Approximate Inverse preconditioner (FSAIE-Comm), a method to generate extensions of the FSAI sparse pattern that are not only cache friendly, but also avoid increasing communication costs in distributed memory systems. We also propose a filtering strategy to reduce inter-process imbalance. We evaluate FSAIE-Comm on a heterogeneous set of 39 matrices, achieving an average solution time decrease with respect to FSAI of 17.98%, 26.44% and 16.74% on three different architectures: Intel Skylake, Fujitsu A64FX and AMD Zen 2, respectively. In addition, we consider a set of 8 large matrices running on up to 32,768 CPU cores, and we achieve an average solution time decrease of 12.59%.
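
The abstract does not spell out how an FSAI preconditioner is built or applied. As a rough illustration only, the sketch below uses the standard textbook FSAI construction (a lower-triangular factor G restricted to the lower-triangular sparsity pattern of A, with rows scaled so that diag(G A G^T) = 1) and applies G^T G as the preconditioner inside CG. It is dense, serial, pure Python on a toy matrix. The function names (`fsai`, `pcg`, `solve_dense`) and the test matrix are assumptions made for this example, and it does not implement the pattern extensions or filtering of FSAIE-Comm.

```python
import math

def matvec(M, v):
    return [sum(a * x for a, x in zip(row, v)) for row in M]

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def solve_dense(M, rhs):
    """Solve a small dense system by Gaussian elimination with partial pivoting."""
    n = len(rhs)
    aug = [list(row) + [r] for row, r in zip(M, rhs)]
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(aug[r][c]))
        aug[c], aug[p] = aug[p], aug[c]
        for r in range(c + 1, n):
            f = aug[r][c] / aug[c][c]
            for k in range(c, n + 1):
                aug[r][k] -= f * aug[c][k]
    z = [0.0] * n
    for r in reversed(range(n)):
        s = aug[r][n] - sum(aug[r][k] * z[k] for k in range(r + 1, n))
        z[r] = s / aug[r][r]
    return z

def fsai(A, pattern):
    """Textbook FSAI: lower-triangular G on `pattern` such that G^T G approximates A^{-1}.
    pattern[i] lists the allowed column indices of row i, sorted, ending with i."""
    n = len(A)
    G = [[0.0] * n for _ in range(n)]
    for i, cols in enumerate(pattern):
        sub = [[A[r][c] for c in cols] for r in cols]   # A restricted to the row pattern
        e_last = [0.0] * (len(cols) - 1) + [1.0]
        y = solve_dense(sub, e_last)
        scale = 1.0 / math.sqrt(y[-1])                  # gives (G A G^T)_ii = 1
        for c, v in zip(cols, y):
            G[i][c] = v * scale
    return G

def pcg(A, b, apply_prec, tol=1e-10, max_iter=1000):
    """Preconditioned CG; apply_prec(r) returns an approximation of A^{-1} r."""
    x = [0.0] * len(b)
    r = b[:]
    z = apply_prec(r)
    p = z[:]
    rz = dot(r, z)
    b_norm = math.sqrt(dot(b, b))
    for it in range(1, max_iter + 1):
        Ap = matvec(A, p)
        alpha = rz / dot(p, Ap)
        x = [xi + alpha * pi for xi, pi in zip(x, p)]
        r = [ri - alpha * api for ri, api in zip(r, Ap)]
        if math.sqrt(dot(r, r)) <= tol * b_norm:
            return x, it
        z = apply_prec(r)
        rz_new = dot(r, z)
        p = [zi + (rz_new / rz) * pi for zi, pi in zip(z, p)]
        rz = rz_new
    return x, max_iter

# Toy SPD matrix (assumed for illustration): tridiagonal, diagonally dominant.
n = 30
A = [[0.0] * n for _ in range(n)]
for i in range(n):
    A[i][i] = 2.0 + i
    if i > 0:
        A[i][i - 1] = A[i - 1][i] = -1.0
b = [1.0] * n

# Sparsity pattern of G: the lower-triangular pattern of A itself.
pattern = [[j for j in range(i + 1) if A[i][j] != 0.0] for i in range(n)]
G = fsai(A, pattern)
Gt = [list(col) for col in zip(*G)]

x, iters = pcg(A, b, lambda r: matvec(Gt, matvec(G, r)))   # z = G^T (G r)
x_plain, iters_plain = pcg(A, b, lambda r: r[:])            # unpreconditioned CG
print("iterations with FSAI:", iters, "without:", iters_plain)
```

Applying the preconditioner costs only two matrix-vector products with G and G^T, which is why FSAI parallelizes well. In a distributed setting, the entries allowed in `pattern` determine which remote vector entries each process must receive, which is the cost the communication-aware pattern extension in the paper aims not to increase.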

Citation: Laut, S.; Casas, M.; Borrell, R. Communication-aware sparse patterns for the factorized approximate inverse preconditioner. In: ACM International Symposium on High-Performance Parallel and Distributed Computing. "HPDC '22: proceedings of the 31st International Symposium on High-Performance Parallel and Distributed Computing: June 27–30, 2022, Minneapolis, MN, USA". New York: Association for Computing Machinery (ACM), 2022, p. 148-158. ISBN 978-1-4503-9199-3. DOI 10.1145/3502181.3531472.

URI: http://hdl.handle.net/2117/369464

DOI: 10.1145/3502181.3531472

ISBN: 978-1-4503-9199-3

Publisher's version: https://dl.acm.org/doi/abs/10.1145/3502181.3531472

Files	Description	Size	Format	View
3502181.3531472.pdf		1,435Mb	PDF	View/Open
