Communication-aware sparse patterns for the factorized approximate inverse preconditioner
Cite as:
hdl:2117/369464
Document type: Text in conference proceedings
Publication date: 2022
Publisher: Association for Computing Machinery (ACM)
Access conditions: Open access
All rights reserved. This work is protected by the corresponding intellectual and industrial
property rights. Without prejudice to existing legal exemptions, its reproduction, distribution,
public communication or transformation is prohibited without the authorization of the rights holder.
Project: DEEP-SEA - DEEP – SOFTWARE FOR EXASCALE ARCHITECTURES (EC-H2020-955606)
EXCELLERAT - The European Centre of Excellence for Engineering Applications (EC-H2020-823691)
MODELOS NUMERICOS Y ALGORITMOS PARA LA SIMULACION DE ALTAS PRESTACIONES EN ANALISIS ESTRUCTURAL (AEI-PID2020-117001GB-I00)
Abstract
The Conjugate Gradient (CG) method is an iterative solver targeting linear systems of equations Ax=b where A is a symmetric and positive definite matrix. CG convergence properties improve when preconditioning is applied to reduce the condition number of matrix A. While many different options can be found in the literature, the Factorized Sparse Approximate Inverse (FSAI) preconditioner constitutes a highly parallel option based on approximating A^{-1}. This paper proposes the Communication-aware Factorized Sparse Approximate Inverse preconditioner (FSAIE-Comm), a method to generate extensions of the FSAI sparse pattern that are not only cache friendly, but also avoid increasing communication costs in distributed memory systems. We also propose a filtering strategy to reduce inter-process imbalance. We evaluate FSAIE-Comm on a heterogeneous set of 39 matrices, achieving average solution time decreases of 17.98%, 26.44% and 16.74% with respect to FSAI on three different architectures: Intel Skylake, Fujitsu A64FX and AMD Zen 2, respectively. In addition, we consider a set of 8 large matrices running on up to 32,768 CPU cores, and we achieve an average solution time decrease of 12.59%.
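To make the abstract's setting concrete, the following is a minimal sketch of CG preconditioned with a textbook static-pattern FSAI, in which a lower-triangular factor G is built on the lower-triangular pattern of A so that G^T G approximates A^{-1}. It is not the FSAIE-Comm method from the paper: it includes no pattern extension, no communication awareness and no filtering, and it uses dense Python lists purely for illustration. The function names (`solve_dense`, `fsai_factor`, `pcg`) are hypothetical and do not come from the paper.

```python
import math

def solve_dense(M, rhs):
    """Solve M y = rhs by Gaussian elimination (M assumed SPD, so no pivoting)."""
    n = len(rhs)
    M = [row[:] for row in M]
    y = rhs[:]
    for k in range(n):
        for i in range(k + 1, n):
            f = M[i][k] / M[k][k]
            for j in range(k, n):
                M[i][j] -= f * M[k][j]
            y[i] -= f * y[k]
    for i in reversed(range(n)):
        s = sum(M[i][j] * y[j] for j in range(i + 1, n))
        y[i] = (y[i] - s) / M[i][i]
    return y

def fsai_factor(A):
    """Build lower-triangular G on the lower-triangular nonzero pattern of A."""
    n = len(A)
    G = [[0.0] * n for _ in range(n)]
    for i in range(n):
        # Row pattern: nonzeros of row i up to the diagonal (last index is i).
        S = [j for j in range(i + 1) if A[i][j] != 0.0]
        sub = [[A[r][c] for c in S] for r in S]
        e = [0.0] * len(S)
        e[-1] = 1.0
        y = solve_dense(sub, e)           # small independent system per row
        scale = 1.0 / math.sqrt(y[-1])    # makes diag(G A G^T) equal to 1
        for j, v in zip(S, y):
            G[i][j] = v * scale
    return G

def matvec(M, x):
    return [sum(m * xi for m, xi in zip(row, x)) for row in M]

def matvec_t(M, x):
    """Compute M^T x."""
    return [sum(M[i][j] * x[i] for i in range(len(x))) for j in range(len(M[0]))]

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def pcg(A, b, G=None, tol=1e-10, max_iter=1000):
    """Preconditioned CG; the preconditioner is z = G^T (G r), or identity if G is None."""
    if G is not None:
        apply_m = lambda r: matvec_t(G, matvec(G, r))
    else:
        apply_m = lambda r: r[:]
    x = [0.0] * len(b)
    r = b[:]
    z = apply_m(r)
    p = z[:]
    rz = dot(r, z)
    bnorm = math.sqrt(dot(b, b))
    for it in range(max_iter):
        if math.sqrt(dot(r, r)) <= tol * bnorm:
            return x, it
        Ap = matvec(A, p)
        alpha = rz / dot(p, Ap)
        x = [xi + alpha * pi for xi, pi in zip(x, p)]
        r = [ri - alpha * api for ri, api in zip(r, Ap)]
        z = apply_m(r)
        rz_new = dot(r, z)
        p = [zi + (rz_new / rz) * pi for zi, pi in zip(z, p)]
        rz = rz_new
    return x, max_iter
```

Each row of G comes from its own small dense system, so the rows can be computed independently, and applying the preconditioner needs only two matrix-vector products. These two properties are the usual reason FSAI is described as highly parallel. A call such as `pcg(A, b, fsai_factor(A))` returns the approximate solution together with the number of iterations performed.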
Citation: Laut, S.; Casas, M.; Borrell, R. Communication-aware sparse patterns for the factorized approximate inverse preconditioner. In: ACM International Symposium on High-Performance Parallel and Distributed Computing. "HPDC '22: proceedings of the 31st International Symposium on High-Performance Parallel and Distributed Computing: June 27–30, 2022, Minneapolis, MN, USA". New York: Association for Computing Machinery (ACM), 2022, p. 148-158. ISBN 978-1-4503-9199-3. DOI 10.1145/3502181.3531472.
ISBN: 978-1-4503-9199-3
Publisher's version: https://dl.acm.org/doi/abs/10.1145/3502181.3531472
Files | Description | Size | Format | View |
---|---|---|---|---|
3502181.3531472.pdf | | 1.435 Mb | | View/Open |