Benefits of SMT and of Parallel Transpose Algorithm for the Large-Scale GYSELA Application
Document type: Text in conference proceedings
Publisher: Association for Computing Machinery
Access conditions: Open access
European Commission project: POP - Performance Optimisation and Productivity (EC-H2020-676553)
EoCoE - Energy oriented Centre of Excellence for computer applications (EC-H2020-676629)
This article describes how we increased the performance and extended the features of a large parallel application through the use of simultaneous multithreading (SMT) and by designing a robust parallel transpose algorithm. The semi-Lagrangian code Gysela typically performs large physics simulations using a few thousand cores, from 1k up to 16k cores on x86-based clusters. However, simulations with finer resolutions and with kinetic electrons increase those needs by a huge factor, providing a good example of applications requiring Exascale machines. To improve Gysela compute times, we take advantage of the efficient SMT implementations available on recent Intel architectures. We also analyze the cost of a transposition communication scheme that, in our case, involves a large number of cores. Adapting the code for load balance when using SMT, combined with a good deployment strategy, led to a significant reduction of execution times, by up to 38%.
Citation: Latu, G. [et al.]. Benefits of SMT and of Parallel Transpose Algorithm for the Large-Scale GYSELA Application. In: "PASC '16 Proceedings of the Platform for Advanced Scientific Computing Conference". Association for Computing Machinery, 2016.
Publisher's version: http://dl.acm.org/citation.cfm?id=2929912