Advanced Performance Analysis of HPC Workloads on Cavium ThunderX
Cite as:
hdl:2117/107063
Document type: Conference lecture
Defense date: 2018
Publisher: IEEE
Rights access: Open Access
All rights reserved. This work is protected by the corresponding intellectual and industrial
property rights. Without prejudice to any existing legal exemptions, reproduction, distribution, public
communication or transformation of this work are prohibited without permission of the copyright holder.
Project: MONT-BLANC - Mont-Blanc, European scalable and power efficient HPC platform based on low-power embedded technology (EC-FP7-288777)
MONT-BLANC 2 - Mont-Blanc 2, European scalable and power efficient HPC platform based on low-power embedded technology (EC-FP7-610402)
Mont-Blanc 3 - Mont-Blanc 3, European scalable and power efficient HPC platform based on low-power embedded technology (EC-H2020-671697)
Abstract
Interest in Arm-based platforms as HPC solutions has increased significantly over the last five years. In this paper we show that, in contrast to the early days of pioneering tests, several application performance analysis techniques can now be applied to Arm-based SoCs as well. To show the possibilities offered by the available tools, we present as an example the analysis of a Lattice Boltzmann HPC production code, highly optimized for several architectures and now also ported to Armv8. We tested it on a system based on production silicon, the Cavium CN8890 SoC. As performance analysis tools we adopt Extrae and Paraver, making use of the PAPI support that we initially developed for the ThunderX platform and that is now also available upstream. The contribution of this paper is twofold: first, we demonstrate that the performance analysis tools available on standard HPC platforms, independently of the CPU vendor, are now also available for Arm SoCs; second, we optimize an HPC application for this platform, showing similarities with other architectures.
Citation: E. Calore, F. Mantovani and D. Ruiz, "Advanced Performance Analysis of HPC Workloads on Cavium ThunderX," 2018 International Conference on High Performance Computing & Simulation (HPCS), Orleans, France, 2018, pp. 375-382.
doi: 10.1109/HPCS.2018.00068
ISBN: 978-1-5386-7879-4
Publisher version: https://ieeexplore.ieee.org/document/8514373
Files | Description | Size | Format | View |
---|---|---|---|---|
Advanced performance analysis of HPC workloads.pdf | | 1,215Mb | PDF | View/Open |