HPC benchmarking: scaling right and looking beyond the average
Document type: Conference report
Rights access: Open Access
European Commission's project: ExaNoDe - European Exascale Processor Memory Node Design (EC-H2020-671578)
Designing a balanced HPC system requires an understanding of the dominant performance bottlenecks. There is as yet no well-established methodology for a unified evaluation of HPC systems and workloads that quantifies the main performance bottlenecks. In this paper, we execute seven production HPC applications on a production HPC platform and analyse the key performance bottlenecks, namely FLOPS performance and memory bandwidth congestion, together with their implications for scaling out. We show that the results depend significantly on the number of execution processes and on the granularity of the measurements. We therefore advocate that application suites provide guidance on selecting a representative scale for the experiments. We also propose that FLOPS performance and memory bandwidth be reported as the proportions of time spent at low, moderate and severe utilization. We show that this gives much more precise and actionable evidence than the average.
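The proposal to report utilization as proportions of time, rather than as a single average, can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration and is not taken from the abstract: the function name `utilization_profile`, the band thresholds (0.4 and 0.8 of peak), the made-up sample values, and the premise that samples cover equal-length time intervals, so that the share of samples equals the share of time.

```python
from collections import Counter


def utilization_profile(samples, low_max=0.4, moderate_max=0.8):
    """Return the fraction of samples in each utilization band.

    `samples` are utilization values as fractions of peak (0.0 to 1.0),
    assumed to come from equal-length measurement intervals.
    The thresholds are hypothetical defaults, not values from the paper.
    """
    if not samples:
        raise ValueError("samples must be non-empty")
    counts = Counter()
    for u in samples:
        if u < low_max:
            counts["low"] += 1
        elif u < moderate_max:
            counts["moderate"] += 1
        else:
            counts["severe"] += 1
    n = len(samples)
    return {band: counts[band] / n for band in ("low", "moderate", "severe")}


# Made-up memory bandwidth samples: mostly idle, with bursts near peak.
samples = [0.1, 0.1, 0.1, 0.1, 0.1, 0.1, 0.9, 0.9, 0.95, 0.95]

average = sum(samples) / len(samples)
profile = utilization_profile(samples)

print(f"average utilization: {average:.2f}")
print(f"profile: {profile}")
```

In this made-up trace the average falls in the moderate band although no individual sample does: the time is split between low and severe utilization, which is the kind of detail a single average hides.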
The final publication is available at Springer via http://dx.doi.org/10.1007/978-3-319-96983-1_10
Citation: Radulovic, M., Asifuzzaman, K., Carpenter, P., Radojkovic, P., Ayguade, E. HPC benchmarking: scaling right and looking beyond the average. In: International European Conference on Parallel and Distributed Computing. "Euro-Par 2018: Parallel Processing 24th International Conference on Parallel and Distributed Computing: Turin, Italy: August 27-31, 2018: proceedings". Berlin: Springer, 2018, p. 135-146.
All rights reserved. This work is protected by the corresponding intellectual and industrial property rights. Without prejudice to any existing legal exemptions, reproduction, distribution, public communication or transformation of this work is prohibited without permission of the copyright holder.