Memory demands in disaggregated HPC: How accurate do we need to be?
10.1109/PMBS54543.2021.00006
Cite as:
hdl:2117/364331
Document type: Conference proceedings paper
Publication date: 2021
Publisher: Institute of Electrical and Electronics Engineers (IEEE)
Access conditions: Open access
All rights reserved. This work is protected by the corresponding intellectual and industrial
property rights. Without prejudice to existing legal exemptions, its reproduction, distribution,
public communication or transformation is prohibited without the authorization of the rights holder.
Project: EuroEXA - Co-designed Innovation and System for Resilient Exascale Computing in Europe: From Applications to Silicon (EC-H2020-754337)
COMPUTACION DE ALTAS PRESTACIONES VII (MINECO-TIN2015-65316-P)
Abstract
Disaggregated memory has recently been proposed as a way to allow flexible and fine-grained allocation of memory capacity, mitigating the mismatch between fixed per-node resource provisioning and the needs of the submitted jobs. By allowing memory capacity to be shared among cluster nodes, overall HPC system throughput can be improved, because fewer resources are left stranded or underutilized. A key parameter that the user is generally expected to provide at submission time is the job's memory capacity demand. It is unrealistic to expect this number to be precise. This paper takes an important step towards understanding the effect of overestimating job memory requirements. We analyse the implications for overall system throughput and job response time. We leverage a disaggregated simulation infrastructure implemented on the popular Slurm resource manager. Our results show that even when a 60% increase in memory demands increases a single job's user response time by only 8%, the aggregate result of everybody doing so can be a 25% reduction in throughput and a fivefold increase in response time. These results show that GB-hours should be explicitly allocated in addition to core-hours.
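The closing recommendation of the abstract can be illustrated with a little arithmetic. The sketch below is not taken from the paper: the function names, the example job (48 cores, 100 GB actually used, 2 hours) and the accounting rule (requested GB multiplied by wall-clock hours) are all assumptions chosen for illustration. It shows only that, under such a rule, a 60% overestimate is invisible in core-hours but raises the GB-hours charged by the same 60%.

```python
def core_hours(cores: int, hours: float) -> float:
    """Core-hours charged: cores held multiplied by wall-clock hours."""
    return cores * hours


def gb_hours(requested_gb: float, hours: float) -> float:
    """GB-hours charged (assumed rule): requested memory multiplied by wall-clock hours."""
    return requested_gb * hours


def overestimated_request(actual_gb: float, overestimate: float) -> float:
    """Memory request when the user pads actual usage by a fraction (0.6 means 60%)."""
    return actual_gb * (1.0 + overestimate)


# Hypothetical job: all three values are illustrative assumptions.
CORES, ACTUAL_GB, HOURS = 48, 100.0, 2.0

accurate = gb_hours(ACTUAL_GB, HOURS)
padded = gb_hours(overestimated_request(ACTUAL_GB, 0.6), HOURS)

print(f"core-hours (unchanged by the memory request): {core_hours(CORES, HOURS):.1f}")
print(f"GB-hours with an accurate request: {accurate:.1f}")
print(f"GB-hours with a 60% overestimate:  {padded:.1f}")
```

With core-hours alone the padded request carries no cost for the user, which is the incentive problem the abstract points to. Charging GB-hours makes the overestimate visible in the user's own budget.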
Citation: Vieira, F.; Carpenter, P.; Petrucci, V. Memory demands in disaggregated HPC: How accurate do we need to be? In: International Workshop on Performance Modeling, Benchmarking and Simulation of High Performance Computer Systems. "Proceedings of PMBS 2021: Performance Modeling, Benchmarking and Simulation of High Performance Computer Systems: held in conjunction with SC21: The International Conference for High Performance Computing, Networking, Storage and Analysis: St. Louis, Missouri, USA, November 14-19, 2021". Institute of Electrical and Electronics Engineers (IEEE), p. 1-6. ISBN 978-1-6654-1118-9. DOI 10.1109/PMBS54543.2021.00006.
ISBN: 978-1-6654-1118-9
Publisher's version: https://ieeexplore.ieee.org/document/9652672
Files | Size |
---|---|
PMBS_2021_short_paper_Camera_Ready.pdf | 360.0 KB |