DROM: Enabling Efficient and Effortless Malleability for Resource Managers
Document type: Conference lecture
Publisher: Association for Computing Machinery (ACM)
Rights access: Open Access
European Commission's project: HBP SGA2 - Human Brain Project Specific Grant Agreement 2 (EC-H2020-785907)
In the design of future HPC systems, research in resource management is showing increasing interest in more dynamic control of the available resources. It has been shown that enabling jobs to change their number of computing resources at run time, i.e. their malleability, can significantly improve HPC system performance. However, job schedulers and applications typically do not support malleability, due to the common belief that it introduces additional programming complexity and a performance impact. This paper presents DROM, an interface that provides efficient malleability with no effort for program developers. The running application is enabled to adapt its number of threads to the number of assigned computing resources in a way that is completely transparent to the user, through the integration of DROM with standard programming models such as OpenMP/OmpSs and MPI. We designed the APIs to be easily used by any programming model, application, and job scheduler or resource manager. Our experimental results from the analysis of two realistic use cases, based on malleability obtained by reducing the number of cores a job uses per node and on job co-allocation, show the potential of DROM for improving the performance of HPC systems. In particular, the workload of two MPI+OpenMP neuro-simulators is tested, reporting improvements in system metrics such as total run time and average response time of up to 8% and 48%, respectively.
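The idea of adapting the thread count to the assigned resources can be illustrated with a minimal sketch. It does not use the DROM interface or any real resource manager: `AssignedCpus`, `run_malleable` and `demo` are hypothetical names introduced here, the resource manager is simulated in-process, and the sketch assumes the application checks its assignment only between parallel phases.

```python
from concurrent.futures import ThreadPoolExecutor
import threading


class AssignedCpus:
    """Hypothetical stand-in for the CPU set a resource manager grants a job."""

    def __init__(self, cpus):
        self._lock = threading.Lock()
        self._cpus = frozenset(cpus)

    def set(self, cpus):
        # Called from the (simulated) resource manager side.
        with self._lock:
            self._cpus = frozenset(cpus)

    def get(self):
        # Called from the application side.
        with self._lock:
            return self._cpus


def run_malleable(assigned, work_chunks):
    """Run one parallel phase per chunk, sizing the pool to the current assignment.

    Returns the thread count used in each phase and the per-phase results.
    """
    threads_per_phase = []
    results = []
    for chunk in work_chunks:
        # Assumed malleability point: re-read the assignment before each phase.
        n_threads = max(1, len(assigned.get()))
        threads_per_phase.append(n_threads)
        with ThreadPoolExecutor(max_workers=n_threads) as pool:
            results.append(sum(pool.map(lambda x: x * x, chunk)))
    return threads_per_phase, results


def demo():
    assigned = AssignedCpus(range(8))

    def chunks():
        yield range(100)
        assigned.set(range(4))  # simulated manager shrinks the job to co-allocate another
        yield range(100)
        assigned.set(range(6))  # simulated manager gives some cores back
        yield range(100)

    return run_malleable(assigned, chunks())
```

The application code in `run_malleable` never decides how many threads to use; it only follows the assignment, so the computed results are the same whatever the simulated manager grants.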
Citation: D'Amico, M. [et al.]. DROM: Enabling Efficient and Effortless Malleability for Resource Managers. In: "ICPP '18 Proceedings of the 47th International Conference on Parallel Processing Companion". Association for Computing Machinery (ACM), 2018.