To distribute or not to distribute: The question of load balancing for performance or energy
Document type: Conference report
Rights access: Open Access
European Commission's project: Mont-Blanc 3 - Mont-Blanc 3, European scalable and power efficient HPC platform based on low-power embedded technology (EC-H2020-671697)
ROMOL - Riding on Moore's Law (EC-FP7-321253)
Heterogeneous systems are nowadays a common choice on the path to Exascale. Through the use of accelerators, they offer outstanding energy efficiency. These devices are programmed with the host-device model, which is suboptimal because the CPU remains idle during kernel executions but still consumes energy. Making the CPU contribute computing effort might improve the performance and energy consumption of the system. This paper analyses the advantages of this approach and sets the limits of when it is beneficial. The claims are supported by a set of models that determine how to share a single data-parallel task between the CPU and the accelerator for optimum performance, energy consumption or efficiency. Interestingly, the models show that optimising performance does not always yield optimum energy or efficiency as well. The paper experimentally validates the models, which represent an invaluable tool for programmers faced with the dilemma of whether to distribute their workload in these systems.
Citation: Stafford, E., Pérez, B., Bosque, J., Beivide, R., Valero, M. To distribute or not to distribute: The question of load balancing for performance or energy. In: International European Conference on Parallel and Distributed Computing. "Euro-Par 2017: Parallel Processing: 23rd International Conference on Parallel and Distributed Computing: Santiago de Compostela, Spain, August 28–September 1, 2017: proceedings". Santiago de Compostela: Springer, 2017, p. 710-722.