Extending OmpSs for OpenCL kernel co-execution in heterogeneous systems
Cite as:
hdl:2117/115527
Document type: Conference report
Publication date: 2017
Publisher: Institute of Electrical and Electronics Engineers (IEEE)
Access conditions: Open access
All rights reserved. This work is protected by the corresponding intellectual and industrial
property rights. Without prejudice to existing legal exemptions, its reproduction, distribution,
public communication or transformation is prohibited without the authorization of the rights holder.
Project: COMPUTACION DE ALTAS PRESTACIONES VII (MINECO-TIN2015-65316-P)
WHEY2VALUE - Whey2Value: Valorising waste whey into high-value products (EC-H2020-697958)
MONT-BLANC 2 - Mont-Blanc 2, European scalable and power efficient HPC platform based on low-power embedded technology (EC-FP7-610402)
Mont-Blanc 3 - Mont-Blanc 3, European scalable and power efficient HPC platform based on low-power embedded technology (EC-H2020-671697)
ROMOL - Riding on Moore's Law (EC-FP7-321253)
Abstract
Heterogeneous systems offer very high potential performance but are difficult to program. OmpSs is a well-known framework for task-based parallel applications and an interesting tool for simplifying the programming of these systems. However, it does not support the co-execution of a single OpenCL kernel instance on several compute devices. To overcome this limitation, this paper presents an extension of the OmpSs framework that addresses two main objectives: the automatic division of datasets among several devices and the management of their memory address spaces. To adapt to different kinds of applications, the data division can be performed by the novel HGuided load balancing algorithm or by the well-known Static and Dynamic algorithms. All this is accomplished with negligible impact on the programming. Experimental results reveal that there is always one load balancing algorithm that improves the performance and energy consumption of the system.
Description
© 2017 IEEE. Personal use of this material is permitted. Permission from IEEE must be obtained for all other uses, in any current or future media, including reprinting/republishing this material for advertising or promotional purposes, creating new collective works, for resale or redistribution to servers or lists, or reuse of any copyrighted component of this work in other works.
Citation: Pérez, B., Stafford, E., Bosque, J., Beivide, R., Mateo, S., Teruel, X., Martorell, X., Ayguade, E. Extending OmpSs for OpenCL kernel co-execution in heterogeneous systems. In: International Symposium on Computer Architecture and High Performance Computing. "29th International Symposium on Computer Architecture and High Performance Computing, SBAC-PAD 2017: 17-20 October 2017, Campinas, SP, Brazil: proceedings". Institute of Electrical and Electronics Engineers (IEEE), 2017, p. 1-8.
ISBN: 978-1-5090-1233-6
Publisher version: http://ieeexplore.ieee.org/document/8102171/
Files | Description | Size | Format |
---|---|---|---|
Extending OmpSs for OpenCL kernel co-execution.pdf | | 277.1 KB | PDF |