AMA: asynchronous management of accelerators for task-based programming models
DOI: 10.1016/j.procs.2015.05.212
Cite as: hdl:2117/104266
Document type: Text in conference proceedings
Publication date: 2015
Publisher: Elsevier
Access conditions: Open access
Unless otherwise indicated, the contents of this work are subject to the Creative Commons license Attribution-NonCommercial-NoDerivs 3.0 Spain.
Abstract
In recent years, computational science has benefited from emerging accelerators that increase the performance of scientific simulations, but these devices make programming harder. This paper presents AMA, a set of optimization techniques for efficiently managing multi-accelerator systems. AMA maximizes the overlap of computation and communication without blocking, so the time otherwise spent waiting for device operations can be used for other work. AMA is implemented on top of a task-based framework, and its experimental evaluation on a quad-GPU node shows that we reach the performance of hand-tuned native CUDA code while fully hiding device management. In the best cases, we also obtain a speed-up of more than 2x over the original framework implementation.
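The non-blocking idea in the abstract can be illustrated with a minimal sketch. This is not AMA's implementation or the API of its underlying framework: `fake_device_op` and `run_with_overlap` are hypothetical names, and a thread-pool future stands in for an asynchronous device transfer or kernel. The host checks for completion without blocking and does placeholder work while the operation is pending.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def fake_device_op(seconds):
    """Hypothetical stand-in for an asynchronous transfer or kernel launch."""
    time.sleep(seconds)
    return "done"


def run_with_overlap(op_seconds=0.05):
    """Poll a pending operation without blocking; count host work done meanwhile."""
    host_work_units = 0
    with ThreadPoolExecutor(max_workers=1) as device:
        pending = device.submit(fake_device_op, op_seconds)
        # done() returns immediately, so the host thread is never stalled
        # waiting for the simulated device operation.
        while not pending.done():
            host_work_units += 1   # placeholder for other useful host work
            time.sleep(0.001)      # keep the sketch from spinning at full speed
        return pending.result(), host_work_units


if __name__ == "__main__":
    result, units = run_with_overlap()
    print(result, "after", units, "units of host work")
```

On a real GPU the completion check would be a non-blocking query on a device event or stream rather than a Python future; the sketch only shows the control flow of polling instead of waiting.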
Citation: Planas, J., Badia, R.M., Ayguadé, E., Labarta, J. AMA: asynchronous management of accelerators for task-based programming models. In: International Conference on Computational Science. "Procedia Computer Science (Vol. 51, 2015)". Reykjavík: Elsevier, 2015, p. 130-139.
ISBN: 1877-0509
Publisher's version: http://www.sciencedirect.com/science/article/pii/S1877050915010200
Files | Description | Size | Format | View |
---|---|---|---|---|
1-s2.0-S1877050915010200-main.pdf | | 1.275 MB | PDF | View/Open |