Distributed training strategies for a computer vision deep learning algorithm on a distributed GPU cluster
10.1016/j.procs.2017.05.074
Cite as: hdl:2117/107590
Document type: Article
Publication date: 2017
Publisher: Elsevier
Access conditions: Open access
Unless otherwise indicated, the contents of this work are subject to the Creative Commons license: Attribution-NonCommercial-NoDerivs 3.0 Spain
Abstract
Deep learning algorithms base their success on building high learning capacity models with millions of parameters that are tuned in a data-driven fashion. These models are trained by processing millions of examples, so that the development of more accurate algorithms is usually limited by the throughput of the computing devices on which they are trained. In this work, we explore how the training of a state-of-the-art neural network for computer vision can be parallelized on a distributed GPU cluster. The effect of distributing the training process is addressed from two different points of view. First, the scalability of the task and its performance in the distributed setting are analyzed. Second, the impact of distributed training methods on the final accuracy of the models is studied.
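As an illustration of the kind of distributed training the abstract describes, the sketch below simulates synchronous data-parallel SGD, a common strategy for multi-GPU clusters: each worker computes gradients on its own shard of a mini-batch, the gradients are averaged (an all-reduce step), and every replica applies the identical update. This is a minimal pure-NumPy simulation of the communication pattern, not the authors' implementation; the linear model, shard layout, and learning rate are illustrative assumptions.

```python
import numpy as np

def local_gradient(w, x_shard, y_shard):
    # Gradient of mean squared error for a linear model pred = w * x,
    # computed by one simulated worker on its local data shard.
    pred = x_shard * w
    return np.mean(2.0 * (pred - y_shard) * x_shard)

def sync_sgd_step(w, shards, lr=0.1):
    # Each simulated "GPU" computes a gradient on its shard ...
    grads = [local_gradient(w, x, y) for x, y in shards]
    # ... then an all-reduce averages them, and every replica applies
    # the same update, so all model copies stay identical.
    g = sum(grads) / len(grads)
    return w - lr * g

# Toy data: y = 3x, split evenly across 4 simulated workers.
rng = np.random.default_rng(0)
x = rng.normal(size=64)
y = 3.0 * x
shards = [(x[i::4], y[i::4]) for i in range(4)]

w = 0.0
for _ in range(200):
    w = sync_sgd_step(w, shards)
print(round(w, 3))  # converges toward the true weight 3.0
```

Because every replica sees the same averaged gradient, this synchronous scheme is mathematically equivalent to large-batch SGD on a single device, which is why (as the abstract notes) distribution can affect final accuracy through the effective batch size rather than through divergence between replicas.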
Citation: Campos, V., Sastre, F., Yagües, M., Bellver, M., Giro, X., Torres, J. Distributed training strategies for a computer vision deep learning algorithm on a distributed GPU cluster. "Procedia computer science", 2017, vol. 108, p. 315-324.
ISSN: 1877-0509
Publisher's version: http://www.sciencedirect.com/science/article/pii/S1877050917306129
Files | Description | Size | Format | View
---|---|---|---|---
Distributed tra ... a computer vision deep.pdf | | 536.5 KB | | View/Open