Show simple item record

dc.contributor: Torres Viñals, Jordi
dc.contributor.author: Sastre Cabot, Francesc
dc.contributor.other: Universitat Politècnica de Catalunya. Departament d'Arquitectura de Computadors
dc.date.accessioned: 2017-07-14T07:02:38Z
dc.date.available: 2017-07-14T07:02:38Z
dc.date.issued: 2017-04-28
dc.identifier.uri: http://hdl.handle.net/2117/106390
dc.description.abstract: Deep learning algorithms base their success on building high-capacity models with millions of parameters that are tuned in a data-driven fashion. These models are trained by processing millions of examples, so the development of more accurate algorithms is usually limited by the throughput of the computing devices on which they are trained. This project shows how the training of a state-of-the-art neural network for computer vision can be parallelized on a distributed GPU cluster, the Minotauro cluster at the Barcelona Supercomputing Center, using the TensorFlow framework. Two approaches to distributed training are used: synchronous and mixed-asynchronous. The effect of distributing the training process is addressed from two points of view. First, the scalability of the task and its performance in the distributed setting are analyzed. Second, the impact of distributed training methods on the final accuracy of the models is studied. The results show improvements in both areas. On the one hand, the experiments show that a neural network can be trained faster: the training time decreases from 106 hours to 16 hours with the mixed-asynchronous method and to 12 hours with the synchronous method. On the other hand, increasing the number of GPUs in one node raises the throughput (images per second) in a near-linear way. Moreover, with the synchronous methods the accuracy is maintained at the level of single-node training.
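The synchronous approach the abstract refers to can be illustrated with a minimal conceptual sketch (not the thesis code, and deliberately framework-free): each worker computes a gradient on its own data shard against the same parameter value, the gradients are averaged, and a single update is applied per step. All names and the toy regression task below are illustrative assumptions.

```python
# Conceptual sketch of synchronous data-parallel SGD: 4 simulated
# "workers" each hold a shard of a toy dataset y = 3x, compute a
# gradient on the shared parameter w, and the averaged gradient drives
# one update per step (as in synchronous distributed training).

# Toy dataset split into 4 equal shards, one per simulated worker.
TRUE_W = 3.0
data = [(x, TRUE_W * x) for x in [i / 10 for i in range(1, 41)]]
shards = [data[i::4] for i in range(4)]

def gradient(w, shard):
    """Mean gradient of the squared error 0.5*(w*x - y)**2 over one shard."""
    return sum((w * x - y) * x for x, y in shard) / len(shard)

w = 0.0   # shared parameter, identical on every worker each step
lr = 0.1
for step in range(100):
    grads = [gradient(w, s) for s in shards]  # all workers use the same w
    w -= lr * sum(grads) / len(grads)         # average, then one update

print(round(w, 3))  # converges to ~3.0
```

In the asynchronous variant, by contrast, each worker would apply its gradient to the shared parameter as soon as it finishes, possibly against a value that other workers have already moved; this removes the per-step barrier (higher throughput) at the cost of stale gradients, which is the accuracy trade-off the abstract examines.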
dc.language.iso: eng
dc.publisher: Universitat Politècnica de Catalunya
dc.subject: Àrees temàtiques de la UPC::Informàtica
dc.subject.lcsh: Computational grids (Computer systems)
dc.subject.lcsh: Machine learning
dc.subject.lcsh: Neural networks (Computer science)
dc.subject.other: Sistemes paral·lels
dc.subject.other: Deep Learning
dc.subject.other: Xarxes neuronals convolucionals
dc.subject.other: Visió per computador
dc.subject.other: Unitat de processament gràfic
dc.subject.other: TensorFlow
dc.subject.other: Computadores d'altes prestacions
dc.subject.other: Minotauro
dc.subject.other: BSC
dc.subject.other: Parallel Systems
dc.subject.other: Convolutional Neural Networks
dc.subject.other: Computer Vision
dc.subject.other: Graphic Processing Unit
dc.subject.other: High Performance Computers
dc.title: Scalability study of Deep Learning algorithms in high performance computer infrastructures
dc.title.alternative: Estudi d'escalabilitat d'algorismes Deep Learning a infraestructures de computació d'altes prestacions
dc.type: Master thesis
dc.subject.lemac: Computació distribuïda
dc.subject.lemac: Aprenentatge automàtic
dc.subject.lemac: Xarxes neuronals (Informàtica)
dc.identifier.slug: 122772
dc.rights.access: Open Access
dc.date.updated: 2017-05-17T04:00:10Z
dc.audience.educationlevel: Màster
dc.audience.mediator: Facultat d'Informàtica de Barcelona
dc.audience.degree: MÀSTER UNIVERSITARI EN ENGINYERIA INFORMÀTICA (Pla 2012)
dc.contributor.covenantee: Barcelona Supercomputing Center


Files in this item


This item appears in the following collection(s)
