Show simple item record

dc.contributor: Torres Viñals, Jordi
dc.contributor.author: Sastre Cabot, Francesc
dc.contributor.other: Universitat Politècnica de Catalunya. Departament d'Arquitectura de Computadors
dc.date.accessioned: 2017-07-14T07:02:38Z
dc.date.available: 2017-07-14T07:02:38Z
dc.date.issued: 2017-04-28
dc.identifier.uri: http://hdl.handle.net/2117/106390
dc.description.abstract: Deep learning algorithms base their success on building high-capacity models with millions of parameters that are tuned in a data-driven fashion. These models are trained by processing millions of examples, so the development of more accurate algorithms is usually limited by the throughput of the computing devices on which they are trained. This project shows how the training of a state-of-the-art neural network for computer vision can be parallelized on a distributed GPU cluster, the Minotauro cluster at the Barcelona Supercomputing Center, using the TensorFlow framework. Two approaches to distributed training are used: synchronous and mixed-asynchronous. The effect of distributing the training process is addressed from two points of view. First, the scalability of the task and its performance in the distributed setting are analyzed. Second, the impact of distributed training methods on the final accuracy of the models is studied. The results show improvements in both areas. On the one hand, the experiments show that a neural network can be trained faster: the training time decreases from 106 hours to 16 hours with the mixed-asynchronous method and to 12 hours with the synchronous method. On the other hand, increasing the number of GPUs in one node raises the throughput (images per second) in a near-linear way. Moreover, with the synchronous methods the accuracy is maintained at the level of single-node training.
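The synchronous approach the abstract refers to can be illustrated with a minimal conceptual sketch (not the thesis code, and deliberately framework-free): each worker computes a gradient on its own data shard against the same parameter value, the gradients are averaged, and a single update is applied per step. All names and the toy regression task below are illustrative assumptions.

```python
# Conceptual sketch of synchronous data-parallel SGD: 4 simulated
# "workers" each hold a shard of a toy dataset y = 3x, compute a
# gradient on the shared parameter w, and the averaged gradient drives
# one update per step (as in synchronous distributed training).

# Toy dataset split into 4 equal shards, one per simulated worker.
TRUE_W = 3.0
data = [(x, TRUE_W * x) for x in [i / 10 for i in range(1, 41)]]
shards = [data[i::4] for i in range(4)]

def gradient(w, shard):
    """Mean gradient of the squared error 0.5*(w*x - y)**2 over one shard."""
    return sum((w * x - y) * x for x, y in shard) / len(shard)

w = 0.0   # shared parameter, identical on every worker each step
lr = 0.1
for step in range(100):
    grads = [gradient(w, s) for s in shards]  # all workers use the same w
    w -= lr * sum(grads) / len(grads)         # average, then one update

print(round(w, 3))  # converges to ~3.0
```

In the asynchronous variant, by contrast, each worker would apply its gradient to the shared parameter as soon as it finishes, possibly against a value that other workers have already moved; this removes the per-step barrier (higher throughput) at the cost of stale gradients, which is the accuracy trade-off the abstract examines.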
dc.language.iso: eng
dc.publisher: Universitat Politècnica de Catalunya
dc.subject: Àrees temàtiques de la UPC::Informàtica
dc.subject.lcsh: Computational grids (Computer systems)
dc.subject.lcsh: Machine learning
dc.subject.lcsh: Neural networks (Computer science)
dc.subject.other: Sistemes paral·lels
dc.subject.other: Deep Learning
dc.subject.other: Xarxes neuronals convolucionals
dc.subject.other: Visió per computador
dc.subject.other: Unitat de processament gràfic
dc.subject.other: TensorFlow
dc.subject.other: Computadores d'altes prestacions
dc.subject.other: Minotauro
dc.subject.other: BSC
dc.subject.other: Parallel Systems
dc.subject.other: Convolutional Neural Networks
dc.subject.other: Computer Vision
dc.subject.other: Graphic Processing Unit
dc.subject.other: High Performance Computers
dc.title: Scalability study of Deep Learning algorithms in high performance computer infrastructures
dc.title.alternative: Estudi d'escalabilitat d'algorismes Deep Learning a infraestructures de computació d'altes prestacions
dc.type: Master thesis
dc.subject.lemac: Computació distribuïda
dc.subject.lemac: Aprenentatge automàtic
dc.subject.lemac: Xarxes neuronals (Informàtica)
dc.identifier.slug: 122772
dc.rights.access: Open Access
dc.date.updated: 2017-05-17T04:00:10Z
dc.audience.educationlevel: Màster
dc.audience.mediator: Facultat d'Informàtica de Barcelona
dc.audience.degree: MÀSTER UNIVERSITARI EN ENGINYERIA INFORMÀTICA (Pla 2012)
dc.contributor.covenantee: Barcelona Supercomputing Center


Files in this item


This item appears in the following collection(s)
