Image recognition with Deep Learning techniques and TensorFlow
Carried out at/with: Barcelona Supercomputing Centre (BSC)
Document type: Official Master's Final Project
Access conditions: Open access
Deep neural networks have gained popularity in recent years, obtaining outstanding results in a wide range of applications, most notably in computer vision and natural language processing tasks. Despite this newly found interest, research in neural networks spans many decades, and some of today's most used network architectures were invented many years ago. Nevertheless, the progress made during this period cannot be understood without taking into account the technological advancements in key adjacent domains such as massive data storage and computing systems, more specifically Graphics Processing Units (GPUs). These two components are responsible for the enormous performance gains in neural networks that have made Deep Learning a common term in the Artificial Intelligence and Machine Learning community. These kinds of networks need massive amounts of data to effectively train the millions of parameters they contain, and training can take days or weeks depending on the computer architecture used. The size of newly published datasets keeps growing, and the tendency to create deeper networks that outperform shallower architectures means that, in the medium and long term, the hardware needed to undertake such training processes can only be found in high-performance computing facilities, which host enormous clusters of computers. However, using these machines is not straightforward, as both the framework and the code need to be appropriately tuned to take full advantage of these distributed environments. For this reason, we test TensorFlow, an open-source Deep Learning framework from Google with built-in distributed support, on the GPU cluster MinoTauro at the Barcelona Supercomputing Center (BSC).
We aim to implement a defined workload using the distributed features the framework offers, in order to speed up the training process, acquire knowledge of the inner workings of the framework, and understand the similarities and differences with respect to classic single-node training.
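To make the contrast with single-node training concrete, the sketch below shows how a distributed TensorFlow cluster of the kind described above is typically declared: a set of parameter-server and worker tasks that every participating process must know about, communicated through the TF_CONFIG environment variable. This is an illustrative assumption, not the thesis's actual configuration; the hostnames, ports, and the helper `tf_config_for` are hypothetical, and the snippet uses only the Python standard library.

```python
import json

# Hypothetical cluster layout for between-graph replication with parameter
# servers, one of the schemes distributed TensorFlow supports. The node
# names and ports are placeholders, not MinoTauro's real addresses.
cluster_spec = {
    "ps": ["node01:2222"],                     # parameter server task(s)
    "worker": ["node02:2222", "node03:2222"],  # one worker per GPU node
}

def tf_config_for(task_type, task_index, cluster=cluster_spec):
    """Build the JSON string a TensorFlow process reads from the
    TF_CONFIG environment variable to learn its role in the cluster."""
    return json.dumps({
        "cluster": cluster,
        "task": {"type": task_type, "index": task_index},
    })

# Each process in the cluster would export its own TF_CONFIG, e.g.:
print(tf_config_for("worker", 0))
```

In this scheme, every process runs the same training script; only the `task` entry differs, telling each one whether it stores shared parameters (ps) or computes gradients on its slice of the data (worker).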