dc.contributor: Giró Nieto, Xavier
dc.contributor: McGuinness, Kevin
dc.contributor.author: Arazo Sánchez, Eric
dc.contributor.other: Universitat Politècnica de Catalunya. Departament de Teoria del Senyal i Comunicacions
dc.description.abstract: This thesis introduces an architecture that improves the accuracy of a Convolutional Neural Network (CNN) trained for image classification by using visual saliency predictions computed from the original images. The network had an AlexNet architecture and was trained on 1.2 million images from the ImageNet dataset. Two methodologies were explored to exploit the information in the saliency predictions. The first applied the saliency maps directly to the existing layers of the CNN, which in some cases were already trained for classification and in others were initialized with random weights. The second merged the information from the saliency maps through a new branch, trained at the same time as the initial CNN. To speed up training, the experiments were run on images reduced to 128x128. At this size the proposed model achieves a 12.39% increase in Top-1 accuracy with respect to the original CNN, and additionally reduces the number of parameters needed compared to AlexNet. For the original image size of 227x227, a model that increases Top-1 accuracy by 1.72% is proposed. The methodology that provides the highest improvement in accuracy at the reduced size is the one implemented at the original image size, and its results are compared with those obtained from the network trained only on the original images. All the proposed methodologies are implemented in a network previously trained for classification; additionally, the most successful ones are applied during the training of a network. The results provide information about the best way to add saliency maps to improve accuracy.
dc.publisher: Universitat Politècnica de Catalunya
dc.rights: Distribution of this work is authorized under the Creative Commons license or similar 'Attribution-NonCommercial-NoDerivs'
dc.subject: UPC subject areas::Telecommunications engineering
dc.subject.lcsh: Neural networks (Computer science)
dc.subject.lcsh: Machine learning
dc.subject.other: convolutional neural network
dc.subject.other: deep learning
dc.title: The impact of visual saliency prediction in image classification
dc.type: Master thesis
dc.subject.lemac: Xarxes neuronals (Informàtica)
dc.subject.lemac: Aprenentatge automàtic
dc.rights.access: Open Access
dc.audience.mediator: Escola Tècnica Superior d'Enginyeria de Telecomunicació de Barcelona
dc.contributor.covenantee: Dublin City University

Except where otherwise noted, content on this work is licensed under a Creative Commons license: Attribution-NonCommercial-NoDerivs 3.0 Spain