
dc.contributor.author: Salvador Aguilera, Amaia
dc.contributor.author: Manchon Vizuete, Daniel
dc.contributor.author: Calafell, Andrea
dc.contributor.author: Giró Nieto, Xavier
dc.contributor.author: Zeppelzauer, Matthias
dc.contributor.other: Universitat Politècnica de Catalunya. Departament de Teoria del Senyal i Comunicacions
dc.date.accessioned: 2016-01-25T11:50:58Z
dc.date.available: 2016-01-25T11:50:58Z
dc.date.issued: 2015
dc.identifier.citation: Salvador, A., Manchon, D., Calafell, A., Giro, X., Zeppelzauer, M. Cultural event recognition with visual ConvNets and temporal models. In: Challenge and Workshop on Pose Recovery, Action Recognition, and Cultural Event Recognition. "Proceedings of ChaLearn 2015: Challenge and Workshop on Pose Recovery, Action Recognition, and Cultural Event Recognition". Boston: 2015, p. 36-44.
dc.identifier.uri: http://hdl.handle.net/2117/81947
dc.description.abstract: This paper presents our contribution to the ChaLearn Challenge 2015 on Cultural Event Classification. The challenge in this task is to automatically classify images from 50 different cultural events. Our solution is based on the combination of visual features extracted from convolutional neural networks with temporal information using a hierarchical classifier scheme. We extract visual features from the last three fully connected layers of both CaffeNet (pretrained with ImageNet) and our fine-tuned version for the ChaLearn challenge. We propose a late fusion strategy that trains a separate low-level SVM on each of the extracted neural codes. The class predictions of the low-level SVMs form the input to a higher-level SVM, which gives the final event scores. We achieve our best result by adding a temporal refinement step into our classification scheme, which is applied directly to the output of each low-level SVM. Our approach penalizes high classification scores based on visual features when their time stamp does not match an event-specific temporal distribution learned from the training and validation data well. Our system achieved the second-best result in the ChaLearn Challenge 2015 on Cultural Event Classification with a mean average precision of 0.767 on the test set.
dc.format.extent: 9 p.
dc.language.iso: eng
dc.subject: UPC subject areas::Sound, image and multimedia::Multimedia creation::Digital image
dc.subject.lcsh: Image analysis
dc.subject.lcsh: Neural networks (Computer science)
dc.title: Cultural event recognition with visual ConvNets and temporal models
dc.type: Conference lecture
dc.subject.lemac: Images -- Classification
dc.subject.lemac: Neural networks (Computer science)
dc.contributor.group: Universitat Politècnica de Catalunya. GPI - Grup de Processament d'Imatge i Vídeo
dc.description.peerreviewed: Peer Reviewed
dc.relation.publisherversion: http://www.cv-foundation.org/openaccess/CVPR2015_workshops/menu.py
dc.rights.access: Open Access
local.identifier.drac: 16269394
dc.description.version: Postprint (published version)
local.citation.author: Salvador, A.; Manchon, D.; Calafell, A.; Giro, X.; Zeppelzauer, M.
local.citation.contributor: Challenge and Workshop on Pose Recovery, Action Recognition, and Cultural Event Recognition
local.citation.pubplace: Boston
local.citation.publicationName: Proceedings of ChaLearn 2015: Challenge and Workshop on Pose Recovery, Action Recognition, and Cultural Event Recognition
local.citation.startingPage: 36
local.citation.endingPage: 44

