Show simple item record

dc.contributor.author  Husain, Syed Farzad
dc.contributor.author  Dellen, Babette
dc.contributor.author  Torras, Carme
dc.contributor.other  Institut de Robòtica i Informàtica Industrial
dc.date.accessioned  2017-04-21T13:12:57Z
dc.date.available  2017-04-21T13:12:57Z
dc.date.issued  2016
dc.identifier.citation  Husain, S., Dellen, B., Torras, C. Action recognition based on efficient deep feature learning in the spatio-temporal domain. "IEEE robotics and automation letters", 2016, vol. 1, no. 2, p. 984-991.
dc.identifier.issn  2377-3766
dc.identifier.uri  http://hdl.handle.net/2117/103626
dc.description  © 2016 IEEE. Personal use of this material is permitted. Permission from IEEE must be obtained for all other uses, in any current or future media, including reprinting/republishing this material for advertising or promotional purposes, creating new collective works, for resale or redistribution to servers or lists, or reuse of any copyrighted component of this work in other works.
dc.description.abstract  Hand-crafted feature functions are usually designed based on the domain knowledge of a presumably controlled environment and often fail to generalize, as the statistics of real-world data cannot always be modeled correctly. Data-driven feature learning methods, on the other hand, have emerged as an alternative that often generalizes better in uncontrolled environments. We present a simple, yet robust, 2D convolutional neural network extended to a concatenated 3D network that learns to extract features from the spatio-temporal domain of raw video data. The resulting network model is used for content-based recognition of videos. Relying on a 2D convolutional neural network allows us to exploit a pretrained network as a descriptor that yielded the best results on the large and challenging ILSVRC-2014 dataset. Experimental results on commonly used benchmark video datasets demonstrate that our results are state-of-the-art in terms of accuracy and computational time, without requiring any preprocessing (e.g., optical flow) or a priori knowledge about data capture (e.g., camera motion estimation), which makes our approach more general and flexible than others. Our implementation is made available.
dc.format.extent  8 p.
dc.language.iso  eng
dc.publisher  Institute of Electrical and Electronics Engineers (IEEE)
dc.rights  Attribution-NonCommercial-NoDerivs 3.0 Spain
dc.rights.uri  http://creativecommons.org/licenses/by-nc-nd/3.0/es/
dc.subject  Àrees temàtiques de la UPC::Informàtica::Automàtica i control
dc.subject.other  Computer vision for automation
dc.subject.other  recognition
dc.subject.other  visual learning
dc.subject.other  artificial intelligence
dc.subject.other  computer vision
dc.subject.other  pattern classification
dc.title  Action recognition based on efficient deep feature learning in the spatio-temporal domain
dc.type  Article
dc.contributor.group  Universitat Politècnica de Catalunya. ROBiri - Grup de Robòtica de l'IRI
dc.identifier.doi  10.1109/LRA.2016.2529686
dc.description.peerreviewed  Peer Reviewed
dc.subject.inspec  Classificació INSPEC::Pattern recognition::Computer vision
dc.subject.inspec  Classificació INSPEC::Pattern recognition
dc.relation.publisherversion  http://ieeexplore.ieee.org/document/7406684/
dc.rights.access  Open Access
local.identifier.drac  19160450
dc.description.version  Postprint (author's final draft)
local.citation.author  Husain, S.; Dellen, B.; Torras, C.
local.citation.publicationName  IEEE robotics and automation letters
local.citation.volume  1
local.citation.number  2
local.citation.startingPage  984
local.citation.endingPage  991
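
The abstract above outlines the core idea: per-frame features from a pretrained 2D convolutional network are stacked over time and passed to a 3D convolutional stage that learns spatio-temporal features for video classification. As a reading aid, here is a minimal PyTorch sketch of that idea. It is not the authors' released implementation: the backbone (ResNet-18 is substituted purely for brevity, where the paper exploits a network pretrained on ILSVRC-2014), all layer sizes, and the class name SpatioTemporalNet are illustrative assumptions.

# Minimal sketch (not the authors' code): a pretrained 2D CNN applied
# per frame, whose feature maps are stacked over time and fed to a
# small 3D convolutional head. Backbone choice, layer sizes, and names
# are hypothetical assumptions, not taken from the paper.
import torch
import torch.nn as nn
from torchvision import models


class SpatioTemporalNet(nn.Module):
    def __init__(self, num_classes: int):
        super().__init__()
        # Pretrained 2D backbone used as a frozen per-frame descriptor.
        backbone = models.resnet18(weights=models.ResNet18_Weights.DEFAULT)
        self.features2d = nn.Sequential(*list(backbone.children())[:-2])
        for p in self.features2d.parameters():
            p.requires_grad = False
        # 3D head: convolves jointly over time and the 2D feature maps.
        self.features3d = nn.Sequential(
            nn.Conv3d(512, 256, kernel_size=(3, 3, 3), padding=1),
            nn.ReLU(inplace=True),
            nn.AdaptiveAvgPool3d(1),
        )
        self.classifier = nn.Linear(256, num_classes)

    def forward(self, clip: torch.Tensor) -> torch.Tensor:
        # clip: (batch, time, 3, H, W) raw video frames.
        b, t, c, h, w = clip.shape
        x = self.features2d(clip.view(b * t, c, h, w))          # (b*t, 512, h', w')
        x = x.view(b, t, *x.shape[1:]).permute(0, 2, 1, 3, 4)   # (b, 512, t, h', w')
        x = self.features3d(x).flatten(1)                       # (b, 256)
        return self.classifier(x)


# Usage: classify two 8-frame 224x224 clips into 101 action classes.
model = SpatioTemporalNet(num_classes=101).eval()
with torch.no_grad():
    logits = model(torch.randn(2, 8, 3, 224, 224))
print(logits.shape)  # torch.Size([2, 101])

Reusing the frozen 2D descriptor per frame is what keeps such a design cheap: only the small 3D head needs training, which matches the abstract's emphasis on avoiding preprocessing such as optical flow.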


Files in this item


This item appears in the following collection(s)
