
dc.contributor.author Xu, Zhengyu
dc.contributor.author Vilaplana Besler, Verónica
dc.contributor.author Morros Rubió, Josep Ramon
dc.contributor.other Universitat Politècnica de Catalunya. Departament de Teoria del Senyal i Comunicacions
dc.date.accessioned 2019-02-04T07:48:06Z
dc.date.available 2019-02-04T07:48:06Z
dc.date.issued 2018
dc.identifier.citation Xu, Z.; Vilaplana, V.; Morros, J.R. Action tube extraction based 3D-CNN for RGB-D action recognition. In: International Workshop on Content-Based Multimedia Indexing. "16th International Conference on Content-Based Multimedia Indexing: 4-6 September, 2018 La Rochelle, France". Institute of Electrical and Electronics Engineers (IEEE), 2018, p. 1-6.
dc.identifier.isbn 978-1-5386-7021-7
dc.identifier.uri http://hdl.handle.net/2117/128191
dc.description.abstract In this paper we propose a novel action tube extractor for RGB-D action recognition in trimmed videos. The action tube extractor takes as input a video and outputs an action tube. The method consists of two parts: spatial tube extraction and temporal sampling. The first part is built upon MobileNet-SSD and its role is to define the spatial region where the action takes place. The second part is based on the structural similarity index (SSIM) and is designed to remove frames without obvious motion from the primary action tube. The final extracted action tube has two benefits: 1) a higher ratio of ROI (subjects of action) to background; 2) most frames contain obvious motion change. We propose to use a two-stream (RGB and Depth) I3D architecture as our 3D-CNN model. Our approach outperforms the state-of-the-art methods on the OA and NTU RGB-D datasets. © 2018 IEEE.
dc.format.extent 6 p.
dc.language.iso eng
dc.publisher Institute of Electrical and Electronics Engineers (IEEE)
dc.subject Àrees temàtiques de la UPC::Enginyeria de la telecomunicació::Processament del senyal::Processament de la imatge i del senyal vídeo
dc.subject Àrees temàtiques de la UPC::So, imatge i multimèdia::Creació multimèdia::Vídeo digital
dc.subject.lcsh Digital video
dc.subject.lcsh 3-D video (Three-dimensional imaging)
dc.subject.lcsh Image processing--Digital techniques
dc.subject.other 3D-CNN
dc.subject.other action recognition
dc.subject.other action tube extraction
dc.subject.other indexing (of information)
dc.subject.other CNN models
dc.subject.other spatial regions
dc.subject.other state-of-the-art methods
dc.subject.other structural similarity indices (SSIM)
dc.subject.other temporal sampling
dc.subject.other two-stream
dc.subject.other tubes (components)
dc.title Action tube extraction based 3D-CNN for RGB-D action recognition
dc.type Conference report
dc.subject.lemac Vídeo digital
dc.subject.lemac Visualització tridimensional (Informàtica)
dc.subject.lemac Imatges -- Processament -- Tècniques digitals
dc.contributor.group Universitat Politècnica de Catalunya. GPI - Grup de Processament d'Imatge i Vídeo
dc.identifier.doi 10.1109/CBMI.2018.8516450
dc.description.peerreviewed Peer Reviewed
dc.rights.access Open Access
local.identifier.drac 23551422
dc.description.version Postprint (published version)
local.citation.author Xu, Z.; Vilaplana, V.; Morros, J.R.
local.citation.contributor International Workshop on Content-Based Multimedia Indexing
local.citation.publicationName 16th International Conference on Content-Based Multimedia Indexing: 4-6 September, 2018 La Rochelle, France
local.citation.startingPage 1
local.citation.endingPage 6
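
The abstract describes a temporal sampling step that uses the structural similarity index (SSIM) to drop frames without obvious motion. The record gives no implementation details, so the following is only a minimal sketch of the general idea and not the authors' method. It assumes grayscale frames stored as flat lists of pixel values, a simplified single-window SSIM (standard SSIM averages over local windows), comparison against the last kept frame, and a hypothetical threshold of 0.95. The names `global_ssim` and `sample_frames` are illustrative.

```python
def global_ssim(x, y, data_range=255.0):
    """Single-window SSIM between two equally sized grayscale frames.

    Frames are flat sequences of pixel intensities. This is a simplification:
    standard SSIM averages the same statistic over local windows.
    """
    if len(x) != len(y) or len(x) == 0:
        raise ValueError("frames must be non-empty and the same size")
    n = len(x)
    # Stabilising constants from the usual SSIM definition (K1=0.01, K2=0.03).
    c1 = (0.01 * data_range) ** 2
    c2 = (0.03 * data_range) ** 2
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    var_x = sum((a - mean_x) * (a - mean_x) for a in x) / n
    var_y = sum((b - mean_y) * (b - mean_y) for b in y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y)) / n
    numerator = (2 * mean_x * mean_y + c1) * (2 * cov + c2)
    denominator = (mean_x * mean_x + mean_y * mean_y + c1) * (var_x + var_y + c2)
    return numerator / denominator


def sample_frames(frames, threshold=0.95):
    """Return indices of frames that differ enough from the last kept frame.

    The threshold and the "compare with last kept frame" rule are assumptions
    made for this sketch, not values taken from the paper.
    """
    if not frames:
        return []
    kept = [0]  # always keep the first frame as the initial reference
    for i in range(1, len(frames)):
        if global_ssim(frames[kept[-1]], frames[i]) < threshold:
            kept.append(i)  # low similarity suggests visible motion
    return kept


# Toy 2x2 "frames": the second repeats the first, the third changes a lot.
frames = [
    [10, 20, 30, 40],
    [10, 20, 30, 40],
    [200, 10, 90, 5],
]
print(sample_frames(frames))  # [0, 2]
```

In this toy run the repeated frame is discarded because its SSIM with the reference is 1.0, while the third frame is kept because its similarity falls below the threshold.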

