Disentangling motion, foreground and background features in videos

Lin, Xunyu

dc.contributor	Torres Viñals, Jordi
dc.contributor.author	Lin, Xunyu
dc.date.accessioned	2018-01-26T09:00:21Z
dc.date.available	2018-01-26T09:00:21Z
dc.date.issued	2017
dc.identifier.uri	http://hdl.handle.net/2117/113234
dc.description.abstract	This paper introduces an unsupervised framework to extract semantically rich features for video representation. Inspired by how the human visual system groups objects based on motion cues, we propose a deep convolutional neural network that disentangles motion, foreground and background information. The proposed architecture consists of a 3D convolutional feature encoder for blocks of 16 frames, which is trained on reconstruction tasks over the first and last frames of the sequence. The model is trained with a fraction of the videos from the UCF-101 dataset, taking the bounding boxes around the activity regions as ground truth. Qualitative results indicate that the network can successfully update the foreground appearance based on pure-motion features. The benefits of these learned features are shown in a discriminative classification task, where they yield an accuracy gain of more than 10% compared with a random initialization of the network weights.
dc.language.iso	eng
dc.publisher	Universitat Politècnica de Catalunya
dc.subject	Àrees temàtiques de la UPC::Informàtica
dc.subject.lcsh	Artificial intelligence
dc.subject.lcsh	Video recording
dc.subject.other	Unsupervised learning
dc.subject.other	artificial intelligence
dc.subject.other	video features
dc.subject.other	action recognition
dc.title	Disentangling motion, foreground and background features in videos
dc.title.alternative	Unsupervised video representations learning for activity recognition
dc.type	Bachelor thesis
dc.subject.lemac	Intel·ligència artificial
dc.subject.lemac	Vídeo
dc.identifier.slug	128563
dc.rights.access	Open Access
dc.date.updated	2017-06-30T14:11:12Z
dc.audience.educationlevel	Grau
dc.audience.mediator	Facultat d'Informàtica de Barcelona
dc.audience.degree	GRAU EN ENGINYERIA INFORMÀTICA (Pla 2010)

Files in this item

Name: 128563.pdf
Size: 296.7 KB
Format: PDF

This item appears in the following collections

Grau en Enginyeria Informàtica (Pla 2010) [2.482]

UPCommons. Open knowledge portal of the UPC
