dc.contributor.author: Butko, Taras
dc.contributor.author: Canton Ferrer, Cristian
dc.contributor.author: Segura, C.
dc.contributor.author: Giró Nieto, Xavier
dc.contributor.author: Nadeu Camprubí, Climent
dc.contributor.author: Hernando Pericás, Francisco Javier
dc.contributor.author: Casas Pla, Josep Ramon
dc.contributor.other: Universitat Politècnica de Catalunya. Departament de Teoria del Senyal i Comunicacions
dc.date.accessioned: 2016-04-07T10:24:50Z
dc.date.available: 2016-04-07T10:24:50Z
dc.date.issued: 2009
dc.identifier.citation: Butko, T., Canton, C., Segura, C., Giro, X., Nadeu, C., Hernando, J., Casas, J. Improving detection of acoustic events using audiovisual data and feature level fusion. In: Annual Conference of the International Speech Communication Association. "ISCA-INST Speech Communication Association". 2009, p. 1147-1150.
dc.identifier.isbn: 978-1-61567-692-7
dc.identifier.uri: http://hdl.handle.net/2117/85340
dc.description.abstract: The detection of the acoustic events (AEs) that are naturally produced in a meeting room may help to describe the human and social activity that takes place in it. When applied to spontaneous recordings, the detection of AEs from only audio information shows a large amount of errors, which are mostly due to temporal overlapping of sounds. In this paper, a system to detect and recognize AEs using both audio and video information is presented. A feature-level fusion strategy is used, and the structure of the HMM-GMM based system considers each class separately and uses a one-against-all strategy for training. Experimental AED results with a new and rather spontaneous dataset are presented which show the advantage of the proposed approach.
dc.format.extent: 4 p.
dc.language.iso: eng
dc.subject: Àrees temàtiques de la UPC::Enginyeria de la telecomunicació::Processament del senyal::Processament de la parla i del senyal acústic
dc.subject.lcsh: Speech processing systems
dc.subject.lcsh: Digital video
dc.subject.other: acoustic event detection
dc.subject.other: multimodality
dc.subject.other: multimodal fusion
dc.subject.other: hidden Markov models
dc.subject.other: acoustic localization
dc.title: Improving detection of acoustic events using audiovisual data and feature level fusion
dc.type: Conference report
dc.subject.lemac: Reconeixement automàtic de la parla
dc.subject.lemac: Vídeo digital
dc.contributor.group: Universitat Politècnica de Catalunya. VEU - Grup de Tractament de la Parla
dc.contributor.group: Universitat Politècnica de Catalunya. GPI - Grup de Processament d'Imatge i Vídeo
dc.relation.publisherversion: http://www.isca-speech.org/archive/interspeech_2009/i09_1147.html
dc.rights.access: Open Access
local.identifier.drac: 2524256
dc.description.version: Postprint (published version)
local.citation.author: Butko, T.; Canton, C.; Segura, C.; Giro, X.; Nadeu, C.; Hernando, J.; Casas, J.
local.citation.contributor: Annual Conference of the International Speech Communication Association
local.citation.publicationName: ISCA-INST Speech Communication Association
local.citation.startingPage: 1147
local.citation.endingPage: 1150


All rights reserved. This work is protected by the corresponding intellectual and industrial property rights. Without prejudice to any existing legal exemptions, reproduction, distribution, public communication or transformation of this work is prohibited without permission of the copyright holder.