UPCommons. Global access to UPC knowledge

Acoustic event detection based on feature-level fusion of audio and video modalities

Cite as:
hdl:2117/13630

Butko, Taras
Canton Ferrer, Cristian
Segura Perales, Carlos
Giró Nieto, Xavier
Nadeu Camprubí, Climent
Hernando Pericás, Francisco Javier
Casas Pla, Josep Ramon
Document type: Article
Defense date: 2011-03-15
Publisher: HINDAWI
Rights access: Open Access
All rights reserved. This work is protected by the corresponding intellectual and industrial property rights. Without prejudice to any existing legal exemptions, reproduction, distribution, public communication or transformation of this work are prohibited without permission of the copyright holder.
Abstract
Acoustic event detection (AED) aims at determining the identity of sounds and their temporal position in audio signals. When applied to spontaneously generated acoustic events, AED based only on audio information shows a large number of errors, which are mostly due to temporal overlaps. In fact, temporal overlaps accounted for more than 70% of errors in the real-world interactive seminar recordings used in the CLEAR 2007 evaluations. In this paper, we improve the recognition rate of acoustic events using information from both audio and video modalities. First, the acoustic data are processed to obtain both a set of spectrotemporal features and the 3D localization coordinates of the sound source. Second, a number of features are extracted from video recordings by means of object detection, motion analysis, and multicamera person tracking to represent the visual counterpart of several acoustic events. A feature-level fusion strategy is used, and a parallel structure of binary HMM-based detectors is employed in our work. The experimental results show that information from both the microphone array and video cameras is useful to improve the detection rate of isolated as well as spontaneously generated acoustic events.
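
As a rough illustration of the approach summarized above, the Python sketch below (not taken from the paper; the hmmlearn dependency, feature shapes, and window handling are assumptions) shows feature-level fusion by concatenating per-frame audio and video descriptors, followed by one binary HMM detector per event class that pairs an event model with a background model.

# Illustrative sketch only, not the authors' implementation: feature-level fusion
# plus a parallel bank of binary HMM detectors (event model vs. background model).
import numpy as np
from hmmlearn.hmm import GaussianHMM  # assumed library choice, not specified by the paper

def fuse_features(audio_feats, video_feats):
    # Feature-level fusion: concatenate per-frame audio (T x Da) and video (T x Dv)
    # descriptors into a single (T x (Da + Dv)) observation sequence.
    return np.hstack([audio_feats, video_feats])

class BinaryHMMDetector:
    # One detector per acoustic event class: an HMM trained on segments of the
    # target event and an HMM trained on everything else; detection compares
    # their log-likelihoods on a given segment.
    def __init__(self, n_states=3):
        self.event_hmm = GaussianHMM(n_components=n_states, covariance_type="diag")
        self.background_hmm = GaussianHMM(n_components=n_states, covariance_type="diag")

    def fit(self, event_segments, background_segments):
        # hmmlearn expects all training segments stacked, plus each segment's length.
        self.event_hmm.fit(np.vstack(event_segments),
                           lengths=[len(s) for s in event_segments])
        self.background_hmm.fit(np.vstack(background_segments),
                                lengths=[len(s) for s in background_segments])
        return self

    def detect(self, segment):
        # Decide "event present" when the event model explains the segment better.
        return self.event_hmm.score(segment) > self.background_hmm.score(segment)

Because the detectors run in parallel and independently over analysis windows, several of them can fire on the same window, which is one way a parallel binary structure can cope with temporally overlapping events.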
Description
Research article
Citation: Butko, T. [et al.]. Acoustic event detection based on feature-level fusion of audio and video modalities. "EURASIP Journal on Advances in Signal Processing", 15 March 2011, vol. 2011, p. 1-11.
URI: http://hdl.handle.net/2117/13630
DOI: 10.1155/2011/485738
ISSN: 1687-6172
Publisher version: http://www.hindawi.com/journals/asp/2011/485738/
Collections
  • VEU - Grup de Tractament de la Parla - Articles de revista [172]
  • Departament de Teoria del Senyal i Comunicacions - Articles de revista [2.457]
  • GPI - Grup de Processament d'Imatge i Vídeo - Articles de revista [118]


Files: 485738.pdf (PDF, 2,194Mb)

