Detection of overlapped acoustic events using fusion of audio and video modalities
Cite as:
hdl:2117/12429
Document type: Conference proceedings paper
Publication date: 2010
Access conditions: Open access
Unless otherwise indicated, the contents of this work are subject to the Creative Commons license: Attribution-NonCommercial-NoDerivs 3.0 Spain
Abstract
Acoustic event detection (AED) may help to describe acoustic scenes and also contribute to improving the robustness of speech technologies. Even if the number of considered events is not large, detection becomes a difficult task in scenarios where the acoustic events (AEs) are produced rather spontaneously and often overlap in time with speech. In this work, fusion of audio and video information at either the feature or the decision level is performed, and the results are compared for different levels of signal overlap. The best improvement with respect to an audio-only baseline system was obtained using the feature-level fusion technique. Furthermore, a significant recognition rate improvement is observed where the AEs are overlapped with loud speech, mainly because the video modality remains unaffected by the interfering sound.
Citation: Butko, T.; Nadeu, C. Detection of overlapped acoustic events using fusion of audio and video modalities. In: Jornadas en Tecnología del Habla and Iberian SLTech Workshop. "VI Jornadas en Tecnología del Habla and II Iberian SLTech Workshop". 2010, p. 165-168.
Publisher's version: http://fala2010.uvigo.es/images/proceedings/index.html
Files | Size |
---|---|
FALA2010-3.pdf | 212.6 KB |