Browse by subject "Audio segmentation"

A hierarchical architecture with feature selection for audio segmentation in a broadcast news domain

Butko, Taras; Nadeu Camprubí, Climent (2010)
Conference paper
Open access

This work presents a hierarchical HMM-based audio segmentation system with feature selection designed for the Albayzin 2010 Evaluations. We propose an architecture that combines the outputs of individual binary detectors ...

Albayzin-2010 audio segmentation evaluation: evaluation setup and results

Butko, Taras; Nadeu Camprubí, Climent; Schulz, Henrik (2010)
Conference paper
Open access

In this paper, we present the audio segmentation task from the Albayzín-2010 evaluation, and the results obtained by the eight participants from Spanish and Portuguese universities. The evaluation task consisted of the ...

Audio segmentation of broadcast news in the Albayzin-2010 evaluation: overview, results, and discussion

Butko, Taras; Nadeu Camprubí, Climent (HINDAWI, 2011-06-17)
Article
Open access

Recently, audio segmentation has attracted research interest because of its usefulness in several applications like audio indexing and retrieval, subtitling, monitoring of acoustic scenes, etc. Moreover, a previous audio ...

Feature selection for multimodal acoustic event detection

Butko, Taras (Universitat Politècnica de Catalunya, 2011-07-08)
Thesis
Open access

The detection of the Acoustic Events (AEs) naturally produced in a meeting room may help to describe the human and social activity. The automatic description of interactions between humans and environment can be useful for ...

SHAS: approaching optimal segmentation for end-to-end speech translation

Tsiamas, Ioannis; Gallego Olsina, Gerard Ion; Fonollosa, José A. R.; Ruiz Costa-Jussà, Marta (2022-02)
Research report
Open access

Speech translation models are unable to directly process long audios, like TED talks, which have to be split into shorter segments. Speech translation datasets provide manual segmentations of the audios, which are not ...

SHAS: approaching optimal segmentation for end-to-end speech translation

Tsiamas, Ioannis; Gallego Olsina, Gerard Ion; Fonollosa, José A. R.; Ruiz Costa-Jussà, Marta (International Speech Communication Association (ISCA), 2022)
Conference paper
Open access

Speech translation models are unable to directly process long audios, like TED talks, which have to be split into shorter segments. Speech translation datasets provide manual segmentations of the audios, which are not ...