• A hierarchical architecture with feature selection for audio segmentation in a broadcast news domain 

      Butko, Taras; Nadeu Camprubí, Climent (2010)
      Conference paper
      Open access
      This work presents a hierarchical HMM-based audio segmentation system with feature selection designed for the Albayzin 2010 Evaluations. We propose an architecture that combines the outputs of individual binary detectors ...
    • Albayzin-2010 audio segmentation evaluation: evaluation setup and results 

      Butko, Taras; Nadeu Camprubí, Climent; Schulz, Henrik (2010)
      Conference paper
      Open access
      In this paper, we present the audio segmentation task from the Albayzín-2010 evaluation, and the results obtained by the eight participants from Spanish and Portuguese universities. The evaluation task consisted of the ...
    • Audio segmentation of broadcast news in the Albayzin-2010 evaluation: overview, results, and discussion 

      Butko, Taras; Nadeu Camprubí, Climent (HINDAWI, 2011-06-17)
      Article
      Open access
      Recently, audio segmentation has attracted research interest because of its usefulness in several applications like audio indexing and retrieval, subtitling, monitoring of acoustic scenes, etc. Moreover, a previous audio ...
    • Feature selection for multimodal acoustic event detection

      Butko, Taras (Universitat Politècnica de Catalunya, 2011-07-08)
      Thesis
      Open access
      The detection of the Acoustic Events (AEs) naturally produced in a meeting room may help to describe the human and social activity. The automatic description of interactions between humans and environment can be useful for ...
    • SHAS: approaching optimal segmentation for end-to-end speech translation 

      Tsiamas, Ioannis; Gallego Olsina, Gerard Ion; Fonollosa, José A. R.; Ruiz Costa-Jussà, Marta (2022-02)
      Research report
      Open access
      Speech translation models are unable to directly process long audios, like TED talks, which have to be split into shorter segments. Speech translation datasets provide manual segmentations of the audios, which are not ...
    • SHAS: approaching optimal segmentation for end-to-end speech translation 

      Tsiamas, Ioannis; Gallego Olsina, Gerard Ion; Fonollosa, José A. R.; Ruiz Costa-Jussà, Marta (International Speech Communication Association (ISCA), 2022)
      Conference paper
      Open access
      Speech translation models are unable to directly process long audios, like TED talks, which have to be split into shorter segments. Speech translation datasets provide manual segmentations of the audios, which are not ...