    • Acoustic event detection based on feature-level fusion of audio and video modalities

      Butko, Taras; Canton Ferrer, Cristian; Segura Perales, Carlos; Giró Nieto, Xavier; Nadeu Camprubí, Climent; Hernando Pericás, Francisco Javier; Casas Pla, Josep Ramon (HINDAWI, 2011-03-15)
      Article
      Open access
      Acoustic event detection (AED) aims at determining the identity of sounds and their temporal position in audio signals. When applied to spontaneously generated acoustic events, AED based only on audio information shows a ...
    • Audiovisual head orientation estimation with particle filtering in multisensor scenarios 

      Canton Ferrer, Cristian; Segura Perales, Carlos; Casas Pla, Josep Ramon; Pardàs Feliu, Montse; Hernando Pericás, Francisco Javier (2008-01)
      Article
      Open access
      This article presents a multimodal approach to head pose estimation of individuals in environments equipped with multiple cameras and microphones, such as SmartRooms or automatic video conferencing. Determining the individuals ...
    • Efficient keyword spotting by capturing long-range interactions with temporal lambda networks 

      Tura Vecino, Biel; Escuder Folch, Santiago; Diego, Ferran; Segura Perales, Carlos; Luque Serrano, Jordi (2021)
      Conference lecture
      Open access
      Models based on attention mechanisms have shown unprecedented speech recognition performance. However, they are computationally expensive and unnecessarily complex for keyword spotting, a task targeted to small-footprint ...
    • Multimodal identification and localization of users in a smart environment 

      Salah, Albert Ali; Morros Rubió, Josep Ramon; Luque, Jordi; Segura Perales, Carlos; Hernando Pericás, Francisco Javier; Ambekar, Onkar; Schouten, Ben; Pauwels, Eric (2008-09)
      Article
      Open access
      Detecting the location and identity of users is a first step in creating context-aware applications for technologically-endowed environments. We propose a system that makes use of motion detection, person tracking, face ...
    • Overlap detection for speaker diarization by fusing spectral and spatial features 

      Zelenak, Martin; Segura Perales, Carlos; Hernando Pericás, Francisco Javier (2010)
      Conference lecture
      Restricted access due to publisher's policy
      A substantial portion of errors of the conventional speaker diarization systems on meeting data can be accounted to overlapped speech. This paper proposes the use of several spatial features to improve speech overlap ...
    • Simultaneous speech detection with spatial features for speaker diarization 

      Zelenak, Martin; Segura Perales, Carlos; Luque, Jordi; Hernando Pericás, Francisco Javier (2012-02)
      Article
      Restricted access due to publisher's policy
      Simultaneous speech poses a challenging problem for conventional speaker diarization systems. In meeting data, a substantial amount of missed speech error is due to speaker overlaps, since usually only one speaker label ...
    • Speaker orientation estimation based on hybridation of GCC-PHAT and HLBR 

      Segura Perales, Carlos; Abad, Alberto; Hernando Pericás, Francisco Javier; Nadeu Camprubí, Climent (2008)
      Conference report
      Open access
      This paper presents a novel approach to speaker orientation estimation in a SmartRoom environment equipped with multiple microphones. The ratio between the high and low band energies (HLBR) received at each microphone ...
    • Two-source acoustic event detection and localization: online implementation in a smart-room 

      Butko, Taras; Gonzalez Pla, Fran; Segura Perales, Carlos; Nadeu Camprubí, Climent; Hernando Pericás, Francisco Javier (2011)
      Conference lecture
      Open access
      Real-time processing is a requirement for many practical signal processing applications. In this work we implemented online 2-source acoustic event detection and localization algorithms in a Smart-room, a closed space ...