Now showing items 18-37 of 133

    • CDHMM speaker recognition by means of frequency filtering of filter-bank energies 

      Hernando Pericás, Francisco Javier; Nadeu Camprubí, Climent (1997)
      Conference proceedings text
      Open access
      Recently, the set of spectral parameters of every speech frame that results from filtering the frequency sequence of mel-scaled filter-bank energies with a simple first-order high-pass FIR filter has proved to be an efficient ...
    • Clustering initialization based on spatial information for speaker diarization of meetings 

      Luque Serrano, Jordi; Segura, C.; Hernando Pericás, Francisco Javier (2008)
      Conference proceedings text
      Access restricted by publisher policy
      This paper proposes an initialization for an agglomerative system applied to speaker diarization in the meeting environment. The initialization is based on a previous clustering of the temporal sequence generated by the ...
    • Comparison of time & frequency filtering and cepstral-time matrix approaches in ASR 

      Macho, D; Nadeu Camprubí, Climent; Jancovic, P; Rozinaj, G; Hernando Pericás, Francisco Javier (1999)
      Conference proceedings text
      Open access
      In current speech recognition systems, speech is represented by a 2-D sequence of parameters that model the temporal evolution of the spectral envelope of speech. Linear transformation or filtering along both time and ...
    • Comportamiento de la transformacion bilineal de frecuencias en reconocimiento de habla ruidosa 

      Hernando Pericás, Francisco Javier; Nadeu Camprubí, Climent; Riu, D. (AERFAI, 1992)
      Conference proceedings text
      Open access
    • Comportamiento de la transformación bilineal de frecuencias en reconocimiento de habla ruidosa 

      Hernando Pericás, Francisco Javier; Nadeu Camprubí, Climent (1992)
      Conference proceedings text
      Open access
    • Corpus selection 

      Adda, Gilles; Barras, Claude; Kernal Ekenel, Hazim; Morros Rubió, Josep Ramon; Hernando Pericás, Francisco Javier (2013-03-31)
      Research report
      Open access
      Deliverable of the project Collaborative Annotation of multi-MOdal, MultI-Lingual and multi-mEdia documents. This document describes the different corpora that will be used during the Camomile project.
    • Deep belief networks for i-vector based speaker recognition 

      Ghahabi Esfahani, Omid; Hernando Pericás, Francisco Javier (Institute of Electrical and Electronics Engineers (IEEE), 2014)
      Conference proceedings text
      Access restricted by publisher policy
      The use of Deep Belief Networks (DBNs) is proposed in this paper to model discriminatively target and impostor i-vectors in a speaker verification task. The authors propose to adapt the network parameters of each speaker ...
    • Deep learning backend for single and multisession i-vector speaker recognition 

      Ghahabi Esfahani, Omid; Hernando Pericás, Francisco Javier (2017-04-01)
      Article
      Open access
      The lack of labeled background data makes a big performance gap between cosine and Probabilistic Linear Discriminant Analysis (PLDA) scoring baseline techniques for i-vectors in speaker recognition. Although there are some ...
    • Deep neural networks for i-vector language identification of short utterances in cars 

      Ghahabi Esfahani, Omid; Bonafonte Cávez, Antonio; Hernando Pericás, Francisco Javier; Moreno Bilbao, M. Asunción (International Speech Communication Association (ISCA), 2016)
      Conference proceedings text
      Access restricted by publisher policy
      This paper is focused on the application of the Language Identification (LID) technology for intelligent vehicles. We cope with short sentences or words spoken in moving cars in four languages: English, Spanish, German, ...
    • Detection and handling of overlapping speech for speaker diarization 

      Zelenak, Martin; Hernando Pericás, Francisco Javier (2012)
      Conference proceedings text
      Open access
      This thesis concerns the detection of overlapping speech segments and its further application for the improvement of speaker diarization performance. We propose the use of three spatial cross-correlation-based parameters ...
    • Discriminación robusta de locutores 

      Hernando Pericás, Francisco Javier (1996)
      Conference proceedings text
      Open access
    • Discriminative weighting of dynamic features in continuous-density hidden Markov models for word recognition

      Hernando Pericás, Francisco Javier (1995)
      Conference proceedings text
      Open access
      Speech dynamic features, which provide smoothed estimates of the derivatives of the spectral parameter trajectories in the current frame, are routinely used in current speech recognition systems in combination with short-term ...
    • DNN speaker embeddings using autoencoder pre-training 

      Khan, Umair; Hernando Pericás, Francisco Javier (Institute of Electrical and Electronics Engineers (IEEE), 2019)
      Conference paper
      Access restricted by publisher policy
      Over the last years, i-vectors have been the state-of-the-art approach in speaker recognition. Recent improvements in deep learning have increased the discriminative quality of i-vectors. However, deep learning architectures ...
    • Double multi-head attention for speaker verification 

      India Massana, Miquel Àngel; Safari, Pooyan; Hernando Pericás, Francisco Javier (Institute of Electrical and Electronics Engineers (IEEE), 2021)
      Conference proceedings text
      Open access
      Most state-of-the-art Deep Learning systems for text-independent speaker verification are based on speaker embedding extractors. These architectures are commonly composed of a feature extractor front-end together with a ...
    • Dynamic time warping applied to detection of confusable word pairs in automatic speech recognition 

      Anguita Ortega, Jan; Hernando Pericás, Francisco Javier (Escola Tècnica Superior d'Enginyers de Telecomunicació de Barcelona, 2005)
      Article
      Open access
      In this paper we present a method to predict if two words are likely to be confused by an Automatic Speech Recognition (ASR) system. This method is based on the classical Dynamic Time Warping (DTW) technique. This ...
    • End-to-end transparent user identification using touchscreen biometrics 

      Krzeminski, Michal; Hernando Pericás, Francisco Javier (Universidad de Málaga, 2020)
      Conference proceedings text
      Open access
      We study the touchscreen data as behavioral biometrics. The goal was to create an end-to-end system that can transparently identify users using raw data from mobile devices. The touchscreen biometrics was researched only ...
    • Esquema unificado de parametrización de la señal de voz en reconocimiento del habla 

      Hernando Pericás, Francisco Javier; Nadeu Camprubí, Climent; Vallverdú Bayés, Sisco (Universidad de Valladolid, 1995)
      Conference proceedings text
      Open access
      A correct choice of voice signal modeling methods is essential to obtain good results in automatic speech recognition. In this paper, we have proposed a unified view of the speech parametrization stage, in which conventional ...
    • Estudio comparativo y nuevas propuestas de tecnicas de parametrizacion de la señal de voz para el reconocimiento del habla 

      Clot, J; Hernando Pericás, Francisco Javier; Nadeu Camprubí, Climent (1994)
      Conference proceedings text
      Open access
      A correct choice of voice signal modeling method is essential to obtain good results in automatic speech recognition. In this paper, a comparative study between two speech signal models, Linear Prediction Coefficients and ...
    • Final exam

      Oliveras Vergés, Albert; Hernando Pericás, Francisco Javier (Universitat Politècnica de Catalunya, 2013-01-17)
      Exam
      Access restricted to the UPC community
    • Final exam

      Oliveras Vergés, Albert; Mariño Acebal, José Bernardo; Hernando Pericás, Francisco Javier; Villares Piera, Nemesio Javier (Universitat Politècnica de Catalunya, 2014-06-10)
      Exam
      Access restricted to the UPC community