  • i-Vector modeling with deep belief networks for multi-session speaker recognition 

    Ghahabi Esfahani, Omid; Hernando Pericás, Francisco Javier (2014)
    Conference report
    Restricted access - publisher's policy
    In this paper we propose an impostor selection method for a Deep Belief Network (DBN) based system which models i-vectors in a multi-session speaker verification task. In the proposed method, instead of choosing a ...
  • Jitter and Shimmer measurements for speaker diarization 

    Zewoudie, Abraham Woubie; Luque, Jordi; Hernando Pericás, Francisco Javier (2014)
    Conference report
    Open Access
    Jitter and shimmer voice quality features have been successfully used to characterize speaker voice traits and detect voice pathologies. Jitter and shimmer measure variations in the fundamental frequency and amplitude ...
  • Linear prediction of the one-sided autocorrelation sequence for noisy speech recognition 

    Hernando Pericás, Francisco Javier; Nadeu Camprubí, Climent (1997-01)
    Article
    Open Access
    The article presents a robust representation of speech based on AR modeling of the causal part of the autocorrelation sequence. In noisy speech recognition, this new representation achieves better results than several other ...
  • Loquax: implementación de un sistema de reconocimiento de locutor en un ordenador personal 

    Jonatan, Lopez; Hernando Pericás, Francisco Javier (1996)
    Conference report
    Open Access
    Systematizing speaker recognition, that is, the ability to identify the owner of a fragment of human voice, has been a goal pursued since the early days of signal processing and framed ...
  • LSTM neural network-based speaker segmentation using acoustic and language modelling 

    India Massana, Miquel Àngel; Rodríguez Fonollosa, José Adrián; Hernando Pericás, Francisco Javier (International Speech Communication Association (ISCA), 2017)
    Conference lecture
    Open Access
    This paper presents a new speaker change detection system based on Long Short-Term Memory (LSTM) neural networks using acoustic data and linguistic content. Language modelling is combined with two different ...
  • Maximum likelihood weighting of dynamic speech features for CDHMM speech recognition 

    Hernando Pericás, Francisco Javier (Institute of Electrical and Electronics Engineers (IEEE), 1997)
    Conference report
    Open Access
    Speech dynamic features are routinely used in current speech recognition systems in combination with short-term (static) spectral features. Although many existing speech recognition systems do not weight both kinds of ...
  • Modelado de la señal en reconocimiento de habla ruidosa 

    Pascual, E; Hernando Pericás, Francisco Javier; Mariño Acebal, José Bernardo; Gustavo, H (1996)
    Conference report
    Open Access
    Conventional speech modelling techniques suffer severe performance degradation in adverse noisy environments, so more robust representations of the speech signal are needed. This paper presents new ...
  • Modelado de la trayectoria de los polos en la secuencia de LPC 

    Freitag, Fèlix; Monte Moreno, Enrique; Hernando Pericás, Francisco Javier (Universidad de Valladolid, 1995)
    Conference report
    Open Access
    An alternative way of representing time variations of the speech spectra is presented. We propose to model the trajectories of the poles of the LPC analysis spectra using exponential functions as an alternative to delta ...
  • Modelling of the analytic spectrum for speech recognition 

    Nadeu Camprubí, Climent; Lleida, E; Hernando Pericás, Francisco Javier (European Speech Communication Association (ESCA), 1989)
    Conference report
    Open Access
    In this paper, a new spectral representation is introduced and applied to speech recognition. Like the widely used LPC autocorrelation technique, it arises from an optimization approach that starts from a set of M+1 ...
  • Multimodal identification and localization of users in a smart environment 

    Salah, Albert Ali; Morros Rubió, Josep Ramon; Luque, Jordi; Segura Perales, Carlos; Hernando Pericás, Francisco Javier; Ambekar, Onkar; Schouten, Ben; Pauwels, Eric (2008-09)
    Article
    Open Access
    Detecting the location and identity of users is a first step in creating context-aware applications for technologically-endowed environments. We propose a system that makes use of motion detection, person tracking, face ...
  • Multiple multilabeling applied to HMM-based noisy speech recognition 

    Hernando Pericás, Francisco Javier; Mariño Acebal, José Bernardo; Moreno Bilbao, M. Asunción; Nadeu Camprubí, Climent (1993)
    Conference lecture
    Open Access
    The performance of existing speech recognition systems degrades rapidly in the presence of background noise when training and testing cannot be done under the same ambient conditions. The aim of this paper is to propose ...
  • Multiple multilabeling to improve HMM-based speech recognition in noise 

    Hernando Pericás, Francisco Javier; Mariño Acebal, José Bernardo; Nadeu Camprubí, Climent (1993)
    Conference report
    Open Access
    The performance of existing speech recognition systems degrades rapidly in the presence of background noise when training and testing cannot be done under the same ambient conditions. The aim of this paper is to propose ...
  • Multiple multilabelling applied to HMM-based noisy speech recognition 

    Hernando Pericás, Francisco Javier; Mariño Acebal, José Bernardo; Moreno Bilbao, M. Asunción; Nadeu Camprubí, Climent (Chinese Institute of Electronics, 1993)
    Conference report
    Open Access
  • New approaches for iris boundary localization 

    Pérez, Dídac; Fernández, Carles; Segura, Carlos; Hernando Pericás, Francisco Javier (Universidad de Las Palmas de Gran Canaria, 2012)
    Conference report
    Open Access
    Iris segmentation is the most determining factor in iris biometrics, which has traditionally assumed rigidly constrained environments. In this work, a novel method that covers the localization of the pupillary and limbic ...
  • On the AR modelling of the one-sided autocorrelation sequence for noisy speech recognition 

    Hernando Pericás, Francisco Javier; Nadeu Camprubí, Climent (University of Alberta, 1992)
    Conference report
    Open Access
    Speech recognition in noisy environments remains an unsolved problem even in the case of isolated word recognition with small vocabularies. Recently, several techniques have been proposed to alleviate this problem. Concretely, ...
  • On the decorrelation of filter-bank energies in speech recognition 

    Nadeu Camprubí, Climent; Hernando Pericás, Francisco Javier; Gorricho, M (1995)
    Conference report
    Open Access
    Cepstral coefficients are widely used in speech recognition. In this paper, we claim that they are not the best way of representing the spectral envelope, at least for some usual speech recognition systems. In fact, cepstrum ...
  • On the improvement of speaker diarization by detecting overlapped speech 

    Hernando Pericás, Francisco Javier (2010)
    Conference lecture
    Open Access
    Simultaneous speech in meeting environments is responsible for a certain share of the errors made by standard speaker diarization systems. We present an overlap detection system for far-field data based on spectral ...
  • On the use of agglomerative and spectral clustering in speaker diarization of meetings 

    Hernando Pericás, Francisco Javier (2012)
    Conference report
    Restricted access - publisher's policy
    In this paper, we present a clustering algorithm for speaker diarization based on spectral clustering. State-of-the-art diarization systems are based on agglomerative hierarchical clustering using Bayesian Information ...
  • On the use of filter bank energies driven from the OSA sequence for noisy speech recognition 

    Hernando Pericás, Francisco Javier (Institute of Acoustics, 2000)
    Conference report
    Open Access
    Representation of speech signal has been shown to be attractive for noisy speech recognition because of both its high recognition performance with respect to the conventional LP in severe conditions of additive broad-band noise ...
  • On the use of the derivative of pole trajectories of the LPC analysis parameter sequence as an alternative to delta parameters 

    Freitag, Fèlix; Monte Moreno, Enrique; Hernando Pericás, Francisco Javier (1995)
    Conference report
    Open Access
    In this paper a new approach for modelling time variations in the speech spectra is presented. We propose to approximate the trajectories of the frequency and amplitude of the poles of the LPC spectra with exponential ...