Browsing by Author "Hernando Pericás, Francisco Javier"
3D joint speaker position and orientation tracking with particle filters
Segura, Carlos; Hernando Pericás, Francisco Javier (2014-01-29)
Article
Open Access
This paper addresses the problem of three-dimensional speaker orientation estimation in a smart-room environment equipped with microphone arrays. A Bayesian approach is proposed to jointly track the location and orientation ...
A comparative study of parameters and distances for noisy speech recognition
Hernando Pericás, Francisco Javier; Nadeu Camprubí, Climent (1991)
Conference report
Open Access
A comparative study of techniques for HMM-based noisy speech recognition in noisy car environment
Hernando Pericás, Francisco Javier; Nadeu Camprubí, Climent; Mariño Acebal, José Bernardo (Springer, 1993)
Conference report
Open Access
The performance of existing speech recognition systems degrades rapidly in the presence of background noise when training and testing cannot be done under the same ambient conditions. The aim of this paper is to report the ...
A conversation analysis framework using speech recognition and naïve bayes classification for construction process monitoring
Zhang, T.; Lee, Y. C.; Zhu, Y.; Hernando Pericás, Francisco Javier (American Society of Civil Engineers (ASCE), 2018)
Conference report
Restricted access - publisher's policy
At a dynamic construction site, conversations convey vital information including construction activities, operation status, and task performance. Even though because of information security, recording the entire conversations ...
A deep analysis on age estimation
Huerta Casado, Iván; Fernandez Tena, Carles; Segura, Carlos; Hernando Pericás, Francisco Javier; Prati, Andrea (2015-12-15)
Article
Open Access
The automatic estimation of age from face images is increasingly gaining attention, as it facilitates applications including advanced video surveillance, demographic statistics collection, customer profiling, or search ...
A novel method for low-constrained iris boundary localization
Fernández, Carles; Pérez, Dídac; Segura, Carlos; Hernando Pericás, Francisco Javier (2012)
Conference lecture
Open Access
Iris recognition systems are strongly dependent on their segmentation processes, which have traditionally assumed rigid experimental constraints to achieve good performance, but now move towards less constrained environments. ...
A Unified Parameterization Scheme for Noisy Speech Recognition
Hernando Pericás, Francisco Javier; Nadeu Camprubí, Climent (ESCA, 1997)
Conference report
Open Access
LP-based and mel-cepstrum coefficients are by far the most prevalent parameterization techniques in speech recognition. The conventional LP technique is known to be very sensitive to the presence of additive noise and there ...
Accelerating boosting-based face detection on GPUs
Oro, David; Fernández, Carles; Segura, Carlos; Martorell Bofill, Xavier; Hernando Pericás, Francisco Javier (2012)
Conference report
Restricted access - publisher's policy
The goal of face detection is to determine the presence of faces in arbitrary images, along with their locations and dimensions. As it happens with any graphics workloads, these algorithms benefit from data-level ...
Acoustic event detection based on feature-level fusion of audio and video modalities
Butko, Taras; Canton Ferrer, Cristian; Segura Perales, Carlos; Giró Nieto, Xavier; Nadeu Camprubí, Climent; Hernando Pericás, Francisco Javier; Casas Pla, Josep Ramon (HINDAWI, 2011-03-15)
Article
Open Access
Acoustic event detection (AED) aims at determining the identity of sounds and their temporal position in audio signals. When applied to spontaneously generated acoustic events, AED based only on audio information shows a ...
Actividades en tratamiento de voz del Grupo de Procesado de Señal
Hernando Pericás, Francisco Javier (Escola Tècnica Superior d'Enginyers de Telecomunicació de Barcelona, 1993)
Article
Open Access
Albayzin 2010 Evaluation campaign: speaker diarization
Zelenak, Martin; Schulz, Henrik; Hernando Pericás, Francisco Javier (2010)
Conference lecture
Open Access
In this paper we present the evaluation results for the task of speaker diarization in broadcast news domain as part of the Albayzin 2010 evaluation campaign of language and speech technologies. The evaluation data was ...
AR modeling of the speech autocorrelation to improve noisy speech recognition
Hernando Pericás, Francisco Javier; Nadeu Camprubí, Climent (1992)
Conference report
Open Access
Speech recognition in noisy environments remains an unsolved problem even in the case of isolated word recognition with small vocabularies. Recently, several techniques have been proposed to alleviate this problem. Concretely, ...
Audiovisual event detection towards scene understanding
Canton Ferrer, Cristian; Butko, Taras; Segura, C.; Giró Nieto, Xavier; Nadeu Camprubí, Climent; Hernando Pericás, Francisco Javier; Casas Pla, Josep Ramon (Institute of Electrical and Electronics Engineers (IEEE), 2009)
Conference report
Restricted access - publisher's policy
Acoustic events produced in meeting environments may contain useful information for perceptually aware interfaces and multimodal behavior analysis. In this paper, a system to detect and recognize these events from a ...
Audiovisual head orientation estimation with particle filtering in multisensor scenarios
Canton Ferrer, Cristian; Segura Perales, Carlos; Casas Pla, Josep Ramon; Pardàs Feliu, Montse; Hernando Pericás, Francisco Javier (2008-01)
Article
Open Access
This article presents a multimodal approach to head pose estimation of individuals in environments equipped with multiple cameras and microphones, such as SmartRooms or automatic video conferencing. Determining the individuals ...
Auto-encoding nearest neighbor i-vectors for speaker verification
Khan, Umair; India Massana, Miquel Àngel; Hernando Pericás, Francisco Javier (International Speech Communication Association (ISCA), 2019)
Conference lecture
Open Access
In the last years, i-vectors followed by cosine or PLDA scoring techniques were the state-of-the-art approach in speaker verification. PLDA requires labeled background data, and there exists a significant performance gap ...
Automatic speaker recognition as a measurement of voice imitation and conversion
Farrús Cabeceran, Mireia; Wagner, Michael; Erro Eslava, Daniel; Hernando Pericás, Francisco Javier (2010-01-01)
Article
Open Access
Voices can be deliberately disguised by means of human imitation or voice conversion. The question arises to what extent they can be modified by using either method. In the current paper, a set of speaker identification ...
Bi-Gaussian score equalization in an audio-visual SVM-based person verification system
Ejarque, Pascual; Hernando Pericás, Francisco Javier (2008)
Conference report
Open Access
In multimodal fusion systems a normalization of the features or the scores is needed before the fusion process. In this work, in addition to the conventional methods, histogram equalization, which was recently introduced ...
CDHMM speaker recognition by means of frequency filtering of filter-bank energies
Hernando Pericás, Francisco Javier; Nadeu Camprubí, Climent (1997)
Conference report
Open Access
Recently, the set of spectral parameters of every speech frame that result from filtering the frequency sequence of mel-scaled filter-bank energies with a simple first-order high-pass FIR filter have proved to be an efficient ...
Clustering initialization based on spatial information for speaker diarization of meetings
Luque Serrano, Jordi; Segura, C.; Hernando Pericás, Francisco Javier (2008)
Conference report
Restricted access - publisher's policy
This paper proposes an initialization for an agglomerative system applied to speaker diarization in the meeting environment. The initialization is based on a previous clustering of the temporal sequence generated by the ...
Comparison of time & frequency filtering and cepstral-time matrix approaches in ASR
Macho, D; Nadeu Camprubí, Climent; Jancovic, P; Rozinaj, G; Hernando Pericás, Francisco Javier (1999)
Conference report
Open Access
In current speech recognition systems, speech is represented by a 2-D sequence of parameters that model the temporal evolution of the spectral envelope of speech. Linear transformation or filtering along both time and ...