VEU - Grup de Tractament de la Parla
http://hdl.handle.net/2117/3746
Sat, 01 Oct 2016 22:26:46 GMT
http://hdl.handle.net/2117/90011
Quantification of survey expectations by means of symbolic regression via genetic programming to estimate economic growth in Central and Eastern European economies
Claveria, Oscar; Monte Moreno, Enrique; Torra Porras, Salvador
Tendency surveys are the main source of agents' expectations. This study has a twofold aim. First, it proposes a new method to quantify survey-based expectations by means of symbolic regression (SR) via genetic programming. Second, it combines the main SR-generated indicators to estimate the evolution of GDP, obtaining the best results for the Czech Republic and Hungary. Finally, it assesses the impact of the 2008 financial crisis, finding that the capacity of agents' expectations to anticipate economic growth in most Central and Eastern European economies improved after the crisis.
Fri, 16 Sep 2016 18:18:38 GMT
http://hdl.handle.net/2117/90005
Modelling cross-dependencies between Spain’s regional tourism markets with an extension of the Gaussian process regression model
Claveria, Oscar; Monte Moreno, Enrique; Torra Porras, Salvador
This study presents an extension of the Gaussian process regression model for multiple-input multiple-output forecasting. This approach allows modelling the cross-dependencies between a given set of input variables and generating a vectorial prediction. Making use of the existing correlations in international tourism demand to all seventeen regions of Spain, the performance of the proposed model is assessed in a multiple-step-ahead forecasting comparison. The results of the experiment in a multivariate setting show that the Gaussian process regression model significantly improves the forecasting accuracy of a multi-layer perceptron neural network used as a benchmark. The results reveal that incorporating the connections between different markets in the modelling process may prove very useful to refine predictions at a regional level.
Fri, 16 Sep 2016 17:51:13 GMT
http://hdl.handle.net/2117/89167
Subband splitting, adaptive scalar prediction and vector quantization for speech coding
Masgrau Gómez, Enrique José; Rodríguez Fonollosa, José Adrián; Mariño Acebal, José Bernardo
This paper describes a new coding structure based on the combination of Vector Quantization, Linear Prediction and Subband Splitting that achieves high-quality speech at rates below 10 kbit/s. In this scheme, a vector is formed with one sample of the normalized prediction error of each band, and a vector quantizer is then applied to it. This quantization of the prediction error allows the use of scalar adaptive predictors while preserving the advantages of vector quantization. The noise shaping necessary for achieving high subjective quality is obtained by using a frequency-weighted distance in the vector quantizer.
Mon, 25 Jul 2016 13:13:55 GMT
http://hdl.handle.net/2117/89082
Wideband-speech APVQ coding from 16 to 32 KBPS
Salavedra Molí, Josep
This paper describes a coding scheme for wideband speech (sampling frequency 16 kHz). We present a wideband speech encoder called APVQ (Adaptive Predictive Vector Quantization). It combines Subband Coding, Vector Quantization and Adaptive Prediction, as represented in Fig. 1. The speech signal is split into 16 subbands by means of a QMF filter bank, so every subband is 500 Hz wide. This APVQ encoder can be seen either as a vectorial extension of a conventional ADPCM encoder or as a scalar Subband AVPC encoder [1],[3]. In this scheme, the signal vector is formed with one sample of the normalized prediction error signal from each subband and is then vector quantized. The prediction error signal is normalized by its gain, and the normalized prediction error is the input to the VQ, so the scheme amounts to an adaptive Gain-Shape VQ. The APVQ encoder thus combines the advantages of Scalar Prediction with those of Vector Quantization. We evaluate wideband speech coding in the range from 1 to 2 bits/sample.
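The figures quoted in this abstract can be checked with a few lines of arithmetic. The sketch below assumes a uniform split of the 0 to fs/2 band and ignores any side information in the bit stream; the function names are illustrative and do not come from the paper.

```python
# Figures stated in the abstract.
SAMPLING_RATE_HZ = 16_000
NUM_SUBBANDS = 16


def subband_width_hz(sampling_rate_hz, num_subbands):
    """Width of each band when the 0..fs/2 range is split uniformly (assumption)."""
    return (sampling_rate_hz / 2) / num_subbands


def bit_rate_bps(bits_per_sample, sampling_rate_hz):
    """Overall bit rate for a given average number of bits per input sample."""
    return bits_per_sample * sampling_rate_hz


print(subband_width_hz(SAMPLING_RATE_HZ, NUM_SUBBANDS))  # 500.0
print(bit_rate_bps(1, SAMPLING_RATE_HZ))                 # 16000
print(bit_rate_bps(2, SAMPLING_RATE_HZ))                 # 32000
```

Under these assumptions, 1 to 2 bits/sample at 16 kHz corresponds to the 16 to 32 kbps range given in the title.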
Fri, 22 Jul 2016 10:40:07 GMT
http://hdl.handle.net/2117/89081
Robust HOS-based techniques applied to speech recognition and enhancement
Salavedra Molí, Josep; Hernando Pericás, Francisco Javier; Masgrau Gómez, Enrique José; Moreno Bilbao, M. Asunción
We study some speech enhancement algorithms based on the iterative Wiener filtering method due to Lim-Oppenheim [2], where the AR spectral estimation of the speech is carried out using a second-order analysis. In our algorithms, however, the AR estimation is performed by means of cumulant analysis. This work extends earlier papers by the authors, in which information from previous speech frames is used to initialize the AR modelling of the current frame. Two parameters are introduced to design the Wiener filter at the first iteration of this iterative algorithm: the Interframe Factor (IF) and the Previous Frame Iteration (PFI). A detailed study shows that they allow substantial noise suppression after only the first iteration of the algorithm, without any appreciable increase in distortion. Finally, the simplest cumulant-based algorithm is applied to Speech Recognition and some preliminary results are presented.
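As a rough illustration of the building block behind this kind of method, the sketch below computes an AR power spectrum and the corresponding frequency-domain Wiener gain. It is a minimal sketch and not the authors' algorithm: how the AR coefficients are estimated (autocorrelation or cumulants), the iteration loop, and the IF and PFI parameters are not shown. The function names and the sign convention A(z) = 1 + sum of a_k z^-k are assumptions made for this example.

```python
import numpy as np


def ar_power_spectrum(ar_coeffs, gain, num_bins):
    """AR spectrum gain^2 / |A(e^jw)|^2 on num_bins frequencies in [0, pi].

    Assumed convention: A(z) = 1 + sum_k a_k z^-k, with ar_coeffs = [a_1, ..., a_p].
    """
    omega = np.linspace(0.0, np.pi, num_bins)
    k = np.arange(1, len(ar_coeffs) + 1)
    # Evaluate A(e^jw) at every frequency bin.
    a_of_omega = 1.0 + np.exp(-1j * np.outer(omega, k)) @ np.asarray(ar_coeffs, dtype=float)
    return gain ** 2 / np.abs(a_of_omega) ** 2


def wiener_gain(speech_psd, noise_psd):
    """Non-causal Wiener gain H = Ps / (Ps + Pn), evaluated per frequency bin."""
    speech_psd = np.asarray(speech_psd, dtype=float)
    noise_psd = np.asarray(noise_psd, dtype=float)
    return speech_psd / (speech_psd + noise_psd)


# Hypothetical first-order speech model and flat noise spectrum.
speech_psd = ar_power_spectrum([-0.5], gain=1.0, num_bins=3)
print(wiener_gain(speech_psd, noise_psd=np.ones(3)))
```

In an iterative scheme, the enhanced frame obtained with this gain would be used to re-estimate the AR model, and the gain would then be recomputed.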
Fri, 22 Jul 2016 10:39:03 GMT
http://hdl.handle.net/2117/89080
A speech enhancement system using higher order AR estimation in real environments
Salavedra Molí, Josep; Masgrau Gómez, Enrique José; Moreno Bilbao, M. Asunción
We study some speech enhancement algorithms based on the iterative Wiener filtering method due to Lim-Oppenheim [2], where the AR spectral estimation of the speech is carried out using a second-order analysis. In our algorithms, however, the AR estimation is performed by means of cumulant analysis. This work extends earlier papers by the authors, providing a comparison between the behavior of the cumulant algorithms and that of the classical autocorrelation algorithm. Results are presented for AWGN, which allows the greatest improvement, and for the noises (diesel engine and reactor noise) that lead to the smallest. An exhaustive empirical test shows that the cumulant algorithms outperform the original autocorrelation algorithm, especially at low SNR.
Fri, 22 Jul 2016 10:37:51 GMT
http://hdl.handle.net/2117/89030
Multiple multilabeling applied to HMM-based noisy speech recognition
Hernando Pericás, Francisco Javier; Mariño Acebal, José Bernardo; Moreno Bilbao, M. Asunción; Nadeu Camprubí, Climent
The performance of existing speech recognition systems degrades rapidly in the presence of background noise when training and testing cannot be done under the same ambient conditions. This paper proposes the application of a simple multilabeling method, instead of standard vector quantization (so-called labeling), as the front end of a speech recognizer based on the Vector Quantization (VQ) and Hidden Markov Models (HMM) approaches, in order to increase its robustness to noise. Furthermore, not only the cepstrum but also other features, such as energy and dynamic parameters, are evaluated and quantized independently in the multilabeling stage to represent the characteristics of speech more accurately. The result of this process is a multiple multilabeling. Experimental results in the presence of additive white noise clearly demonstrate its good performance in isolated word recognition in noisy environments.
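The difference between labeling and multilabeling can be sketched as follows. This is a minimal illustration under stated assumptions, not the method of the paper: the Euclidean distance, the inverse-distance weighting, the number of labels, and all function names are choices made for this example.

```python
import numpy as np


def label(vector, codebook):
    """Standard VQ labeling: index of the single nearest codeword."""
    distances = np.linalg.norm(np.asarray(codebook) - np.asarray(vector), axis=1)
    return int(np.argmin(distances))


def multilabel(vector, codebook, num_labels=2, eps=1e-12):
    """Keep the num_labels nearest codewords, each with a normalized weight.

    Inverse-distance weighting is an assumption made for this sketch.
    """
    distances = np.linalg.norm(np.asarray(codebook) - np.asarray(vector), axis=1)
    nearest = np.argsort(distances)[:num_labels]
    weights = 1.0 / (distances[nearest] + eps)
    return nearest, weights / weights.sum()


def multiple_multilabel(features, codebooks, num_labels=2):
    """Multilabel each feature stream independently with its own codebook."""
    return {name: multilabel(features[name], codebooks[name], num_labels)
            for name in features}


# Hypothetical two-dimensional codebook and input vector.
codebook = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 3.0]])
print(label([0.25, 0.0], codebook))
print(multilabel([0.25, 0.0], codebook))
```

Keeping several weighted labels per frame means that a small noise-induced shift of the feature vector changes the weights gradually instead of flipping a single hard label.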
Thu, 21 Jul 2016 10:45:41 GMT
http://hdl.handle.net/2117/89012
Speaker verification on the POLYCOST database using frequency filtered spectral energies
Hernando Pericás, Francisco Javier; Nadeu Camprubí, Climent
The spectral parameters that result from filtering the frequency sequence of log mel-scaled filter-bank energies with a first- or second-order FIR filter have proved to be competitive for speech recognition. Recently, the authors have shown that this frequency filtering can approximately equalize the cepstrum variance, enhancing the oscillations of the spectral envelope curve that are most effective for discrimination between speakers. Even better speaker identification results than with mel-cepstrum were observed on the TIMIT database, especially when white noise was added. In this paper, the hybridization of linear prediction and filter-bank spectral analysis, using either the cepstral transformation or the alternative frequency filtering, is explored for speaker verification. This combination, which had been shown to outperform the conventional techniques in clean and noisy word recognition, has yielded good text-dependent speaker verification results on the new speaker-oriented telephone-line POLYCOST database.
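Frequency filtering, as described here, applies a short FIR filter along the band index of each frame's log filter-bank energies. The sketch below is illustrative only: the abstract does not give the filter coefficients, so the taps 1 - z^-1 (first order) and z - z^-1 (second order), the zero padding at the band edges, and the function name are all assumptions.

```python
import numpy as np


def frequency_filter(log_energies, order=2):
    """Filter log filter-bank energies along the band (frequency) axis.

    log_energies: array of shape (num_frames, num_bands).
    order=1 applies x[k] - x[k-1]; order=2 applies x[k+1] - x[k-1].
    Both filters and the zero padding at the edges are assumed, not from the source.
    """
    x = np.asarray(log_energies, dtype=float)
    # Pad one zero band on each side so every band has both neighbours.
    padded = np.pad(x, ((0, 0), (1, 1)))
    if order == 1:
        return padded[:, 1:-1] - padded[:, :-2]
    if order == 2:
        return padded[:, 2:] - padded[:, :-2]
    raise ValueError("order must be 1 or 2")


# One hypothetical frame with four bands.
frame = np.array([[1.0, 2.0, 4.0, 7.0]])
print(frequency_filter(frame, order=1))  # [[1. 1. 2. 3.]]
print(frequency_filter(frame, order=2))  # [[ 2.  3.  5. -4.]]
```

The output has one value per band, so it can be used in place of the cepstral coefficients as the feature vector of each frame.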
Thu, 21 Jul 2016 09:02:41 GMT
http://hdl.handle.net/2117/89011
Reconocimiento del habla en ambientes ruidosos mediante modelos ocultos de Markov discretos [Speech recognition in noisy environments using discrete hidden Markov models]
Hernando Pericás, Francisco Javier; Nadeu Camprubí, Climent
Speech recognition in noisy environments remains an unsolved problem, even in the case of isolated word recognition with small vocabularies. Recently, several techniques have been proposed to alleviate this problem. In particular, the Short-Time Modified Coherence (SMC) parameterization and the Cepstral Projection Distortion (CPD) measure have shown excellent results when tested in a speech recognition system based on Dynamic Time Warping (DTW) using speech contaminated by additive white noise. In this paper, a new technique based on the AR modeling of the one-sided autocorrelation sequence (OSALPC) is presented. A comparative study of these LPC-based techniques in the discrete Hidden Markov Model (DHMM) approach leads to two main conclusions: 1) the slope cepstral window and a relatively high model order are preferable, and 2) the cepstral representation based on autocorrelation modeling achieves excellent results.
Thu, 21 Jul 2016 09:00:38 GMT
http://hdl.handle.net/2117/89010
Comportamiento de la transformación bilineal de frecuencias en reconocimiento de habla ruidosa [Behavior of the bilinear frequency transformation in noisy speech recognition]
Hernando Pericás, Francisco Javier; Nadeu Camprubí, Climent
Thu, 21 Jul 2016 08:57:19 GMT