Filtering the time sequences of spectral parameters for speech recognition

Nadeu Camprubí, Climent; Pachès Leal, Pau; Biing-Hwang, Juang

doi:10.1016/S0167-6393(97)00030-7

Document type: Article

Publication date: 1997-09

Access conditions: Restricted access per publisher policy

License: Unless otherwise indicated, the contents of this work are subject to the Creative Commons license Attribution-NonCommercial-NoDerivs 3.0 Spain

Abstract

In automatic speech recognition, the signal is usually represented by a set of time sequences of spectral parameters (TSSPs) that model the temporal evolution of the spectral envelope frame-to-frame. Those sequences are then filtered either to make them more robust to environmental conditions or to compute differential parameters (dynamic features) which enhance discrimination. In this paper, we apply frequency analysis to TSSPs in order to provide an interpretation framework for the various types of parameter filters used so far. Thus, the analysis of the average long-term spectrum of the successfully filtered sequences reveals a combined effect of equalization and band selection that provides insights into TSSP filtering. Also, we show in the paper that, when supplementary differential parameters are not used, the recognition rate can be improved even for clean speech, just by properly filtering the TSSPs. To support this claim, a number of experimental results are presented, both using whole-word and subword based models. The empirically optimum filters attenuate the low-pass band and emphasize a higher band so that the peak of the average long-term spectrum of the output of these filters lies at around the average syllable rate of the employed database (~3 Hz).
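To make the idea of filtering TSSPs along the time axis concrete, the following is a minimal sketch and not the filters evaluated in the paper, which the abstract does not specify. It uses the widely known regression-slope (delta) filter as one example of a differential-parameter filter, and it assumes a frame rate of 100 frames per second. All function names (`delta_kernel`, `filter_tssp`, `modulation_response`, `average_long_term_spectrum`) and the synthetic input are hypothetical, chosen only for illustration.

```python
import numpy as np


def delta_kernel(K=2):
    """Regression-slope (delta) weights w[k] = k / sum(k^2), k = -K..K."""
    k = np.arange(-K, K + 1, dtype=float)
    return k / np.sum(k ** 2)


def filter_tssp(X, weights):
    """Apply y[t] = sum_k weights[k] * x[t + k] to every column of X.

    X has shape (frames, parameters); time runs along axis 0.
    np.convolve flips its kernel, so the weights are reversed first.
    Frames near the edges are computed with implicit zero padding.
    Assumes X has at least as many frames as there are weights.
    """
    kernel = weights[::-1]
    return np.apply_along_axis(
        lambda col: np.convolve(col, kernel, mode="same"), 0, X
    )


def modulation_response(weights, frame_rate=100.0, n_fft=512):
    """Magnitude response of the filter versus modulation frequency (Hz)."""
    freqs = np.fft.rfftfreq(n_fft, d=1.0 / frame_rate)
    return freqs, np.abs(np.fft.rfft(weights, n_fft))


def average_long_term_spectrum(X, frame_rate=100.0):
    """Power spectrum along time, averaged over all parameters."""
    X = X - X.mean(axis=0)  # remove the mean of each sequence
    power = np.abs(np.fft.rfft(X, axis=0)) ** 2
    freqs = np.fft.rfftfreq(X.shape[0], d=1.0 / frame_rate)
    return freqs, power.mean(axis=1)


# Synthetic stand-in for 12 parameter sequences of 400 frames (assumption:
# real input would be, e.g., cepstral coefficients from a front end).
rng = np.random.default_rng(0)
X = rng.standard_normal((400, 12))
Y = filter_tssp(X, delta_kernel(K=2))
freqs, spectrum = average_long_term_spectrum(Y)
```

Because the delta weights sum to zero, this filter removes the constant component of each sequence and emphasizes a band of higher modulation frequencies, which is the kind of combined equalization and band selection the abstract describes. Where the peak falls depends on the filter length and the frame rate chosen.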

Citation: Nadeu, C., Paches, P., Biing-Hwang, J. Filtering the time sequences of spectral parameters for speech recognition. "Speech communication", September 1997, vol. 22, no. 4, p. 315-332.

URI: http://hdl.handle.net/2117/97902

ISSN: 0167-6393

Publisher version: http://www.sciencedirect.com.recursos.biblioteca.upc.edu/science/article/pii/S0167639397000307
