Speaker recognition using frequency filtered spectral energies
Document type: Conference proceedings paper
Publisher: FONDAZIONE UGO BORDONI
Access conditions: Open access
The spectral parameters that result from filtering the frequency sequence of log mel-scaled filter-bank energies with a simple first- or second-order FIR filter have proved to be an efficient speech representation in terms of both speech recognition rate and computational load. Recently, the authors have shown that this frequency filtering approximately equalizes the cepstrum variance, enhancing the oscillations of the spectral envelope curve that are most effective for discriminating between speakers. On the TIMIT database it has yielded better speaker identification results than the mel-cepstrum, especially when white noise was added. In addition, the hybridization of linear prediction and filter-bank spectral analysis, using either the cepstral transformation or the alternative frequency filtering, has been explored for speaker verification. The combination of hybrid spectral analysis and frequency filtering, which had been shown to outperform conventional techniques in clean and noisy word recognition, has yielded good text-dependent speaker verification results on the new speaker-oriented telephone-line POLYCOST database.
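As an illustration of the frequency filtering described above, the following is a minimal Python sketch, not the author's implementation. It assumes a second-order FIR filter of the form H(z) = z - z^{-1} applied along the band axis of each frame, with zero padding at both ends of the band sequence. The abstract does not state the filter coefficients, so this choice, the function name `frequency_filter`, and the array layout (frames by bands) are all assumptions made for the example.

```python
import numpy as np


def frequency_filter(log_fbe):
    """Filter log filter-bank energies along the frequency (band) axis.

    log_fbe: array of shape (n_frames, n_bands) holding log mel-scaled
    filter-bank energies (layout assumed for this sketch).

    Assumed filter: H(z) = z - z^{-1}, i.e. for band index k
        out[k] = log_fbe[k + 1] - log_fbe[k - 1]
    with bands outside the valid range treated as zero.
    """
    x = np.atleast_2d(np.asarray(log_fbe, dtype=float))
    # Zero-pad one band on each side of the frequency axis only.
    padded = np.pad(x, ((0, 0), (1, 1)))
    # Next band minus previous band, for every band of every frame.
    return padded[:, 2:] - padded[:, :-2]


# One frame with four (made-up) log energies.
frame = np.array([[1.0, 2.0, 4.0, 7.0]])
filtered = frequency_filter(frame)
print(filtered)  # [[ 2.  3.  5. -4.]]
```

The output has the same number of parameters per frame as the input. No cosine transform is computed, which is consistent with the low computational load mentioned in the abstract.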
Citation: Hernando, J. Speaker recognition using frequency filtered spectral energies. In: COST 250 FINAL WORKSHOP ON SPEAKER RECOGNITION IN TELEPHONY. "COST 250 FINAL WORKSHOP ON SPEAKER RECOGNITION IN TELEPHONY". Roma: FONDAZIONE UGO BORDONI, 1999, p. 72-77.