Maximum likelihood weighting of dynamic speech features for CDHMM speech recognition
Document type: Conference report
Publisher: Institute of Electrical and Electronics Engineers (IEEE)
Rights access: Open Access
Dynamic speech features are routinely used in current speech recognition systems in combination with short-term (static) spectral features. Although many existing speech recognition systems do not weight the two kinds of features, some weighting seems advisable in order to increase recognition accuracy. When such weighting is performed, it is either manually tuned or consists simply of compensating the variances. The aim of this paper is to propose a method to automatically estimate an optimum state-dependent stream weighting in a continuous density hidden Markov model (CDHMM) recognition system by means of a maximum-likelihood based training algorithm. Unlike in other works, it is shown that simple constraints on the new weighting parameters make it possible to apply the maximum-likelihood criterion to this problem. Experimental results in speaker-independent digit recognition show a notable increase in recognition accuracy.
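As an illustration of what state-dependent stream weighting means in practice, the sketch below computes a stream-weighted log output probability for one HMM state, using the common multi-stream form log b_j(o) = w_static * log b_static(o_static) + w_dynamic * log b_dynamic(o_dynamic). It is a minimal sketch under stated assumptions, not the paper's algorithm: each stream is modeled with a single diagonal-covariance Gaussian rather than a mixture, the weights are simply given rather than estimated by maximum likelihood, and the function names and the `state` dictionary layout are hypothetical. The abstract does not specify the constraints placed on the weights, so none are enforced here.

```python
import math


def diag_gauss_logpdf(x, mean, var):
    """Log density of a diagonal-covariance Gaussian evaluated at vector x."""
    return sum(
        -0.5 * (math.log(2.0 * math.pi * v) + (xi - m) ** 2 / v)
        for xi, m, v in zip(x, mean, var)
    )


def weighted_state_loglik(static_obs, dynamic_obs, state):
    """Stream-weighted log output probability for one HMM state.

    `state` is a hypothetical dict holding per-stream Gaussian parameters
    and the two state-dependent stream weights.
    """
    ll_static = diag_gauss_logpdf(
        static_obs, state["static_mean"], state["static_var"]
    )
    ll_dynamic = diag_gauss_logpdf(
        dynamic_obs, state["dynamic_mean"], state["dynamic_var"]
    )
    # Each stream's log-likelihood is scaled by its own weight before summing.
    return state["w_static"] * ll_static + state["w_dynamic"] * ll_dynamic


# Illustrative values only (assumed, not taken from the paper).
example_state = {
    "static_mean": [0.0, 0.0],
    "static_var": [1.0, 1.0],
    "dynamic_mean": [0.0],
    "dynamic_var": [1.0],
    "w_static": 0.5,
    "w_dynamic": 0.5,
}
example_loglik = weighted_state_loglik([0.0, 0.0], [0.0], example_state)
```

Setting both weights to 1 recovers the unweighted case, in which the static and dynamic streams contribute equally to the state score.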
Citation: Hernando, J. Maximum likelihood weighting of dynamic speech features for CDHMM speech recognition. In: IEEE International Conference on Acoustics, Speech, and Signal Processing. "ICASSP 1997: 1997 IEEE International Conference on Acoustics, Speech, and Signal Processing: April 21-24, 1997: Munich, Germany". Munich, Bavaria: Institute of Electrical and Electronics Engineers (IEEE), 1997, p. 1267-1270.