Monolingual and bilingual Spanish-Catalan speech recognizers developed from SpeechDat databases
Document type: Text in conference proceedings
Access conditions: Open access
Under the SpeechDat specifications, the Spanish member of the SpeechDat consortium has recorded a Catalan database that includes one thousand speakers. This communication describes experimental work carried out using both the Spanish and the Catalan speech material. A speech recognition system was trained for Spanish using a selection of the phonetically balanced utterances from the 4500 SpeechDat training sessions. Utterances containing mispronounced or incomplete words, or intermittent noise, were discarded. A set of 26 allophones was selected to account for the Spanish sounds, and clustered demiphones were used as context-dependent sub-lexical units. Following the same methodology, a recognition system was trained from the Catalan SpeechDat database, with Catalan sounds described by 32 allophones. Additionally, a bilingual recognition system was built to cover both Spanish and Catalan. Clustering techniques were used to determine a suitable set of allophones covering both languages simultaneously, which resulted in 33 allophones. The bilingual training material consisted of the whole Catalan training material plus the Spanish material from the Eastern region of Spain (the region where Catalan is spoken). The performance of the Spanish, Catalan and bilingual systems was assessed under the same framework. The Spanish system performs significantly better than the other systems, owing to its better training. The bilingual system performs on par with the language-specific systems trained on the Eastern Spanish material or on the Catalan SpeechDat corpus.
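The abstract does not say which clustering algorithm, distance measure or acoustic representation was used to derive the shared allophone set. Purely as an illustration of the general idea, the sketch below merges two language-specific inventories by greedy agglomerative clustering until a target inventory size is reached. The function name `merge_inventories`, the `es:`/`ca:` labels and the two-dimensional feature vectors are all hypothetical and are not taken from the paper.

```python
from itertools import combinations
from math import dist


def merge_inventories(units, target_size):
    """Greedy agglomerative clustering with centroid linkage (illustrative only).

    units: dict mapping a label such as "es:a" to a feature vector (tuple of floats).
    Returns a sorted list of clusters, each a sorted list of labels.
    """
    # Start with one cluster per allophone: (labels, centroid).
    clusters = [([label], tuple(vec)) for label, vec in sorted(units.items())]
    while len(clusters) > target_size:
        # Find the pair of clusters whose centroids are closest.
        i, j = min(
            combinations(range(len(clusters)), 2),
            key=lambda p: dist(clusters[p[0]][1], clusters[p[1]][1]),
        )
        (labels_a, cen_a), (labels_b, cen_b) = clusters[i], clusters[j]
        na, nb = len(labels_a), len(labels_b)
        # Size-weighted mean of the two centroids.
        centroid = tuple((na * x + nb * y) / (na + nb) for x, y in zip(cen_a, cen_b))
        merged = (sorted(labels_a + labels_b), centroid)
        clusters = [c for k, c in enumerate(clusters) if k not in (i, j)] + [merged]
    return sorted(labels for labels, _ in clusters)


# Toy, made-up feature vectors: similar sounds in both languages sit close together.
toy_units = {
    "es:a": (0.0, 0.0),
    "ca:a": (0.1, 0.0),
    "es:s": (5.0, 5.0),
    "ca:s": (5.0, 5.2),
    "ca:z": (9.0, 1.0),  # a sound assumed here to have no close Spanish counterpart
}
shared = merge_inventories(toy_units, target_size=3)
print(shared)
```

In this toy setting, units that are acoustically close across the two languages end up sharing one model, while a unit without a close counterpart keeps its own. A real system would compare trained acoustic models rather than hand-picked points.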
Citation: Mariño, J.B., Padrell, J., Moreno, A., Nadeu, C. Monolingual and bilingual Spanish-Catalan speech recognizers developed from SpeechDat databases. In: Workshop on Speech Recognition Based on Very Large Telephone Speech Databases. "XLDB - Very Large Telephone Speech Databases: Proceedings". Athens: C. Draxler, 2000, p. 57-61.