Show simple item record

dc.contributor.author: España-i-Bonet, Cristina
dc.contributor.author: Rodríguez Fonollosa, José Adrián
dc.contributor.other: Universitat Politècnica de Catalunya. Departament de Teoria del Senyal i Comunicacions
dc.date.accessioned: 2017-04-27T21:35:58Z
dc.date.available: 2017-04-27T21:35:58Z
dc.date.issued: 2016
dc.identifier.citation: España-i-Bonet, C., Fonollosa, J. A. R. Automatic speech recognition with deep neural networks for impaired speech. In: International Conference on Advances in Speech and Language Technologies for Iberian Languages. "Advances in Speech and Language Technologies for Iberian Languages: Third International Conference, IberSPEECH 2016: Lisbon, Portugal, November 23-25, 2016: proceedings". Lisbon: Springer, 2016, p. 97-107.
dc.identifier.isbn: 978-3-319-49169-1
dc.identifier.uri: http://hdl.handle.net/2117/103823
dc.description: The final publication is available at https://link.springer.com/chapter/10.1007%2F978-3-319-49169-1_10
dc.description.abstract: Automatic speech recognition has reached almost human performance in some controlled scenarios. However, recognition of impaired speech is a difficult task for two main reasons: data are (i) scarce and (ii) heterogeneous. In this work we train different architectures on a database of dysarthric speech. A comparison between architectures shows that, even with a small database, hybrid DNN-HMM models outperform classical GMM-HMM models in terms of word error rate. A DNN reduces the word error rate by 13% for subjects with dysarthria with respect to the best classical architecture. This improvement is larger than that obtained with other deep neural networks such as CNNs, TDNNs and LSTMs. All experiments were carried out with the Kaldi speech recognition toolkit, for which we adapted several recipes to deal with dysarthric speech and to work on the TORGO database. These recipes are publicly available.
dc.format.extent: 11 p.
dc.language.iso: eng
dc.publisher: Springer
dc.subject: Àrees temàtiques de la UPC::Informàtica::Intel·ligència artificial::Llenguatge natural
dc.subject.lcsh: Automatic Speech Recognition
dc.subject.other: Database systems
dc.subject.other: Network architecture
dc.subject.other: Neural networks – Speech
dc.subject.other: Automatic speech recognition
dc.subject.other: Deep learning
dc.subject.other: Deep neural networks
dc.subject.other: Dysarthria
dc.subject.other: Human performance
dc.subject.other: Kaldi
dc.subject.other: Speaker adaptation
dc.subject.other: Word error rate
dc.title: Automatic speech recognition with deep neural networks for impaired speech
dc.type: Conference report
dc.subject.lemac: Reconeixement automàtic de la parla
dc.contributor.group: Universitat Politècnica de Catalunya. VEU - Grup de Tractament de la Parla
dc.identifier.doi: 10.1007/978-3-319-49169-1_10
dc.description.peerreviewed: Peer Reviewed
dc.relation.publisherversion: https://link.springer.com/chapter/10.1007%2F978-3-319-49169-1_10
dc.rights.access: Open Access
local.identifier.drac: 19355194
dc.description.version: Postprint (author's final draft)
dc.relation.projectid: info:eu-repo/grantAgreement/MINECO//IPT-2012-0914-300000/ES/Programa segunda voz/
dc.relation.projectid: info:eu-repo/grantAgreement/MINECO//TEC2015-69266-P/ES/TECNOLOGIAS DE APRENDIZAJE PROFUNDO APLICADAS AL PROCESADO DE VOZ Y AUDIO/
local.citation.author: España-i-Bonet, C.; Fonollosa, José A. R.
local.citation.contributor: International Conference on Advances in Speech and Language Technologies for Iberian Languages
local.citation.pubplace: Lisbon
local.citation.publicationName: Advances in Speech and Language Technologies for Iberian Languages: Third International Conference, IberSPEECH 2016: Lisbon, Portugal, November 23-25, 2016: proceedings
local.citation.startingPage: 97
local.citation.endingPage: 107
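The abstract above compares architectures by word error rate (WER), the standard ASR metric: the word-level edit distance between the reference transcript and the hypothesis, normalised by the reference length. A minimal illustrative sketch (not the paper's or Kaldi's implementation) could look like:

```python
def wer(reference: str, hypothesis: str) -> float:
    """Word error rate: word-level Levenshtein distance between the
    reference and the hypothesis, divided by the reference length."""
    r, h = reference.split(), hypothesis.split()
    # dp[i][j] = edit distance between r[:i] and h[:j]
    dp = [[0] * (len(h) + 1) for _ in range(len(r) + 1)]
    for i in range(len(r) + 1):
        dp[i][0] = i          # i deletions
    for j in range(len(h) + 1):
        dp[0][j] = j          # j insertions
    for i in range(1, len(r) + 1):
        for j in range(1, len(h) + 1):
            substitution = dp[i - 1][j - 1] + (r[i - 1] != h[j - 1])
            deletion = dp[i - 1][j] + 1
            insertion = dp[i][j - 1] + 1
            dp[i][j] = min(substitution, deletion, insertion)
    return dp[len(r)][len(h)] / len(r)
```

For example, `wer("a b c d", "a x c d")` gives 0.25 (one substitution out of four reference words); the 13% relative reduction cited in the abstract would compare such scores between the DNN-HMM and GMM-HMM systems.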


Files in this item


This item appears in the following collection(s)
