Show simple item record

dc.contributor.author: España-i-Bonet, Cristina
dc.contributor.author: Rodríguez Fonollosa, José Adrián
dc.contributor.other: Universitat Politècnica de Catalunya. Departament de Teoria del Senyal i Comunicacions
dc.date.accessioned: 2017-04-27T21:35:58Z
dc.date.available: 2017-04-27T21:35:58Z
dc.date.issued: 2016
dc.identifier.citation: España-i-Bonet, C., Fonollosa, J. A. R. Automatic speech recognition with deep neural networks for impaired speech. In: International Conference on Advances in Speech and Language Technologies for Iberian Languages. "Advances in Speech and Language Technologies for Iberian Languages: Third International Conference, IberSPEECH 2016: Lisbon, Portugal, November 23-25, 2016: proceedings". Lisbon: Springer, 2016, p. 97-107.
dc.identifier.isbn: 978-3-319-49169-1
dc.identifier.uri: http://hdl.handle.net/2117/103823
dc.description: The final publication is available at https://link.springer.com/chapter/10.1007%2F978-3-319-49169-1_10
dc.description.abstract: Automatic speech recognition has reached almost human performance in some controlled scenarios. However, recognition of impaired speech is a difficult task for two main reasons: data are (i) scarce and (ii) heterogeneous. In this work we train different architectures on a database of dysarthric speech. A comparison between architectures shows that, even with a small database, hybrid DNN-HMM models outperform classical GMM-HMM models in terms of word error rate. A DNN reduces the word error rate by 13% for subjects with dysarthria with respect to the best classical architecture. This improvement is larger than that obtained with other deep neural networks such as CNNs, TDNNs and LSTMs. All experiments were carried out with the Kaldi speech recognition toolkit, for which we adapted several recipes to deal with dysarthric speech and to work on the TORGO database. These recipes are publicly available.
dc.format.extent: 11 p.
dc.language.iso: eng
dc.publisher: Springer
dc.subject: Àrees temàtiques de la UPC::Informàtica::Intel·ligència artificial::Llenguatge natural
dc.subject.lcsh: Automatic Speech Recognition
dc.subject.other: Database systems
dc.subject.other: Network architecture
dc.subject.other: Neural networks – Speech
dc.subject.other: Automatic speech recognition
dc.subject.other: Deep learning
dc.subject.other: Deep neural networks
dc.subject.other: Dysarthria
dc.subject.other: Human performance
dc.subject.other: Kaldi
dc.subject.other: Speaker adaptation
dc.subject.other: Word error rate
dc.title: Automatic speech recognition with deep neural networks for impaired speech
dc.type: Conference report
dc.subject.lemac: Reconeixement automàtic de la parla
dc.contributor.group: Universitat Politècnica de Catalunya. VEU - Grup de Tractament de la Parla
dc.identifier.doi: 10.1007/978-3-319-49169-1_10
dc.description.peerreviewed: Peer Reviewed
dc.relation.publisherversion: https://link.springer.com/chapter/10.1007%2F978-3-319-49169-1_10
dc.rights.access: Open Access
local.identifier.drac: 19355194
dc.description.version: Postprint (author's final draft)
dc.relation.projectid: info:eu-repo/grantAgreement/MINECO//IPT-2012-0914-300000/ES/Programa segunda voz/
dc.relation.projectid: info:eu-repo/grantAgreement/MINECO//TEC2015-69266-P/ES/TECNOLOGIAS DE APRENDIZAJE PROFUNDO APLICADAS AL PROCESADO DE VOZ Y AUDIO/
local.citation.author: España-i-Bonet, C.; Fonollosa, José A. R.
local.citation.contributor: International Conference on Advances in Speech and Language Technologies for Iberian Languages
local.citation.pubplace: Lisbon
local.citation.publicationName: Advances in Speech and Language Technologies for Iberian Languages: Third International Conference, IberSPEECH 2016: Lisbon, Portugal, November 23-25, 2016: proceedings
local.citation.startingPage: 97
local.citation.endingPage: 107
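The abstract above compares architectures by word error rate (WER), the standard ASR metric: the word-level edit distance between the reference transcript and the hypothesis, normalised by the reference length. A minimal illustrative sketch (not the paper's or Kaldi's implementation) could look like:

```python
def wer(reference: str, hypothesis: str) -> float:
    """Word error rate: word-level Levenshtein distance between the
    reference and the hypothesis, divided by the reference length."""
    r, h = reference.split(), hypothesis.split()
    # dp[i][j] = edit distance between r[:i] and h[:j]
    dp = [[0] * (len(h) + 1) for _ in range(len(r) + 1)]
    for i in range(len(r) + 1):
        dp[i][0] = i          # i deletions
    for j in range(len(h) + 1):
        dp[0][j] = j          # j insertions
    for i in range(1, len(r) + 1):
        for j in range(1, len(h) + 1):
            substitution = dp[i - 1][j - 1] + (r[i - 1] != h[j - 1])
            deletion = dp[i - 1][j] + 1
            insertion = dp[i][j - 1] + 1
            dp[i][j] = min(substitution, deletion, insertion)
    return dp[len(r)][len(h)] / len(r)
```

For example, `wer("a b c d", "a x c d")` gives 0.25 (one substitution out of four reference words); the 13% relative reduction cited in the abstract would compare such scores between the DNN-HMM and GMM-HMM systems.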


Files in this item


This item appears in the following collection(s)
