dc.contributor.author: India Massana, Miquel Àngel
dc.contributor.author: Rodríguez Fonollosa, José Adrián
dc.contributor.author: Hernando Pericás, Francisco Javier
dc.contributor.other: Universitat Politècnica de Catalunya. Departament de Teoria del Senyal i Comunicacions
dc.identifier.citation: India, M., Fonollosa, José A. R., Hernando, J. LSTM neural network-based speaker segmentation using acoustic and language modelling. In: Annual Conference of the International Speech Communication Association. "INTERSPEECH 2017: 20-24 August 2017: Stockholm". Stockholm: International Speech Communication Association (ISCA), 2017, p. 2834-2838.
dc.description.abstract: This paper presents a new speaker change detection system based on Long Short-Term Memory (LSTM) neural networks using acoustic data and linguistic content. Language modelling is combined with two different Joint Factor Analysis (JFA) acoustic approaches: i-vectors and speaker factors. Both of them are compared with a baseline algorithm that uses cosine distance to detect speaker turn changes. LSTM neural networks with both linguistic and acoustic features have been able to produce a robust speaker segmentation. The experimental results show that our proposal clearly outperforms the baseline system.
dc.format.extent: 5 p.
dc.publisher: International Speech Communication Association (ISCA)
dc.subject: UPC subject areas::Telecommunications engineering::Signal processing::Speech and acoustic signal processing
dc.subject.lcsh: Automatic speech recognition
dc.subject.lcsh: Neural networks (Neurobiology)
dc.subject.other: Speaker segmentation
dc.subject.other: Neural language modelling
dc.subject.other: Speaker factors
dc.subject.other: LSTM neural networks
dc.title: LSTM neural network-based speaker segmentation using acoustic and language modelling
dc.type: Conference lecture
dc.subject.lemac: Automatic speech recognition
dc.subject.lemac: Neural networks (Neurobiology)
dc.contributor.group: Universitat Politècnica de Catalunya. VEU - Grup de Tractament de la Parla
dc.description.peerreviewed: Peer Reviewed
dc.rights.access: Open Access
dc.description.version: Postprint (published version)
upcommons.citation.author: India, M.; Fonollosa, José A. R.; Hernando, J.
upcommons.citation.contributor: Annual Conference of the International Speech Communication Association
upcommons.citation.publicationName: INTERSPEECH 2017: 20-24 August 2017: Stockholm

All rights reserved. This work is protected by the corresponding intellectual and industrial property rights. Without prejudice to any existing legal exemptions, reproduction, distribution, public communication or transformation of this work is prohibited without permission of the copyright holder.