
dc.contributor.author: India Massana, Miquel Àngel
dc.contributor.author: Rodríguez Fonollosa, José Adrián
dc.contributor.author: Hernando Pericás, Francisco Javier
dc.contributor.other: Universitat Politècnica de Catalunya. Departament de Teoria del Senyal i Comunicacions
dc.date.accessioned: 2018-01-19T13:10:29Z
dc.date.available: 2018-01-19T13:10:29Z
dc.date.issued: 2017
dc.identifier.citation: India, M., Fonollosa, José A. R., Hernando, J. LSTM neural network-based speaker segmentation using acoustic and language modelling. In: Annual Conference of the International Speech Communication Association. "INTERSPEECH 2017: 20-24 August 2017: Stockholm". Stockholm: International Speech Communication Association (ISCA), 2017, p. 2834-2838.
dc.identifier.isbn: 1990-9772
dc.identifier.uri: http://hdl.handle.net/2117/112988
dc.description.abstract: This paper presents a new speaker change detection system based on Long Short-Term Memory (LSTM) neural networks using acoustic data and linguistic content. Language modelling is combined with two different Joint Factor Analysis (JFA) acoustic approaches: i-vectors and speaker factors. Both of them are compared with a baseline algorithm that uses cosine distance to detect speaker turn changes. LSTM neural networks with both linguistic and acoustic features have been able to produce a robust speaker segmentation. The experimental results show that our proposal clearly outperforms the baseline system.
dc.format.extent: 5 p.
dc.language.iso: eng
dc.publisher: International Speech Communication Association (ISCA)
dc.subject: UPC subject areas::Telecommunications engineering::Signal processing::Speech and acoustic signal processing
dc.subject.lcsh: Automatic speech recognition
dc.subject.lcsh: Neural networks (Neurobiology)
dc.subject.other: Speaker segmentation
dc.subject.other: Neural language modelling
dc.subject.other: I-vectors
dc.subject.other: Speaker factors
dc.subject.other: LSTM neural networks
dc.title: LSTM neural network-based speaker segmentation using acoustic and language modelling
dc.type: Conference lecture
dc.subject.lemac: Reconeixement automàtic de la parla
dc.subject.lemac: Xarxes neuronals (Neurobiologia)
dc.contributor.group: Universitat Politècnica de Catalunya. VEU - Grup de Tractament de la Parla
dc.identifier.doi: 10.21437/Interspeech.2017
dc.description.peerreviewed: Peer Reviewed
dc.relation.publisherversion: http://www.isca-speech.org/archive/Interspeech_2017/pdfs/0407.PDF
dc.rights.access: Open Access
local.identifier.drac: 21716191
dc.description.version: Postprint (published version)
dc.relation.projectid: info:eu-repo/grantAgreement/EC/H2020/115902/EU/Remote Assessment of Disease and Relapse in Central Nervous System Disorders/RADAR-CNS
local.citation.author: India, M.; Fonollosa, José A. R.; Hernando, J.
local.citation.contributor: Annual Conference of the International Speech Communication Association
local.citation.pubplace: Stockholm
local.citation.publicationName: INTERSPEECH 2017: 20-24 August 2017: Stockholm
local.citation.startingPage: 2834
local.citation.endingPage: 2838

