
dc.contributor.author: Pascual, Santiago
dc.contributor.author: Bonafonte Cávez, Antonio
dc.contributor.other: Universitat Politècnica de Catalunya. Departament de Teoria del Senyal i Comunicacions
dc.date.accessioned: 2018-05-23T16:17:18Z
dc.date.issued: 2016
dc.identifier.citation: Pascual, S., Bonafonte, A. Multi-output RNN-LSTM for multiple speaker speech synthesis and adaptation. In: European Signal Processing Conference. "2016 24th European Signal Processing Conference (EUSIPCO): took place 28 August-2 September 2016 in Budapest, Hungary". Institute of Electrical and Electronics Engineers (IEEE), 2016, p. 2325-2329.
dc.identifier.isbn: 978-1-5090-1891-8
dc.identifier.uri: http://hdl.handle.net/2117/117430
dc.description.abstract: Deep learning has been applied successfully to speech processing. In this paper we propose an architecture for speech synthesis using multiple speakers. Some hidden layers are shared by all the speakers, while there is a specific output layer for each speaker. Objective and perceptual experiments show that this scheme produces much better results than a single-speaker model. Moreover, we also tackle the problem of speaker adaptation by adding a new output branch to the model and successfully training it without modifying the base optimized model. This fine-tuning method achieves better results than training the new speaker from scratch with its own model.
dc.format.extent: 5 p.
dc.language.iso: eng
dc.publisher: Institute of Electrical and Electronics Engineers (IEEE)
dc.subject: UPC subject areas::Telecommunications engineering::Signal processing
dc.subject.lcsh: Signal processing
dc.subject.other: Speech synthesis
dc.subject.other: Learning (artificial intelligence)
dc.subject.other: Recurrent neural nets
dc.subject.other: Speaker recognition
dc.title: Multi-output RNN-LSTM for multiple speaker speech synthesis and adaptation
dc.type: Conference report
dc.subject.lemac: Signal processing
dc.contributor.group: Universitat Politècnica de Catalunya. VEU - Grup de Tractament de la Parla
dc.identifier.doi: 10.1109/EUSIPCO.2016.7760664
dc.description.peerreviewed: Peer Reviewed
dc.relation.publisherversion: http://ieeexplore.ieee.org/document/7760664/
dc.rights.access: Restricted access - publisher's policy
drac.iddocument: 21092811
dc.description.version: Postprint (published version)
dc.date.lift: 10000-01-01
upcommons.citation.author: Pascual, S., Bonafonte, A.
upcommons.citation.contributor: European Signal Processing Conference
upcommons.citation.published: true
upcommons.citation.publicationName: 2016 24th European Signal Processing Conference (EUSIPCO): took place 28 August-2 September 2016 in Budapest, Hungary
upcommons.citation.startingPage: 2325
upcommons.citation.endingPage: 2329



All rights reserved. This work is protected by the corresponding intellectual and industrial property rights. Without prejudice to any existing legal exemptions, reproduction, distribution, public communication or transformation of this work is prohibited without permission of the copyright holder.