
dc.contributor.author: Jauk, Igor
dc.contributor.author: Bonafonte Cávez, Antonio
dc.contributor.author: López Otero, Paula
dc.contributor.author: Docio Fernández, Laura
dc.contributor.other: Universitat Politècnica de Catalunya. Departament de Teoria del Senyal i Comunicacions
dc.date.accessioned: 2016-01-18T14:34:59Z
dc.date.issued: 2015
dc.identifier.citation: Jauk, I., Bonafonte, A., López-Otero, P., Docio-Fernández, L. Creating expressive synthetic voices by unsupervised clustering of audiobooks. A: Annual Conference of the International Speech Communication Association. "INTERSPEECH 2015: 16th Annual Conference of the International Speech Communication Association: Dresden, Germany: September 6-10, 2015". Dresden: International Speech Communication Association (ISCA), 2015, p. 3380-3384.
dc.identifier.isbn: 1990-9770
dc.identifier.uri: http://hdl.handle.net/2117/81613
dc.description.abstract: In this work we design an approach for automatic feature selection and voice creation for expressive synthesis. Our approach is guided by two main goals: (1) increasing the flexibility of expressive voice creation and (2) overcoming the limitations of speaking styles in expressive synthesis. We define a novel set of features, combining traditionally used prosodic features with spectral features and proposing the use of iVectors. With these features we perform unsupervised clustering of an audiobook excerpt and, from these clusters, we create synthetic voices using the SAT technique. To evaluate the clustering performance we propose an objective evaluation technique for the unsupervised clustering results, based on perplexity reduction. This objective evaluation indicates that both prosodic and spectral features contribute to separating speaking styles and emotions. The best results are achieved when iVectors are included in the feature set, which reduces the perplexity of the expressions and audiobook characters by factors of 14 and 2, respectively. We also designed a novel subjective evaluation method in which the participants edit a small excerpt of an audiobook using synthetic voices created from the clusters. The results suggest that our feature set is effective in the task of expressiveness and character detection.
dc.format.extent: 5 p.
dc.language.iso: eng
dc.publisher: International Speech Communication Association (ISCA)
dc.rights.uri: http://creativecommons.org/licenses/by-nc-nd/3.0/es/
dc.subject: Àrees temàtiques de la UPC::Enginyeria de la telecomunicació::Processament del senyal::Processament de la parla i del senyal acústic
dc.subject: Àrees temàtiques de la UPC::Informàtica::Intel·ligència artificial::Llenguatge natural
dc.subject.lcsh: Automatic speech recognition
dc.subject.lcsh: Natural language processing (Computer science)
dc.subject.other: Expressive speech synthesis
dc.subject.other: Automatic voice creation
dc.subject.other: Expressive speech synthesis evaluation
dc.title: Creating expressive synthetic voices by unsupervised clustering of audiobooks
dc.type: Conference lecture
dc.subject.lemac: Reconeixement automàtic de la parla
dc.subject.lemac: Tractament del llenguatge natural (Informàtica)
dc.contributor.group: Universitat Politècnica de Catalunya. VEU - Grup de Tractament de la Parla
dc.description.peerreviewed: Peer Reviewed
dc.relation.publisherversion: http://www.isca-speech.org/archive/interspeech_2015/i15_3380.html
dc.rights.access: Restricted access - publisher's policy
local.identifier.drac: 16678708
dc.description.version: Postprint (published version)
dc.date.lift: 10000-01-01
local.citation.author: Jauk, I.; Bonafonte, A.; López-Otero, P.; Docio-Fernández, L.
local.citation.contributor: Annual Conference of the International Speech Communication Association
local.citation.pubplace: Dresden
local.citation.publicationName: INTERSPEECH 2015: 16th Annual Conference of the International Speech Communication Association: Dresden, Germany: September 6-10, 2015
local.citation.startingPage: 3380
local.citation.endingPage: 3384

