TTS evaluation campaign with a common Spanish database

Sainz, Iñaki; Navas, Eva; Hernáez, Inma; Bonafonte Cávez, Antonio; Campillo, Francisco

dc.contributor.author	Sainz, Iñaki
dc.contributor.author	Navas, Eva
dc.contributor.author	Hernáez, Inma
dc.contributor.author	Bonafonte Cávez, Antonio
dc.contributor.author	Campillo, Francisco
dc.contributor.other	Universitat Politècnica de Catalunya. Departament de Teoria del Senyal i Comunicacions
dc.date.accessioned	2010-10-11T07:58:56Z
dc.date.available	2010-10-11T07:58:56Z
dc.date.created	2010
dc.date.issued	2010
dc.identifier.citation	Sainz, I. [et al.]. TTS evaluation campaign with a common Spanish database. In: International Conference on Language Resources and Evaluation. "Seventh Int. Conf. on Language Resources and Evaluation (LREC)". Valletta: 2010, p. 2155-2160.
dc.identifier.isbn	2-9517408-6-7
dc.identifier.uri	http://hdl.handle.net/2117/9608
dc.description.abstract	This paper describes the first TTS evaluation campaign designed for Spanish. Seven research institutions took part in the campaign, each developing a voice from a common speech database provided by the organisation. Each participating team had seven weeks to build a voice. Next, a set of test sentences was released, and each team had to synthesise them within one week. Finally, a subset of the synthesised test audio files was subjectively evaluated via an online test according to the following criteria: similarity to the original voice, naturalness and intelligibility. Box plots, Wilcoxon tests and word error rate (WER) were used to analyse the results. Two main conclusions can be drawn. On the one hand, there is considerable margin for improvement before reaching the quality level of the natural voice. On the other hand, two systems obtained significantly better results than the rest: one is based on statistical parametric synthesis and the other is a concatenative system that uses a sinusoidal model both to modify prosody and to smooth spectral joins. Therefore, it seems that some kind of spectral control is needed when building voices from a medium-sized database for unrestricted domains.
dc.format.extent	6 p.
dc.language.iso	eng
dc.rights	Attribution-NonCommercial-NoDerivs 3.0 Spain
dc.rights.uri	http://creativecommons.org/licenses/by-nc-nd/3.0/es/
dc.subject	UPC subject areas::Telecommunications engineering
dc.subject.lcsh	Text-to-speech software
dc.subject.lcsh	Signal theory (Telecommunication)
dc.title	TTS evaluation campaign with a common Spanish database
dc.type	Conference report
dc.subject.lemac	Senyal, Teoria del (Telecomunicació)
dc.contributor.group	Universitat Politècnica de Catalunya. VEU - Grup de Tractament de la Parla
dc.relation.publisherversion	http://www.lrec-conf.org/proceedings/lrec2010/pdf/456_Paper.pdf
dc.rights.access	Open Access
local.identifier.drac	3265447
dc.description.version	Postprint (published version)
local.citation.author	Sainz, I.; Navas, E.; Hernáez, I.; Bonafonte, A.; Campillo, F.
local.citation.contributor	International Conference on Language Resources and Evaluation
local.citation.pubplace	Valletta
local.citation.publicationName	Seventh Int. Conf. on Language Resources and Evaluation (LREC)
local.citation.startingPage	2155
local.citation.endingPage	2160

Files in this item

Name: TTSevaluation.pdf
Size: 461.5 KB
Format: PDF


This item appears in the following collections

Conference papers and presentations [437]
Conference papers and presentations [3,327]
