dc.contributor.author: Henriquez, Carlos A
dc.contributor.author: Ruiz Costa-Jussà, Marta
dc.contributor.author: Daudaravicius, Vidas
dc.contributor.author: Banchs, Rafael E.
dc.contributor.author: Mariño, José B.
dc.contributor.other: Universitat Politècnica de Catalunya. Departament de Teoria del Senyal i Comunicacions
dc.date.accessioned: 2017-03-14T19:12:02Z
dc.date.available: 2017-03-14T19:12:02Z
dc.date.issued: 2010
dc.identifier.citation: Henriquez, C., Ruiz, M., Daudaravicius, V., Banchs, R., Mariño, J. UPC-BMIC-VDU system description for the IWSLT 2010: testing several collocation segmentations in a phrase-based SMT system. In: International Workshop on Spoken Language Translation. "Proceedings of IWSLT 2010, Paris, France". 2010, p. 189-195.
dc.identifier.uri: http://hdl.handle.net/2117/102470
dc.description.abstract: This paper describes the UPC-BMIC-VMU participation in the IWSLT 2010 evaluation campaign. The SMT system is a standard phrase-based system enriched with novel segmentations. These segmentations are computed using statistical measures such as Log-likelihood, T-score, Chi-squared, Dice, Mutual Information or Gravity-Counts. The analysis of translation results allows the measures to be divided into three groups. First, Log-likelihood, Chi-squared and T-score tend to combine high-frequency words, and the resulting collocation segments are very short. They improve the SMT system by adding new translation units. Second, Mutual Information and Dice tend to combine low-frequency words, and the resulting collocation segments are short. They improve the SMT system by smoothing the translation units. Third, Gravity-Counts tends to combine high- and low-frequency words, and the resulting collocation segments are long. In this case, however, the SMT system is not improved. Thus, the road map for improving the translation system is to introduce new phrases made up of either low-frequency or high-frequency words. It is hard to improve translation quality by introducing new phrases that mix low- and high-frequency words. Experimental results are reported for the French-to-English IWSLT 2010 evaluation, where our system was ranked 3rd out of nine systems.
dc.format.extent: 7 p.
dc.language.iso: eng
dc.rights.uri: http://creativecommons.org/licenses/by-nc-nd/3.0/es/
dc.subject: Àrees temàtiques de la UPC::Informàtica
dc.subject.lcsh: Machine translation
dc.subject.other: Machine translation
dc.title: UPC-BMIC-VDU system description for the IWSLT 2010: testing several collocation segmentations in a phrase-based SMT system
dc.type: Conference report
dc.subject.lemac: Traducció automàtica
dc.contributor.group: Universitat Politècnica de Catalunya. VEU - Grup de Tractament de la Parla
dc.relation.publisherversion: http://www.isca-speech.org/archive/iwslt_10/slta_189.html
dc.rights.access: Open Access
drac.iddocument: 19723176
dc.description.version: Postprint (published version)
upcommons.citation.author: Henriquez, C., Ruiz, M., Daudaravicius, V., Banchs, R., Mariño, J.
upcommons.citation.contributor: International Workshop on Spoken Language Translation
upcommons.citation.published: true
upcommons.citation.publicationName: Proceedings of IWSLT 2010, Paris, France
upcommons.citation.startingPage: 189
upcommons.citation.endingPage: 195



Except where otherwise noted, the contents of this work are licensed under a Creative Commons license: Attribution-NonCommercial-NoDerivs 3.0 Spain