
dc.contributor.author: España Bonet, Cristina
dc.contributor.author: Giménez, Jesús
dc.contributor.author: Màrquez Villodre, Lluís
dc.contributor.other: Universitat Politècnica de Catalunya. Departament de Llenguatges i Sistemes Informàtics
dc.date.accessioned: 2016-05-11T12:34:17Z
dc.date.available: 2016-05-11T12:34:17Z
dc.date.issued: 2009-01
dc.identifier.citation: España-Bonet, C., Giménez, J., Márquez, L. "Discriminative learning within Arabic statistical machine translation". 2009.
dc.identifier.uri: http://hdl.handle.net/2117/86942
dc.description.abstract: Written Arabic is especially ambiguous due to the lack of diacritisation of texts, and this makes translation harder for automatic systems that do not take the context of phrases into account. Here, we use a standard Phrase-Based Statistical Machine Translation architecture to build an Arabic-to-English translation system, but we extend it by incorporating a local discriminative phrase selection model which addresses this semantic ambiguity. Local classifiers are trained using both linguistic information and context to translate a phrase, and this significantly increases the accuracy in phrase selection with respect to the most frequent translation traditionally considered. These classifiers are integrated into the translation system so that the global task benefits from the discriminative learning. As a result, we obtain improvements in the full translation of Arabic documents at the lexical, syntactic and semantic levels, as measured by a heterogeneous set of automatic metrics.
dc.format.extent: 15 p.
dc.language.iso: eng
dc.relation.ispartofseries: LSI-09-3-R
dc.subject: Àrees temàtiques de la UPC::Informàtica::Intel·ligència artificial
dc.subject.other: Statistical machine translation
dc.subject.other: Discriminative learning
dc.subject.other: Arabic
dc.subject.other: English
dc.title: Discriminative learning within Arabic statistical machine translation
dc.type: External research report
dc.contributor.group: Universitat Politècnica de Catalunya. GPLN - Grup de Processament del Llenguatge Natural
dc.rights.access: Open Access
local.identifier.drac: 604039
dc.description.version: Postprint (published version)
local.citation.author: España-Bonet, C.; Giménez, J.; Márquez, L.

