The TALP-UPC phrase-based translation systems for WMT12: morphology simplification and domain adaptation

Document type: Conference lecture
Defense date: 2012
Rights access: Open Access
Abstract
This paper describes the UPC participation in the WMT 12 evaluation campaign. All systems presented are based on standard phrase-based Moses systems. The variants adopt several improvement techniques, such as morphology simplification and generation, and domain adaptation. Morphology simplification addresses the data sparsity problem that arises when translating into morphologically rich languages such as Spanish: the system first translates into a morphology-simplified language and then leaves morphology generation to an independent classification task. The domain adaptation approach improves the SMT system by adding new translation units learned from the alignment between MT output and references. Results show improvements in TER, METEOR, NIST and BLEU scores over our baseline system; on the official test set, the domain adaptation approach yields greater gains than the morphological generalization method.
Citation: Formiga, L. [et al.]. The TALP-UPC phrase-based translation systems for WMT12: morphology simplification and domain adaptation. A: Workshop on Statistical Machine Translation. "Proceedings of the Seventh Workshop on Statistical Machine Translation : Montréal, Canada, June 7-8, 2012". Montreal, Quebec: 2012, p. 275-282.
Publisher version: http://aclweb.org/anthology-new/W/W12/W12-3133.pdf
Files | Size |
---|---|
W12-3133.pdf | 526,9Kb |
Except where otherwise noted, the content of this work is licensed under a Creative Commons license: Attribution-NonCommercial-NoDerivs 3.0 Spain