The TALP-UPC phrase-based translation systems for WMT12: morphology simplification and domain adaptation
Document type: Conference lecture
Rights access: Open Access
This paper describes the UPC participation in the WMT12 evaluation campaign. All systems presented are based on standard phrase-based Moses systems. The variants adopt several improvement techniques, such as morphology simplification and generation and domain adaptation. Morphology simplification addresses the data sparsity problem that arises when translating into morphologically rich languages such as Spanish: the system first translates into a morphology-simplified language and then leaves morphology generation to an independent classification task. The domain adaptation approach improves the SMT system by adding new translation units learned from the alignment between MT output and reference translations. Results show improvements in TER, METEOR, NIST and BLEU scores over our baseline system, with the domain adaptation approach yielding larger gains on the official test set than the morphological generalization method.
Citation: Formiga, L. [et al.]. The TALP-UPC phrase-based translation systems for WMT12: morphology simplification and domain adaptation. A: Workshop on Statistical Machine Translation. "Proceedings of the Seventh Workshop on Statistical Machine Translation: Montréal, Canada, June 7-8, 2012". Montreal, Quebec: 2012, p. 275-282.