A Deep source-context feature for lexical selection in statistical machine translation

Gupta, Parth; Ruiz Costa-Jussà, Marta; Rosso, Paolo; Banchs Martínez, Rafael Enrique

doi:10.1016/j.patrec.2016.02.014



Document type: Article

Publication date: 2016-05-01

Publisher: Elsevier

Access conditions: Open access

Attribution-NonCommercial-NoDerivs 3.0 Spain

Except where otherwise noted, the contents of this work are subject to the Creative Commons license: Attribution-NonCommercial-NoDerivs 3.0 Spain

Abstract

This paper presents a methodology for lexical disambiguation in a standard phrase-based statistical machine translation system. Similarity among source contexts is used to select appropriate translation units. This information is introduced as a novel feature of the phrase-based model and is used to select the translation units extracted from the training sentences most similar to the sentence being translated. The similarity is computed over a deep autoencoder representation, which yields an effective low-dimensional embedding of the data and statistically significant BLEU score improvements on two different tasks (English-to-Spanish and English-to-Hindi). (C) 2016 Elsevier B.V. All rights reserved.
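As an illustration of the idea in the abstract, the following is a minimal sketch of a source-context similarity score for one translation unit. It is not the authors' implementation: the function names, the binary bag-of-words input, the use of cosine similarity, the choice of the maximum over training contexts, and the toy encoder with fixed random weights (standing in for the encoder half of a trained deep autoencoder) are all assumptions made for this example.

```python
import math
import random


def bag_of_words(sentence, vocab):
    """Binary bag-of-words vector over a fixed vocabulary (assumed input format)."""
    tokens = set(sentence.lower().split())
    return [1.0 if word in tokens else 0.0 for word in vocab]


def make_toy_encoder(input_dim, code_dim, seed=0):
    """Hypothetical stand-in for a trained autoencoder's encoder.

    Uses one tanh layer with fixed random weights; a real system would
    learn these weights from the training corpus.
    """
    rng = random.Random(seed)
    weights = [[rng.gauss(0.0, 1.0) for _ in range(input_dim)]
               for _ in range(code_dim)]

    def encode(x):
        return [math.tanh(sum(w * xi for w, xi in zip(row, x)))
                for row in weights]

    return encode


def cosine(u, v):
    """Cosine similarity; returns 0.0 if either vector has zero norm."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def source_context_feature(input_sentence, training_contexts, encode, vocab):
    """Highest similarity between the input sentence and the training source
    sentences from which a translation unit was extracted (assumed scoring)."""
    query = encode(bag_of_words(input_sentence, vocab))
    return max(
        (cosine(query, encode(bag_of_words(s, vocab))) for s in training_contexts),
        default=0.0,
    )


# Toy usage: two candidate translation units for the ambiguous word "bank",
# each with the training source sentences it was extracted from.
contexts = {
    "orilla": ["the river bank was muddy"],
    "banco": ["the bank approved the loan"],
}
vocab = sorted({w for sents in contexts.values() for s in sents for w in s.split()})
encode = make_toy_encoder(len(vocab), code_dim=4)
for target, sentences in contexts.items():
    score = source_context_feature(
        "he sat on the bank of the river", sentences, encode, vocab)
    print(target, round(score, 3))
```

In a phrase-based system, a score of this kind would be attached to each candidate translation unit as one additional feature alongside the usual model scores. With random weights the ranking is not meaningful; the sketch only shows where a learned low-dimensional representation would plug in.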

Citation: Gupta, P., Ruiz, M., Rosso, P., Banchs, R. A Deep source-context feature for lexical selection in statistical machine translation. "Pattern recognition letters", 1 May 2016, vol. 75, p. 24-29.

URI: http://hdl.handle.net/2117/102922

DOI: 10.1016/j.patrec.2016.02.014

ISSN: 0167-8655

Publisher's version: http://www.sciencedirect.com/science/article/pii/S0167865516000738


Files	Description	Size	Format	View
mt-deep-main-letters-paper.pdf		358.8 KB	PDF	View/Open

UPCommons. UPC's open knowledge portal
