The weights of the components in a standard Statistical Machine Translation model are usually estimated via Minimum Error Rate Training (MERT). This procedure finds the weight values that are optimal on a development set, with the expectation that they generalise well to other test sets. However, this is not always the case when domains differ. This work uses a perceptron algorithm to learn more robust weights to be used on out-of-domain corpora without the need for specialised data. For an Arabic-to-English translation system, the generalisation of weights represents an improvement of more than 2 BLEU points with respect to the MERT baseline using the same information.
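The abstract does not say which perceptron variant is used, so the following is only a minimal sketch of one common way to learn feature weights with a perceptron: an averaged perceptron run over n-best lists. It assumes each hypothesis comes with a feature vector and a sentence-level quality score (a stand-in for a metric such as BLEU). The names `Hypothesis`, `model_score` and `perceptron_train`, and the default parameters, are hypothetical and not taken from the paper.

```python
from typing import List, Sequence, Tuple

# One n-best entry: (feature vector, sentence-level quality score; higher is better).
# The quality score is an assumed stand-in for a metric such as BLEU.
Hypothesis = Tuple[Sequence[float], float]


def model_score(weights: Sequence[float], features: Sequence[float]) -> float:
    """Linear model score: dot product of weights and feature values."""
    return sum(w * f for w, f in zip(weights, features))


def perceptron_train(
    nbest_lists: List[List[Hypothesis]],
    n_features: int,
    epochs: int = 10,
    learning_rate: float = 1.0,
) -> List[float]:
    """Averaged perceptron over n-best lists (illustrative sketch only)."""
    weights = [0.0] * n_features
    totals = [0.0] * n_features  # running sum of weights, used for averaging
    steps = 0
    for _ in range(epochs):
        for nbest in nbest_lists:
            # Hypothesis the current weights prefer.
            predicted = max(nbest, key=lambda h: model_score(weights, h[0]))
            # Hypothesis with the best quality score (the "oracle").
            oracle = max(nbest, key=lambda h: h[1])
            if predicted[1] < oracle[1]:
                # Move weights toward the oracle features, away from the prediction.
                for i in range(n_features):
                    weights[i] += learning_rate * (oracle[0][i] - predicted[0][i])
            for i in range(n_features):
                totals[i] += weights[i]
            steps += 1
    # Averaging the weight vectors seen during training reduces sensitivity
    # to the order of the examples.
    return [t / steps for t in totals] if steps else weights
```

On a toy input of two n-best lists with two features each, the returned weights rank the oracle hypothesis first in both lists. Averaging is a design choice made here for stability; it is not claimed to match the authors' setup.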
Citation: España-Bonet, C.; Màrquez, L. Robust estimation of feature weights in statistical machine translation. In: Annual Conference of the European Association for Machine Translation. "14th Annual Conference of the European Association for Machine Translation". Saint-Raphaël: 2010, p. 190-197.