Correcting input noise in SMT as a char-based translation problem
Document type: External research report
Rights access: Open Access
Misspelled words directly affect the final quality of Statistical Machine Translation (SMT) systems, because the input becomes noisy and unpredictable. This paper presents strategies for improving the translation of real-life noisy input. The proposed strategies are based on a preprocessing step consisting of a character-based translator.
Citation: Formiga, L.; Fonollosa, José A. R. "Correcting input noise in SMT as a char-based translation problem". 2012.
Is part of: TALP-2012-OCT-31
URL other repository: http://nlp.lsi.upc.edu/publications/papers/misspelling_techrep_oct2012.pdf