Syntax-based reordering for statistical machine translation
dc.contributor.author | Khalilov, Maxim |
dc.contributor.author | Rodríguez Fonollosa, José Adrián |
dc.contributor.other | Universitat Politècnica de Catalunya. Departament de Teoria del Senyal i Comunicacions |
dc.date.accessioned | 2011-07-13T16:57:01Z |
dc.date.available | 2011-07-13T16:57:01Z |
dc.date.created | 2011-10 |
dc.date.issued | 2011-10 |
dc.identifier.citation | Khalilov, M.; Fonollosa, José A. R. Syntax-based reordering for statistical machine translation. "Computer speech and language", October 2011, vol. 25, no. 4, p. 761-788. |
dc.identifier.issn | 0885-2308 |
dc.identifier.uri | http://hdl.handle.net/2117/12964 |
dc.description.abstract | In this paper, we develop an approach called syntax-based reordering (SBR) for handling the fundamental problem of word ordering in statistical machine translation (SMT). We propose to alleviate the word order challenge by including morpho-syntactical and statistical information in a pre-translation reordering framework aimed at capturing short- and long-distance word distortion dependencies. We examine the proposed approach from theoretical and experimental points of view, discussing and analyzing its advantages and limitations in comparison with some state-of-the-art reordering methods. In the final part of the paper, we describe the results of applying the syntax-based model to translation tasks with a great need for reordering (Chinese-to-English and Arabic-to-English). The experiments are carried out on standard phrase-based and alternative N-gram-based SMT systems. We first investigate sparse training data scenarios, in which the translation and reordering models are trained on sparse bilingual data, and then scale the method to a large training set, demonstrating that the improvement in translation quality is maintained. |
dc.format.extent | 28 p. |
dc.language.iso | eng |
dc.publisher | Elsevier |
dc.rights | Attribution-NonCommercial-NoDerivs 3.0 Spain |
dc.rights.uri | http://creativecommons.org/licenses/by-nc-nd/3.0/es/ |
dc.subject | UPC subject areas::Computer science::Artificial intelligence::Natural language |
dc.subject.lcsh | Natural language processing |
dc.subject.lcsh | Computational linguistics |
dc.title | Syntax-based reordering for statistical machine translation |
dc.type | Article |
dc.subject.lemac | Computational linguistics |
dc.subject.lemac | Natural language processing (Computer science) |
dc.contributor.group | Universitat Politècnica de Catalunya. VEU - Grup de Tractament de la Parla |
dc.identifier.doi | 10.1016/j.csl.2011.01.001 |
dc.description.peerreviewed | Peer Reviewed |
dc.relation.publisherversion | http://www.sciencedirect.com/science/article/B6WCW-525YP4S-1/2/5ac1785c72af82c125d6d716f37f4fbf |
dc.rights.access | Restricted access - publisher's policy |
local.identifier.drac | 5799381 |
dc.description.version | Postprint (published version) |
local.citation.author | Khalilov, M.; Fonollosa, José A. R. |
local.citation.publicationName | Computer speech and language |
local.citation.volume | 25 |
local.citation.number | 4 |
local.citation.startingPage | 761 |
local.citation.endingPage | 788 |
This item appears in the following collections
- Journal articles [172]
- Journal articles [2,526]