Show simple item record

dc.contributor.author: Armengol Estapé, Jordi
dc.contributor.author: Ruiz Costa-Jussà, Marta
dc.contributor.other: Universitat Politècnica de Catalunya. Departament de Ciències de la Computació
dc.contributor.other: Barcelona Supercomputing Center
dc.date.accessioned: 2021-06-17T08:42:28Z
dc.date.available: 2021-06-17T08:42:28Z
dc.date.issued: 2021-05-18
dc.identifier.citation: Armengol, J.; Costa-jussà, M.R. Semantic and syntactic information for neural machine translation: Injecting features to the transformer. "Machine translation", 18 May 2021, vol. 35, p. 3-17.
dc.identifier.issn: 0922-6567
dc.identifier.uri: http://hdl.handle.net/2117/347441
dc.description.abstract: Introducing factors such as linguistic features has long been proposed in machine translation to improve the quality of translations. More recently, factored machine translation has proven to still be useful in the case of sequence-to-sequence systems. In this work, we investigate whether these gains hold in the case of the state-of-the-art architecture in neural machine translation, the Transformer, instead of recurrent architectures. We propose a new model, the Factored Transformer, to introduce an arbitrary number of word features in the source sequence in an attentional system. Specifically, we suggest two variants depending on the level at which the features are injected. Moreover, we suggest two combination mechanisms for the word features and words themselves. We experiment both with classical linguistic features and semantic features extracted from a linked data database, and with two low-resource datasets. With the best-found configuration, we show improvements of 0.8 BLEU over the baseline Transformer in the IWSLT German-to-English task. Moreover, we experiment with the more challenging FLoRes English-to-Nepali benchmark, which includes both low-resource and very distant languages, and obtain an improvement of 1.2 BLEU. These improvements are achieved with linguistic and not with semantic information.
dc.description.sponsorship: This work is supported by the European Research Council (ERC) under the European Union’s Horizon 2020 research and innovation programme (Grant Agreement No. 947657).
dc.format.extent: 15 p.
dc.language.iso: eng
dc.rights: Attribution 4.0 International
dc.rights.uri: https://creativecommons.org/licenses/by/4.0/
dc.subject: Àrees temàtiques de la UPC::Informàtica::Intel·ligència artificial::Llenguatge natural
dc.subject.lcsh: Machine translating
dc.subject.lcsh: Computational linguistics
dc.subject.other: Transformer
dc.subject.other: Factored neural machine translation
dc.subject.other: Linguistic features
dc.subject.other: Semantic features
dc.title: Semantic and syntactic information for neural machine translation: Injecting features to the transformer
dc.type: Article
dc.subject.lemac: Traducció automàtica
dc.subject.lemac: Lingüística computacional
dc.contributor.group: Universitat Politècnica de Catalunya. VEU - Grup de Tractament de la Parla
dc.identifier.doi: 10.1007/s10590-021-09264-2
dc.description.peerreviewed: Peer Reviewed
dc.relation.publisherversion: https://link.springer.com/article/10.1007/s10590-021-09264-2
dc.rights.access: Open Access
local.identifier.drac: 31831082
dc.description.version: Postprint (published version)
dc.relation.projectid: info:eu-repo/grantAgreement/EC/H2020/947657/EU/Lifelong UNiversal lAnguage Representation/LUNAR
local.citation.author: Armengol, J.; Costa-jussà, Marta R.
local.citation.publicationName: Machine translation
local.citation.volume: 35
local.citation.startingPage: 3
local.citation.endingPage: 17
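
The abstract above describes the Factored Transformer, in which embeddings of source words are combined with embeddings of word-level features (e.g. part-of-speech tags or semantic classes) before being fed to the Transformer encoder. The sketch below illustrates that factored-embedding idea in PyTorch; the class name FactoredEmbedding, the argument names, and the two combination modes ("sum" and "concat" followed by a linear projection) are illustrative assumptions, not the authors' actual implementation or hyperparameters.

```python
# Minimal sketch of factored source embeddings, assuming one feature per token.
# Names and dimensions are illustrative, not taken from the paper's code.
import torch
import torch.nn as nn

class FactoredEmbedding(nn.Module):
    def __init__(self, vocab_size, feat_vocab_size, d_model, feat_dim=64, mode="concat"):
        super().__init__()
        self.mode = mode
        self.word_emb = nn.Embedding(vocab_size, d_model)
        if mode == "sum":
            # Feature embeddings share the model dimension and are added to the word embeddings.
            self.feat_emb = nn.Embedding(feat_vocab_size, d_model)
        elif mode == "concat":
            # Features get a smaller embedding; the concatenation is projected back
            # to d_model so the encoder input size is unchanged.
            self.feat_emb = nn.Embedding(feat_vocab_size, feat_dim)
            self.proj = nn.Linear(d_model + feat_dim, d_model)
        else:
            raise ValueError(f"unknown combination mode: {mode}")

    def forward(self, words, feats):
        # words, feats: LongTensors of shape (batch, src_len), aligned token by token.
        w = self.word_emb(words)
        f = self.feat_emb(feats)
        if self.mode == "sum":
            return w + f
        return self.proj(torch.cat([w, f], dim=-1))

# Example: a batch of 2 sentences of length 5, with one feature id per source token.
emb = FactoredEmbedding(vocab_size=32000, feat_vocab_size=50, d_model=512)
words = torch.randint(0, 32000, (2, 5))
feats = torch.randint(0, 50, (2, 5))
print(emb(words, feats).shape)  # torch.Size([2, 5, 512])
```

Because both modes keep the output at d_model, the rest of a standard Transformer encoder can be used unchanged on top of this embedding layer.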


Files in this item


This item appears in the following collection(s)
