Combining subword representations into word-level representations in the transformer architecture

Casas Manzanares, Noé; Ruiz Costa-Jussà, Marta; Rodríguez Fonollosa, José Adrián

Document type: Conference paper

Publication date: 2020

Publisher: Association for Computational Linguistics

Access conditions: Open access

All rights reserved. This work is protected by the corresponding intellectual and industrial property rights. Without prejudice to any existing legal exemptions, its reproduction, distribution, public communication or transformation without the authorization of the rights holder is prohibited.

Abstract

In Neural Machine Translation, using word-level tokens leads to degradation in translation quality. The dominant approaches use subword-level tokens, but this increases the length of the sequences and makes it difficult to profit from word-level information such as POS tags or semantic dependencies. We propose a modification to the Transformer model to combine subword-level representations into word-level ones in the first layers of the encoder, reducing the effective length of the sequences in the following layers and providing a natural point to incorporate extra word-level information. Our experiments show that this approach maintains the translation quality with respect to the normal Transformer model when no extra word-level information is injected and that it is superior to the currently dominant method for incorporating word-level source language information into models based on subword-level vocabularies.
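The idea of collapsing subword representations into one vector per word can be illustrated with a minimal sketch. This record does not state which combination function the paper uses, so the example below assumes simple mean pooling as a stand-in; the function name `pool_subwords_to_words` and the `word_ids` alignment format are hypothetical and chosen only for illustration.

```python
from typing import List, Sequence


def pool_subwords_to_words(
    subword_vectors: Sequence[Sequence[float]],
    word_ids: Sequence[int],
) -> List[List[float]]:
    """Average the vectors of all subwords that belong to the same word.

    ASSUMPTION: mean pooling is an illustrative choice, not necessarily
    the combination method used in the paper.
    word_ids[i] is the index of the word that subword i belongs to;
    indices must start at 0 and be contiguous and non-decreasing.
    """
    if len(subword_vectors) != len(word_ids):
        raise ValueError("need exactly one word id per subword vector")

    sums: List[List[float]] = []   # running sum per word
    counts: List[int] = []         # number of subwords per word
    for vec, wid in zip(subword_vectors, word_ids):
        if wid == len(sums):
            # First subword of a new word: open a new accumulator.
            sums.append([0.0] * len(vec))
            counts.append(0)
        elif wid < 0 or wid != len(sums) - 1:
            raise ValueError("word_ids must be contiguous and non-decreasing")
        acc = sums[wid]
        for k, x in enumerate(vec):
            acc[k] += x
        counts[wid] += 1

    # Divide each sum by its subword count to obtain the word-level vector.
    return [[x / n for x in acc] for acc, n in zip(sums, counts)]


# Four subword vectors; the first three belong to word 0, the last to word 1.
subwords = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0], [7.0, 8.0]]
words = pool_subwords_to_words(subwords, [0, 0, 0, 1])
print(len(subwords), "->", len(words))  # the sequence shrinks from 4 to 2
```

In this sketch the output has one vector per word, so later layers would operate on a shorter sequence, and word-level features such as POS tags could be attached to each pooled vector at that point.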

Citation: Casas, N.; Costa-jussà, M.R.; Fonollosa, J.A.R. Combining subword representations into word-level representations in the transformer architecture. In: Annual Meeting of the Association for Computational Linguistics. "Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics: Student Research Workshop". Stroudsburg, PA: Association for Computational Linguistics, 2020, p. 66-71. ISBN 978-1-952148-03-3.

URI: http://hdl.handle.net/2117/330587

ISBN: 978-1-952148-03-3

Publisher's version: https://www.aclweb.org/anthology/2020.acl-srw.10/

Files	Description	Size	Format	View
2020.acl-srw.10(2).pdf		1013 KB	PDF	View/Open
