Transfer Learning with Shallow Decoders: BSC at WMT2021’s Multilingual Low-Resource Translation for Indo-European Languages Shared Task
Cite as: hdl:2117/366266
Document type: Conference contribution
Publication date: 2021
Publisher: Association for Computational Linguistics
Access conditions: Open access
Unless otherwise indicated, the contents of this work are subject to a Creative Commons license: Attribution 3.0 Spain
Abstract
This paper describes the participation of the BSC team in the WMT2021's Multilingual Low-Resource Translation for Indo-European Languages Shared Task. The system addresses Subtask 2: Wikipedia cultural heritage articles, which involves translation between four Romance languages: Catalan, Italian, Occitan and Romanian. The submitted system is a multilingual semi-supervised machine translation model. It is based on a pre-trained language model, namely XLM-RoBERTa, that is later fine-tuned with parallel data obtained mostly from OPUS. Unlike other works, we only use XLM to initialize the encoder and randomly initialize a shallow decoder. The reported results are robust and perform well for all tested languages.
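The architecture described in the abstract — a deep pre-trained encoder paired with a randomly initialized shallow decoder — can be sketched in PyTorch as follows. This is a minimal illustrative sketch, not the authors' code: the encoder here is a small randomly initialized stand-in, whereas in the paper it is initialized from XLM-RoBERTa, and the class name `ShallowDecoderNMT` and all dimensions are hypothetical.

```python
import torch
import torch.nn as nn

class ShallowDecoderNMT(nn.Module):
    """Encoder-decoder NMT model with a deep encoder and a shallow decoder.

    In the paper, the encoder is initialized from XLM-RoBERTa; here it is a
    small randomly initialized stand-in so the sketch runs self-contained.
    The decoder is randomly initialized and deliberately shallow.
    """

    def __init__(self, vocab_size=1000, d_model=64, num_decoder_layers=1):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, d_model)
        # Deep encoder (stand-in for the pre-trained XLM-RoBERTa encoder).
        enc_layer = nn.TransformerEncoderLayer(d_model, nhead=4, batch_first=True)
        self.encoder = nn.TransformerEncoder(enc_layer, num_layers=6)
        # Shallow, randomly initialized decoder (one layer here).
        dec_layer = nn.TransformerDecoderLayer(d_model, nhead=4, batch_first=True)
        self.decoder = nn.TransformerDecoder(dec_layer, num_layers=num_decoder_layers)
        self.out = nn.Linear(d_model, vocab_size)

    def forward(self, src, tgt):
        memory = self.encoder(self.embed(src))        # encode source tokens
        hidden = self.decoder(self.embed(tgt), memory)  # decode with cross-attention
        return self.out(hidden)                       # per-token vocabulary logits

model = ShallowDecoderNMT()
src = torch.randint(0, 1000, (2, 7))   # batch of 2 source sentences, length 7
tgt = torch.randint(0, 1000, (2, 5))   # batch of 2 target prefixes, length 5
logits = model(src, tgt)
print(logits.shape)  # torch.Size([2, 5, 1000])
```

During fine-tuning, only the decoder starts from scratch; the encoder's pre-trained multilingual representations are what make this viable in a low-resource setting.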
Citation: Kharitonova, K. [et al.]. Transfer Learning with Shallow Decoders: BSC at WMT2021's Multilingual Low-Resource Translation for Indo-European Languages Shared Task. In: Conference on Machine Translation (WMT). "Proceedings of the Sixth Conference on Machine Translation: online, Nov 10-11, 2021". Association for Computational Linguistics, 2021, p. 362-367.
Publisher's version: https://aclanthology.org/2021.wmt-1.43
Files | Size | Format |
---|---|---|
2021.wmt-1.43.pdf | 210.4 KB | PDF |