Linguistic-family-specific encoders and decoders for multilingual machine translation
Document type: Final project, official master's degree
Date: 2022-02-01
Access conditions: Open access
All rights reserved. This work is protected by the corresponding intellectual and industrial property rights. Without prejudice to existing legal exemptions, its reproduction, distribution, public communication, or transformation without the authorization of the rights holder is prohibited.
Abstract
Multilingual machine translation has been approached from different perspectives, including shared and language-specific encoder-decoder architectures. The shared approach uses a single encoder and decoder for all languages, whereas the language-specific approach allocates a separate encoder and decoder to each language. Each approach has benefits and drawbacks in terms of translation quality and resource consumption. To find a balance between these two factors, this project explores a new approach: sharing encoders and decoders within language families. The new model was trained and tested on the TED2020 dataset with 21 languages grouped into 4 language families. Compared with the all-language shared baseline, our model achieves a substantial improvement in BLEU score, ranging from 3 points up to a maximum of 10 points depending on the family pair. The new model also performs well on zero-shot translation, outperforming the baseline model, and the improvement follows the growth pattern observed during model training.
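The core architectural idea above — one encoder and one decoder per language family rather than per language or fully shared — can be sketched as a routing scheme. The following is a minimal, hypothetical illustration: the language codes, family groupings, and class names are assumptions for the example, not the thesis's actual 21-language/4-family configuration, and the placeholder strings stand in for what would be Transformer encoder/decoder modules.

```python
# Hypothetical sketch of family-based encoder/decoder routing.
# Languages and families below are illustrative assumptions only.
LANGUAGE_FAMILY = {
    "es": "romance", "fr": "romance", "it": "romance",
    "de": "germanic", "nl": "germanic", "sv": "germanic",
    "ru": "slavic", "pl": "slavic", "cs": "slavic",
    "tr": "turkic", "az": "turkic", "kk": "turkic",
}

class FamilyMT:
    """One encoder and one decoder per language family (not per language)."""

    def __init__(self, families):
        # Placeholder "modules": in a real system these would be
        # Transformer encoders/decoders with parameters shared per family.
        self.encoders = {f: f"encoder[{f}]" for f in families}
        self.decoders = {f: f"decoder[{f}]" for f in families}

    def route(self, src_lang, tgt_lang):
        """Select the encoder/decoder pair for a translation direction."""
        src_fam = LANGUAGE_FAMILY[src_lang]
        tgt_fam = LANGUAGE_FAMILY[tgt_lang]
        return self.encoders[src_fam], self.decoders[tgt_fam]

model = FamilyMT(set(LANGUAGE_FAMILY.values()))
enc, dec = model.route("es", "de")  # Spanish -> German
```

Because any source-family encoder can be composed with any target-family decoder, directions never seen together during training still have a valid encoder/decoder pair, which is one way to view the zero-shot behavior reported in the abstract.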
Degree: MÀSTER UNIVERSITARI EN ENGINYERIA DE TELECOMUNICACIÓ (Pla 2013)
Collections
Files | Description | Size | Format | View
---|---|---|---|---
Final-Report.pdf | | 687,6Kb | | View/Open
ANNEXES-CODES.zip | | 2,989Mb | application/zip | View/Open