Efficient transformers for direct speech translation
Document type: Bachelor thesis
Rights access: Open Access
In this thesis, we propose a new approach to speech-to-text translation in which an efficient Transformer lets us work directly on the spectrogram, without convolutional layers in front of the Transformer. The encoder thus learns from the raw spectrogram and no information is lost, which we believe could be beneficial. We built an encoder-decoder model whose encoder is an efficient Transformer, the Longformer, and whose decoder is a standard Transformer decoder. We first trained the model on an Automatic Speech Recognition (ASR) task, and then on Speech Translation using the ASR pre-trained encoder. Our results are close to those obtained with convolutional layers and a regular Transformer, with less than a 10% relative drop in performance, making this a solid starting point for a promising research direction.
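The core architectural idea, replacing the convolutional front-end with a linear projection so spectrogram frames reach the encoder without temporal subsampling, can be sketched as follows. This is a minimal illustration, not the thesis implementation: all names and dimensions (`N_MELS`, `D_MODEL`, `VOCAB`) are assumptions, and a standard `nn.Transformer` stands in for the Longformer encoder, which would use sparse/windowed attention to handle the long frame sequences.

```python
import torch
import torch.nn as nn

# Illustrative hyperparameters (assumptions, not the thesis values).
N_MELS = 80    # mel-spectrogram feature bins
D_MODEL = 256  # model dimension
VOCAB = 1000   # target vocabulary size


class SpectrogramTransformer(nn.Module):
    """Encoder-decoder that consumes spectrogram frames directly."""

    def __init__(self):
        super().__init__()
        # A linear projection replaces the usual convolutional
        # front-end, so no frames are merged or discarded before
        # the encoder sees them.
        self.in_proj = nn.Linear(N_MELS, D_MODEL)
        # Stand-in for the Longformer encoder: a vanilla Transformer.
        # An efficient-attention encoder would be swapped in here.
        self.transformer = nn.Transformer(
            d_model=D_MODEL, nhead=4,
            num_encoder_layers=2, num_decoder_layers=2,
            batch_first=True)
        self.tok_emb = nn.Embedding(VOCAB, D_MODEL)
        self.out_proj = nn.Linear(D_MODEL, VOCAB)

    def forward(self, spec, tgt_tokens):
        src = self.in_proj(spec)        # (B, T_frames, D_MODEL)
        tgt = self.tok_emb(tgt_tokens)  # (B, T_tokens, D_MODEL)
        out = self.transformer(src, tgt)
        return self.out_proj(out)       # (B, T_tokens, VOCAB)


model = SpectrogramTransformer()
spec = torch.randn(2, 500, N_MELS)         # 2 spectrograms, 500 frames
tokens = torch.randint(0, VOCAB, (2, 20))  # 2 target token sequences
logits = model(spec, tokens)
print(logits.shape)  # torch.Size([2, 20, 1000])
```

Because each spectrogram frame becomes one encoder position, sequences are far longer than after convolutional subsampling, which is precisely why an efficient attention mechanism such as the Longformer's is needed in practice.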
Degree: Bachelor's Degree in Mathematics (GRAU EN MATEMÀTIQUES, Pla 2009)