VEU - Speech Processing Group (Grup de Tractament de la Parla)
The research area of the 'VEU' group is speech processing. We investigate technologies that extract the information contained in the voice: recognition of what is said, the language or dialect, speaker characteristics (identity, age, sex, emotional state), and the direction of the sound. We also work on the general characterization of audio, to determine when speech is present and when there are other acoustic events such as music or various noises. Speech technologies make it possible to generate speech (speech synthesis) or to modify its ...
Collections in this community
- Journal articles [172]
- Books [5]
- Presentations [2]
- Research reports [42]
Recent Submissions
- Multiformer: a head-configurable transformer-based model for direct speech translation
(Association for Computational Linguistics, 2022)
Conference lecture
Open Access
Transformer-based models have been achieving state-of-the-art results in several fields of Natural Language Processing. However, their direct application to speech tasks is not trivial. The nature of these sequences carries ...
- Evaluating gender bias in speech translation
(European Language Resources Association, 2022)
Conference lecture
Open Access
The scientific community is increasingly aware of the necessity to embrace pluralism and consistently represent major and minor social groups. Currently, there are no standard evaluation techniques for different types of ...
- SHAS: approaching optimal segmentation for end-to-end speech translation
(2022-02)
Research report
Open Access
Speech translation models are unable to directly process long audios, like TED talks, which have to be split into shorter segments. Speech translation datasets provide manual segmentations of the audios, which are not ...
- Language modelling for speaker diarization in telephonic interviews
(Elsevier, 2023-03)
Article
Open Access
The aim of this paper is to investigate the benefit of combining both language and acoustic modelling for speaker diarization. Although conventional systems only use acoustic features, in some scenarios linguistic data ...
- Attention weights in transformer NMT fail aligning words between sequences but largely explain model predictions
(Association for Computational Linguistics, 2021)
Conference lecture
Open Access
This work proposes an extensive analysis of the Transformer architecture in the Neural Machine Translation (NMT) setting. Focusing on the encoder-decoder attention mechanism, we prove that attention weights systematically ...
- A genetic programming approach for economic forecasting with survey expectations
(Multidisciplinary Digital Publishing Institute, 2022-06-30)
Article
Open Access
We apply a soft computing method to generate country-specific economic sentiment indicators that provide estimates of year-on-year GDP growth rates for 19 European economies. First, genetic programming is used to evolve ...
- Systematic detection of anomalous ionospheric perturbations above LEOs from GNSS POD Data including possible tsunami signatures
(Institute of Electrical and Electronics Engineers (IEEE), 2022-06-13)
Article
Open Access
In this article, we show the capability of global navigation satellite system (GNSS) precise orbit determination (POD) low Earth orbit (LEO) data to detect anomalous ionospheric disturbances in the spectral range of the ...
- On the locality of attention in direct speech translation
(Association for Computational Linguistics, 2022)
Conference lecture
Open Access
Transformers have achieved state-of-the-art results across multiple NLP tasks. However, the self-attention mechanism complexity scales quadratically with the sequence length, creating an obstacle for tasks involving long ...
- Multilingual machine translation: Deep analysis of language-specific encoder-decoders
(2022-04-25)
Article
Open Access
State-of-the-art multilingual machine translation relies on a shared encoder-decoder. In this paper, we propose an alternative approach based on language-specific encoder-decoders, which can be easily extended to new ...
- High frequent in-domain word segmentation and forward translation for the WMT21 Biomedical task
(Association for Computational Linguistics, 2021)
Conference report
Open Access
This paper reports the optimization of using the out-of-domain data in the Biomedical translation task. We first optimized our parallel training dataset using the BabelNet in-domain terminology words. Afterward, to ...
- Enhancing sequence-to-sequence modeling for RDF triples to natural text
(Association for Computational Linguistics, 2020)
Conference report
Open Access
This work establishes key guidelines on how, which and when Machine Translation (MT) techniques are worth applying to the RDF-to-Text task. Not only do we apply and compare the most prominent MT architecture, the Transformer, but we ...
- The UPC RDF-to-Text System at WebNLG Challenge 2020
(Association for Computational Linguistics, 2020)
Conference report
Open Access
This work describes the end-to-end system architecture presented at WebNLG Challenge 2020. The system follows the traditional Machine Translation (MT) pipeline, based on the Transformer model, applied in most text-to-text ...