The research area of the 'VEU' group is speech processing. We investigate technologies that extract the information contained in the voice: recognition of what is said, the language or dialect, speaker characteristics (who the speaker is, their age, sex, and emotional state), and the direction of the sound. We also work on general audio characterization, to determine when speech is present and when there are other acoustic events such as music or various noises. Speech technologies make it possible to generate speech (speech synthesis) or to modify its ...

Recent Submissions

  • Language modelling for speaker diarization in telephonic interviews 

    India Massana, Miquel Àngel; Hernando Pericás, Francisco Javier; Rodríguez Fonollosa, José Adrián (Elsevier, 2023-03)
    Article
    Open Access
    The aim of this paper is to investigate the benefit of combining both language and acoustic modelling for speaker diarization. Although conventional systems only use acoustic features, in some scenarios linguistic data ...
  • Attention weights in transformer NMT fail aligning words between sequences but largely explain model predictions 

    Ferrando Monsonís, Javier; Ruiz Costa-Jussà, Marta (Association for Computational Linguistics, 2021)
    Conference lecture
    Open Access
    This work proposes an extensive analysis of the Transformer architecture in the Neural Machine Translation (NMT) setting. Focusing on the encoder-decoder attention mechanism, we prove that attention weights systematically ...
  • A genetic programming approach for economic forecasting with survey expectations 

    Claveria González, Oscar; Monte Moreno, Enrique; Torra Porras, Salvador (Multidisciplinary Digital Publishing Institute, 2022-06-30)
    Article
    Open Access
    We apply a soft computing method to generate country-specific economic sentiment indicators that provide estimates of year-on-year GDP growth rates for 19 European economies. First, genetic programming is used to evolve ...
  • Systematic detection of anomalous ionospheric perturbations above LEOs from GNSS POD Data including possible tsunami signatures 

    Yang, Heng; Hernández Pajares, Manuel; Jarmolowski, Wojciech; Wielgosz, Pawel; Vadas, Sharon L.; Colombo, Oscar L.; Monte Moreno, Enrique; García Rigo, Alberto; Graffigna, Victoria; Krypiak-Gregorczyk, Anna; Milanowska, Beata; Bofill Soliguer, Pablo; Olivares Pulido, Germán; Liu, Qi; Haagmans, Roger (Institute of Electrical and Electronics Engineers (IEEE), 2022-06-13)
    Article
    Open Access
    In this article, we show the capability of a global navigation satellite system (GNSS) precise orbit determination (POD) low Earth orbit (LEO) data to detect anomalous ionospheric disturbances in the spectral range of the ...
  • On the locality of attention in direct speech translation 

    Alastruey Lasheras, Belén; Ferrando Monsonís, Javier; Gallego Olsina, Gerard Ion; Ruiz Costa-Jussà, Marta (Association for Computational Linguistics, 2022)
    Conference lecture
    Open Access
    Transformers have achieved state-of-the-art results across multiple NLP tasks. However, the self-attention mechanism complexity scales quadratically with the sequence length, creating an obstacle for tasks involving long ...
  • Multilingual machine translation: Deep analysis of language-specific encoder-decoders 

    Escolano Peinado, Carlos; Ruiz Costa-Jussà, Marta; Rodríguez Fonollosa, José Adrián (2022-04-25)
    Article
    Open Access
    State-of-the-art multilingual machine translation relies on a shared encoder-decoder. In this paper, we propose an alternative approach based on language-specific encoder-decoders, which can be easily extended to new ...
  • High frequent in-domain word segmentation and forward translation for the WMT21 Biomedical task 

    Rafieian, Bardia; Ruiz Costa-Jussà, Marta (Association for Computational Linguistics, 2021)
    Conference report
    Open Access
    This paper reports the optimization of using the out-of-domain data in the Biomedical translation task. We firstly optimized our parallel training dataset using the BabelNet in-domain terminology words. Afterward, to ...
  • Enhancing sequence-to-sequence modeling for RDF triples to natural text 

    Domingo Roig, Oriol; Bergés Lladó, David; Cantenys Sabà, Roser; Creus Castanyer, Roger; Rodríguez Fonollosa, José Adrián (Association for Computational Linguistics, 2020)
    Conference report
    Open Access
    This work establishes key guidelines on how, which and when Machine Translation (MT) techniques are worth applying to RDF-to-Text task. Not only do we apply and compare the most prominent MT architecture, the Transformer, but we ...
  • The UPC RDF-to-Text System at WebNLG Challenge 2020 

    Bergés Lladó, David; Cantenys Sabà, Roser; Creus Castanyer, Roger; Domingo Roig, Oriol; Rodríguez Fonollosa, José Adrián (Association for Computational Linguistics, 2020)
    Conference report
    Open Access
    This work describes the end-to-end system architecture presented at WebNLG Challenge 2020. The system follows the traditional Machine Translation (MT) pipeline, based on the Transformer model, applied in most text-to-text ...
  • The TALP-UPC participation in WMT21 news translation task: an mBART-based NMT approach 

    Escolano Peinado, Carlos; Tsiamas, Ioannis; Basta, Christine Raouf Saad; Ferrando Monsonís, Javier; Ruiz Costa-Jussà, Marta; Rodríguez Fonollosa, José Adrián (Association for Computational Linguistics, 2021)
    Conference report
    Open Access
    This paper describes the submission to the WMT 2021 news translation shared task by the UPC Machine Translation group. The goal of the task is to translate German to French (De-Fr) and French to German (Fr-De). Our submission ...
  • Method for forecasting ionospheric electron content fluctuations based on the optical flow algorithm 

    Monte Moreno, Enrique; Hernández Pajares, Manuel; Yang, Heng; García Rigo, Alberto; Jin, Yaqi; Høeg, Per; Miloch, Wojciech J.; Wielgosz, Pawel; Jarmolowski, Wojciech; Paziewski, Jacek; Milanowska, Beata; Hoque, Mainul; Orús Pérez, Raul (2022-01-01)
    Article
    Open Access
    We present the optical flow algorithm for forecasting the rate of total electron content index (OFROTI). It consists of a method for predicting maps of rapid fluctuations of ionospheric electron content in terms of global ...
  • Forecast of the global TEC by nearest neighbour technique 

    Monte Moreno, Enrique; Yang, Heng; Hernández Pajares, Manuel (Multidisciplinary Digital Publishing Institute (MDPI), 2022-03-11)
    Article
    Open Access
    We propose a method for Global Ionospheric Maps of Total Electron Content forecasting using the Nearest Neighbour method. The assumption is that in a database of global ionosphere maps spanning more than two solar cycles, ...
