Recent Submissions

  • Evaluating the underlying gender bias in contextualized word embeddings 

    Basta, Christine Raouf Saad; Ruiz Costa-Jussà, Marta; Casas Manzanares, Noé (Association for Computational Linguistics, 2019)
    Conference report
    Open Access
    Gender bias is highly impacting natural language processing applications. Word embeddings have clearly been proven both to keep and amplify gender biases that are present in current data sources. Recently, contextualized ...
  • I-vector transformation using k-nearest neighbors for speaker verification 

    Khan, Umair; India Massana, Miquel Àngel; Hernando Pericás, Francisco Javier (Institute of Electrical and Electronics Engineers (IEEE), 2020)
    Conference report
    Restricted access - publisher's policy
    Probabilistic Linear Discriminant Analysis (PLDA) is the most efficient backend for i-vectors. However, it requires labeled background data which can be difficult to access in practice. Unlike PLDA, cosine scoring avoids ...
  • From bilingual to multilingual neural machine translation by incremental training 

    Escolano Peinado, Carlos; Ruiz Costa-Jussà, Marta; Rodríguez Fonollosa, José Adrián (Association for Computational Linguistics, 2019)
    Conference lecture
    Open Access
    Multilingual Neural Machine Translation approaches are based on the use of task specific models and the addition of one more language can only be done by retraining the whole system. In this work, we propose a new training ...
  • Multilingual, multi-scale and multi-layer visualization of sequence-based intermediate representations 

    Escolano Peinado, Carlos; Ruiz Costa-Jussà, Marta; Lacroux, Elora; Vázquez Alcocer, Pere Pau (Association for Computational Linguistics, 2019)
    Conference report
    Restricted access - publisher's policy
    The main alternatives nowadays to deal with sequences are Recurrent Neural Networks (RNN), Convolutional Neural Networks (CNN) architectures and the Transformer. In this context, RNN’s, CNN’s and Transformer have most commonly ...
  • Self multi-head attention for speaker recognition 

    India Massana, Miquel Àngel; Safari, Pooyan; Hernando Pericás, Francisco Javier (International Speech Communication Association (ISCA), 2019)
    Conference lecture
    Open Access
    Most state-of-the-art Deep Learning (DL) approaches for speaker recognition work on a short utterance level. Given the speech signal, these algorithms extract a sequence of speaker embeddings from short segments and those are ...
  • Auto-encoding nearest neighbor i-vectors for speaker verification 

    Khan, Umair; India Massana, Miquel Àngel; Hernando Pericás, Francisco Javier (International Speech Communication Association (ISCA), 2019)
    Conference lecture
    Open Access
    In the last years, i-vectors followed by cosine or PLDA scoring techniques were the state-of-the-art approach in speaker verification. PLDA requires labeled background data, and there exists a significant performance gap ...
  • DNN speaker embeddings using autoencoder pre-training 

    Khan, Umair; Hernando Pericás, Francisco Javier (Institute of Electrical and Electronics Engineers (IEEE), 2019)
    Conference lecture
    Restricted access - publisher's policy
    Over the last years, i-vectors have been the state-of-the-art approach in speaker recognition. Recent improvements in deep learning have increased the discriminative quality of i-vectors. However, deep learning architectures ...
  • The TALP-UPC machine translation systems for WMT19 news translation task: pivoting techniques for low resource MT 

    Casas Manzanares, Noé; Rodríguez Fonollosa, José Adrián; Escolano Peinado, Carlos; Basta, Christine Raouf Saad; Ruiz Costa-Jussà, Marta (Association for Computational Linguistics, 2019)
    Conference report
    Restricted access - publisher's policy
    In this article, we describe the TALP-UPC research group participation in the WMT19 news translation shared task for Kazakh-English. Given the low amount of parallel training data, we resort to using Russian as pivot ...
  • Terminology-aware segmentation and domain feature for the WMT19 biomedical translation task 

    Carrino, Casimiro Pio; Rafieian, Bardia; Ruiz Costa-Jussà, Marta; Rodríguez Fonollosa, José Adrián (Association for Computational Linguistics, 2019)
    Conference report
    Restricted access - publisher's policy
    In this work, we give a description of the TALP-UPC systems submitted for the WMT19 Biomedical Translation Task. Our proposed strategy is NMT model-independent and relies only on one ingredient, a biomedical terminology ...
  • BERT masked language modeling for co-reference resolution 

    Alfaro, Felipe; Ruiz Costa-Jussà, Marta; Rodríguez Fonollosa, José Adrián (2019)
    Conference report
    Open Access
    This paper explains the TALP-UPC participation for the Gendered Pronoun Resolution shared-task of the 1st ACL Workshop on Gender Bias for Natural Language Processing. We have implemented two models for mask language modeling ...
  • Wav2Pix: speech-conditioned face generation using generative adversarial networks 

    Cardoso Duarte, Amanda; Roldan, Francisco; Tubau, Miquel; Escur, Janna; Pascual de la Puente, Santiago; Salvador Aguilera, Amaia; Mohedano, Eva; McGuinness, Kevin; Torres Viñals, Jordi; Giró Nieto, Xavier (Institute of Electrical and Electronics Engineers (IEEE), 2019)
    Conference lecture
    Restricted access - publisher's policy
    Speech is a rich biometric signal that contains information about the identity, gender and emotional state of the speaker. In this work, we explore its potential to generate face images of a speaker by conditioning a ...
  • Corpus for cyberbullying prevention 

    Moreno Bilbao, M. Asunción; Bonafonte Cávez, Antonio; Jauk, Igor; Tarrés, Laia; Pereira, Victor (International Speech Communication Association (ISCA), 2018)
    Conference report
    Open Access
    Cyberbullying is the use of digital media to harass a person or group of people, through personal attacks, disclosure of confidential or false information, among other means. That is to say, it ...
