    • A differentiable BLEU loss. Analysis and first results 

      Casas Manzanares, Noé; Rodríguez Fonollosa, José Adrián; Ruiz Costa-Jussà, Marta (2018)
      Conference report
      Open Access
      In natural language generation tasks, like neural machine translation and image captioning, there is usually a mismatch between the optimized loss and the de facto evaluation criterion, namely token-level maximum likelihood ...
    • Chinese-Catalan: A neural machine translation approach based on pivoting and attention mechanisms 

      Ruiz Costa-Jussà, Marta; Casas Manzanares, Noé; Escolano Peinado, Carlos; Rodríguez Fonollosa, José Adrián (2019-01-01)
      Article
      Open Access
      This article innovatively addresses machine translation from Chinese to Catalan using neural pivot strategies trained without any direct parallel data. The Catalan language is very similar to Spanish from a linguistic point ...
    • Combining subword representations into word-level representations in the transformer architecture 

      Casas Manzanares, Noé; Ruiz Costa-Jussà, Marta; Rodríguez Fonollosa, José Adrián (Association for Computational Linguistics, 2020)
      Conference lecture
      Open Access
      In Neural Machine Translation, using word-level tokens leads to degradation in translation quality. The dominant approaches use subword-level tokens, but this increases the length of the sequences and makes it difficult ...
    • Evaluating the underlying gender bias in contextualized word embeddings 

      Basta, Christine Raouf Saad; Ruiz Costa-Jussà, Marta; Casas Manzanares, Noé (Association for Computational Linguistics, 2019)
      Conference report
      Open Access
      Gender bias is highly impacting natural language processing applications. Word embeddings have clearly been proven both to keep and amplify gender biases that are present in current data sources. Recently, contextualized ...
    • Extensive study on the underlying gender bias in contextualized word embeddings 

      Basta, Christine Raouf Saad; Ruiz Costa-Jussà, Marta; Casas Manzanares, Noé (2021-04)
      Article
      Restricted access - publisher's policy
      Gender bias is affecting many natural language processing applications. While we are still far from proposing debiasing methods that will solve the problem, we are making progress analyzing the impact of this bias in current ...
    • Injection of linguistic knowledge into neural text generation models 

      Casas Manzanares, Noé (Universitat Politècnica de Catalunya, 2020-12-14)
      Doctoral thesis
      Open Access
      Language is an organic construct. It emanates from the need for communication and changes through time, influenced by multiple factors. The resulting language structures are a mix of regular syntactic and morphological ...
    • Linguistic knowledge-based vocabularies for Neural Machine Translation 

      Casas Manzanares, Noé; Ruiz Costa-Jussà, Marta; Rodríguez Fonollosa, José Adrián; Alonso, Juan; Fanlo, Ramon (Cambridge University Press, 2020)
      Article
      Open Access
      Neural Networks applied to Machine Translation need a finite vocabulary to express textual information as a sequence of discrete tokens. The currently dominant subword vocabularies exploit statistically-discovered common ...
    • Syntax-driven iterative expansion language models for controllable text generation 

      Casas Manzanares, Noé; Rodríguez Fonollosa, José Adrián; Ruiz Costa-Jussà, Marta (Association for Computational Linguistics, 2020)
      Conference lecture
      Open Access
      The dominant language modeling paradigm handles text as a sequence of discrete tokens. While that approach can capture the latent structure of the text, it is inherently constrained to sequential dynamics for text generation. ...
    • The TALP-UPC machine translation systems for WMT19 news translation task: pivoting techniques for low resource MT 

      Casas Manzanares, Noé; Rodríguez Fonollosa, José Adrián; Escolano Peinado, Carlos; Basta, Christine Raouf Saad; Ruiz Costa-Jussà, Marta (Association for Computational Linguistics, 2019)
      Conference report
      Restricted access - publisher's policy
      In this article, we describe the TALP-UPC research group participation in the WMT19 news translation shared task for Kazakh-English. Given the low amount of parallel training data, we resort to using Russian as pivot ...