    • Byte-based neural machine translation 

      Ruiz Costa-Jussà, Marta; Escolano Peinado, Carlos; Rodríguez Fonollosa, José Adrián (Association for Computational Linguistics, 2017)
      Conference report
      Open Access
      This paper presents experiments comparing character-based and byte-based neural machine translation systems. The main motivation of the byte-based neural machine translation system is to build multilingual neural ...
    • Chinese-Catalan: A neural machine translation approach based on pivoting and attention mechanisms 

      Ruiz Costa-Jussà, Marta; Casas Manzanares, Noé; Escolano Peinado, Carlos; Rodríguez Fonollosa, José Adrián (2019-01-01)
      Article
      Open Access
      This article innovatively addresses machine translation from Chinese to Catalan using neural pivot strategies trained without any direct parallel data. The Catalan language is very similar to Spanish from a linguistic point ...
    • End-to-end speech translation with the transformer 

      Cross Vila, Laura; Escolano Peinado, Carlos; Rodríguez Fonollosa, José Adrián; Ruiz Costa-Jussà, Marta (Antonio Bonafonte, Jordi Luque and Francesc Alías Pujol, 2018)
      Conference lecture
      Restricted access - publisher's policy
      Speech Translation has been traditionally addressed with the concatenation of two tasks: Speech Recognition and Machine Translation. This approach has the main drawback that errors are concatenated. Recently, neural ...
    • From bilingual to multilingual neural machine translation by incremental training 

      Escolano Peinado, Carlos; Ruiz Costa-Jussà, Marta; Rodríguez Fonollosa, José Adrián (Association for Computational Linguistics, 2019)
      Conference lecture
      Open Access
      Multilingual Neural Machine Translation approaches are based on the use of task specific models and the addition of one more language can only be done by retraining the whole system. In this work, we propose a new training ...
    • From bilingual to multilingual neural-based machine translation by incremental training 

      Escolano Peinado, Carlos; Ruiz Costa-Jussà, Marta; Rodríguez Fonollosa, José Adrián (2020-08-02)
      Article
      Open Access
      A common intermediate language representation in neural machine translation can be used to extend bilingual systems by incremental training. We propose a new architecture based on introducing an interlingual loss as an ...
    • Gender bias in multilingual neural machine translation: The architecture matters 

      Ruiz Costa-Jussà, Marta; Escolano Peinado, Carlos; Basta, Christine Raouf Saad; Ferrando Monsonís, Javier; Batlle, Roser; Kharitonova, Ksenia (2020-12-24)
      External research report
      Open Access
      Multilingual Neural Machine Translation architectures mainly differ in the amount of sharing modules and parameters among languages. In this paper, and from an algorithmic perspective, we explore if the chosen architecture, ...
    • Integración de conocimiento morfológico en un sistema de traducción estadístico chino-castellano 

      Escolano Peinado, Carlos (Universitat Politècnica de Catalunya, 2016-06-29)
      Bachelor thesis
      Open Access
      In this project we aim, starting from a simplified Chinese-Spanish translation from which the morphological information has been removed, to create an architecture that can recover that information and generate a ...
    • Interlingua based neural machine translation 

      Escolano Peinado, Carlos (Universitat Politècnica de Catalunya, 2018-06-27)
      Master thesis
      Open Access
      Covenantee:   Universitat de Barcelona / Universitat Rovira i Virgili
      We propose a machine translation architecture based on autoencoders and a shared interlingua representation that produce comparable results to state of the art systems. Also we define evaluation and visualization strategies ...
    • Multilingual machine translation: Closing the gap between shared and language-specific encoder-decoders 

      Escolano Peinado, Carlos; Ruiz Costa-Jussà, Marta; Rodríguez Fonollosa, José Adrián; Artetxe Zurutuza, Mikel (Association for Computational Linguistics, 2021)
      Conference lecture
      Open Access
      State-of-the-art multilingual machine translation relies on a universal encoder-decoder, which requires retraining the entire system to add new languages. In this paper, we propose an alternative approach that is based on ...
    • Multilingual, multi-scale and multi-layer visualization of sequence-based intermediate representations 

      Escolano Peinado, Carlos; Ruiz Costa-Jussà, Marta; Lacroux, Elora; Vázquez Alcocer, Pere Pau (Association for Computational Linguistics, 2019)
      Conference report
      Restricted access - publisher's policy
      The main alternatives nowadays to deal with sequences are Recurrent Neural Networks (RNN), Convolutional Neural Networks (CNN) architectures and the Transformer. In this context, RNN’s, CNN’s and Transformer have most commonly ...
    • The TALP-UPC machine translation systems for WMT18 news translation shared task 

      Casas, Noe; Escolano Peinado, Carlos; Ruiz Costa-Jussà, Marta; Rodríguez Fonollosa, José Adrián (Association for Computational Linguistics, 2018)
      Conference lecture
      Restricted access - publisher's policy
      In this article we describe the TALP-UPC research group participation in the WMT18 news shared translation task for Finnish-English and Estonian-English within the multi-lingual subtrack. All of our primary submissions ...
    • The TALP-UPC machine translation systems for WMT19 news translation task: pivoting techniques for low resource MT 

      Casas Manzanares, Noé; Rodríguez Fonollosa, José Adrián; Escolano Peinado, Carlos; Basta, Christine Raouf Saad; Ruiz Costa-Jussà, Marta (Association for Computational Linguistics, 2019)
      Conference report
      Restricted access - publisher's policy
      In this article, we describe the TALP-UPC research group participation in the WMT19 news translation shared task for Kazakh-English. Given the low amount of parallel training data, we resort to using Russian as pivot ...
    • The TALP-UPC system description for WMT20 news translation task: multilingual adaptation for low resource MT 

      Escolano Peinado, Carlos; Ruiz Costa-Jussà, Marta; Rodríguez Fonollosa, José Adrián (Association for Computational Linguistics, 2020)
      Conference lecture
      Open Access
      In this article, we describe the TALP-UPC participation in the WMT20 news translation shared task for Tamil-English. Given the low amount of parallel training data, we resort to adapt the task to a multilingual system to ...
    • The TALP–UPC Spanish–English WMT biomedical task: bilingual embeddings and char-based neural language model rescoring in a phrase-based system 

      Ruiz Costa-Jussà, Marta; España-i-Bonet, Cristina; Madhyastha, Pranava; Escolano Peinado, Carlos; Rodríguez Fonollosa, José Adrián (2016)
      Conference report
      Open Access
      This paper describes the TALP–UPC system in the Spanish–English WMT 2016 biomedical shared task. Our system is a standard phrase-based system enhanced with vocabulary expansion using bilingual word embeddings and a ...