Now showing items 102-121 of 135

    • Search engine for multilingual audiovisual contents 

      Pérez, José David; Bonafonte Cávez, Antonio; Ruiz Costa-Jussà, Marta; Cardenal, Antonio; Rodríguez Fonollosa, José Adrián; Moreno Bilbao, M. Asunción; Navas, Eva; Rodríguez Banga, Eduardo (2012)
      Conference paper
      Open access
      This paper describes the BUCEADOR search engine, a web server that allows retrieving multimedia documents (text, audio, video) in different languages. All the documents are translated into the user language and are ...
    • Segmentation strategies to face morphology challenges in Brazilian-Portuguese/English statistical machine translation and its integration in cross-language information retrieval 

      Ruiz Costa-Jussà, Marta (2015-06-01)
      Article
      Open access
      The use of morphology is particularly interesting in the context of statistical machine translation in order to reduce data sparseness and compensate any lack of training corpus. In this work, we propose several approaches ...
    • Selection of correction candidates for the normalization of Spanish user generated content 

      Melero, Maite; Ruiz Costa-Jussà, Marta; Lambert, Patrik; Quixal, Martí (2016-01-01)
      Article
      Open access
      We present research aiming to build tools for the normalization of User-Generated Content (UGC). We argue that processing this type of text requires the revisiting of the initial steps of Natural Language Processing, since ...
    • Semantic and syntactic information for neural machine translation: Injecting features to the transformer 

      Armengol Estapé, Jordi; Ruiz Costa-Jussà, Marta (2021-05-18)
      Article
      Open access
      Introducing factors such as linguistic features has long been proposed in machine translation to improve the quality of translations. More recently, factored machine translation has proven to still be useful in the case ...
    • SHAS: approaching optimal segmentation for end-to-end speech translation 

      Tsiamas, Ioannis; Gallego Olsina, Gerard Ion; Fonollosa, José A. R.; Ruiz Costa-Jussà, Marta (2022-02)
      Research report
      Open access
      Speech translation models are unable to directly process long audios, like TED talks, which have to be split into shorter segments. Speech translation datasets provide manual segmentations of the audios, which are not ...
    • SHAS: approaching optimal segmentation for end-to-end speech translation 

      Tsiamas, Ioannis; Gallego Olsina, Gerard Ion; Fonollosa, José A. R.; Ruiz Costa-Jussà, Marta (International Speech Communication Association (ISCA), 2022)
      Conference proceedings text
      Open access
      Speech translation models are unable to directly process long audios, like TED talks, which have to be split into shorter segments. Speech translation datasets provide manual segmentations of the audios, which are not ...
    • State-of-the-Art word reordering approaches in statistical machine translation: a survey 

      Ruiz Costa-Jussà, Marta; Rodríguez Fonollosa, José Adrián (2009-11-01)
      Article
      Open access
      This paper surveys several state-of-the-art reordering techniques employed in Statistical Machine Translation systems. Reordering is understood as the word-order redistribution of the translated words. In original SMT ...
    • Statistical machine translation enhancements through linguistic levels: a survey 

      Ruiz Costa-Jussà, Marta; Farrus, Mireia (2014-01)
      Article
      Restricted access due to publisher policy
      Machine translation can be considered a highly interdisciplinary and multidisciplinary field because it is approached from the point of view of human translators, engineers, computer scientists, mathematicians, and linguists. ...
    • Study and correlation analysis of linguistic, perceptual and automatic machine translation evaluations 

      Farrus, Mireia; Ruiz Costa-Jussà, Marta; Popovic, Maya; Henriquez, Carlos A (2012-01-01)
      Article
      Open access
      Evaluation of machine translation output is an important task. Various human evaluation techniques as well as automatic metrics have been proposed and investigated in the last decade. However, very few evaluation methods ...
    • Syntax-driven iterative expansion language models for controllable text generation 

      Casas Manzanares, Noé; Rodríguez Fonollosa, José Adrián; Ruiz Costa-Jussà, Marta (Association for Computational Linguistics, 2020)
      Conference paper
      Open access
      The dominant language modeling paradigm handles text as a sequence of discrete tokens. While that approach can capture the latent structure of the text, it is inherently constrained to sequential dynamics for text generation. ...
    • Terminology-aware segmentation and domain feature for the WMT19 biomedical translation task 

      Carrino, Casimiro Pio; Rafieian, Bardia; Ruiz Costa-Jussà, Marta; Rodríguez Fonollosa, José Adrián (Association for Computational Linguistics, 2019)
      Conference proceedings text
      Restricted access due to publisher policy
      In this work, we give a description of the TALP-UPC systems submitted for the WMT19 Biomedical Translation Task. Our proposed strategy is NMT model-independent and relies only on one ingredient, a biomedical terminology ...
    • The IPN-CIC team system submission for the WMT 2020 similar language task 

      Menéndez-Salazar, Luis A.; Sidorov, Grigori; Ruiz Costa-Jussà, Marta (Association for Computational Linguistics, 2020)
      Conference paper
      Open access
      This paper describes the participation of the NLP research team of the IPN Computer Research center in the WMT 2020 Similar Language Translation Task. We have submitted systems for the Spanish-Portuguese language pair (in ...
    • The TALP & I2R SMT Systems for IWSLT 2008 

      Li, H.; Aw, A.; Zhang, Ming; Khalilov, Maxim; Ruiz Costa-Jussà, Marta; Henríquez Quintana, Carlos Alberto; Rodríguez Fonollosa, José Adrián; Hernández, A.; Mariño Acebal, José Bernardo; Banchs Martínez, Rafael Enrique; Chen, B. (NICT/ATR, 2008-10-31)
      Conference paper
      Open access
      This paper gives a description of the statistical machine translation (SMT) systems developed at the TALP Research Center of the UPC (Universitat Politècnica de Catalunya) for our participation in the IWSLT’08 evaluation ...
    • The TALP on-line Spanish-Catalan machine-translation system 

      Poch, M; Farrús Cabeceran, Mireia; Ruiz Costa-Jussà, Marta; Mariño Acebal, José Bernardo; Hernández, Adolfo; Henríquez Quintana, Carlos Alberto; Rodríguez Fonollosa, José Adrián (2009-09)
      Conference paper
      Open access
      In this paper the statistical machine translator (SMT) between Catalan and Spanish developed at the TALP research center (UPC) and its web demonstration are described.
    • The TALP-UPC machine translation systems for WMT18 news translation shared task 

      Casas, Noe; Escolano Peinado, Carlos; Ruiz Costa-Jussà, Marta; Rodríguez Fonollosa, José Adrián (Association for Computational Linguistics, 2018)
      Conference paper
      Restricted access due to publisher policy
      In this article we describe the TALP-UPC research group participation in the WMT18 news shared translation task for Finnish-English and Estonian-English within the multi-lingual subtrack. All of our primary submissions ...
    • The TALP-UPC machine translation systems for WMT19 news translation task: pivoting techniques for low resource MT 

      Casas Manzanares, Noé; Rodríguez Fonollosa, José Adrián; Escolano Peinado, Carlos; Basta, Christine Raouf Saad; Ruiz Costa-Jussà, Marta (Association for Computational Linguistics, 2019)
      Conference proceedings text
      Restricted access due to publisher policy
      In this article, we describe the TALP-UPC research group participation in the WMT19 news translation shared task for Kazakh-English. Given the low amount of parallel training data, we resort to using Russian as pivot ...
    • The TALP-UPC neural machine translation system for german/finnish-english using the inverse direction model in rescoring 

      Escolano, Carlos; Ruiz Costa-Jussà, Marta; Rodríguez Fonollosa, José Adrián (2017)
      Conference paper
      Restricted access due to publisher policy
      In this paper, we describe the TALP-UPC participation in the News Task for German-English and Finnish-English. Our primary submission implements a fully character to character neural machine translation architecture with ...
    • The TALP-UPC participation in WMT21 news translation task: an mBART-based NMT approach 

      Escolano Peinado, Carlos; Tsiamas, Ioannis; Basta, Christine Raouf Saad; Ferrando Monsonís, Javier; Ruiz Costa-Jussà, Marta; Rodríguez Fonollosa, José Adrián (Association for Computational Linguistics, 2021)
      Conference proceedings text
      Open access
      This paper describes the submission to the WMT 2021 news translation shared task by the UPC Machine Translation group. The goal of the task is to translate German to French (De-Fr) and French to German (Fr-De). Our submission ...
    • The TALP-UPC phrase-based translation system for EACL-WMT 2009 

      Rodríguez Fonollosa, José Adrián; Khalilov, Maxim; Ruiz Costa-Jussà, Marta; Henríquez Quintana, Carlos Alberto; Hernández, Adolfo; Banchs Martínez, Rafael Enrique (2009-03-30)
      Conference paper
      Open access
      This study presents the TALP-UPC submission to the EACL Fourth Workshop on Statistical Machine Translation 2009 evaluation campaign. It outlines the architecture and configuration of the 2009 phrase-based statistical ...
    • The TALP-UPC phrase-based translation systems for WMT13: system combination with morphology generation, domain adaptation and corpus filtering 

      Formiga Fanals, Lluís; Ruiz Costa-Jussà, Marta; Mariño Acebal, José Bernardo; Rodríguez Fonollosa, José Adrián; Barrón-Cedeño, Alberto; Màrquez Villodre, Lluís (2013)
      Conference proceedings text
      Restricted access due to publisher policy
      This paper describes the TALP participation in the WMT13 evaluation campaign. Our participation is based on the combination of several statistical machine translation systems: based on standard phrase-based Moses systems. ...