    • A factory of comparable corpora from Wikipedia 

      Barrón-Cedeño, Alberto; España Bonet, Cristina; Boldoba Trapote, Josu; Márquez Villodre, Luís (Association for Computational Linguistics, 2015)
      Conference report
      Open Access
      Multiple approaches to grab comparable data from the Web have been developed to date. Nevertheless, coming up with a high-quality comparable corpus on a specific topic is not straightforward. We present a model ...
    • A shortest-path method for arc-factored semantic role labeling 

      Lluis Martorell, Xavier; Carreras Pérez, Xavier; Márquez Villodre, Luís (2014)
      Conference lecture
      Restricted access - publisher's policy
      We introduce a Semantic Role Labeling (SRL) parser that finds semantic roles for a predicate together with the syntactic paths linking predicates and arguments. Our main contribution is to formulate SRL in terms of ...
    • Document-level machine translation with word vector models 

      Martínez Garcia, Eva; España Bonet, Cristina; Márquez Villodre, Luís (2015)
      Conference report
      Open Access
      In this paper we apply distributional semantic information to document-level machine translation. We train monolingual and bilingual word vector models on large corpora and we evaluate them first in a cross-lingual lexical ...
    • Experiments on document level machine translation 

      Martínez Garcia, Eva; España Bonet, Cristina; Márquez Villodre, Luís (2014-03-03)
      Research report
      Open Access
      Most current SMT systems work at the sentence level. They translate a text assuming that sentences are independent, but when one looks at a well-formed document, it is clear that there exist many inter-sentence relations. ...
    • The UPC TweetMT participation : translating formal tweets using context information 

      Martínez Garcia, Eva; España Bonet, Cristina; Márquez Villodre, Luís (2015)
      Conference report
      Open Access
      In this paper, we describe the UPC systems that participated in the TweetMT shared task. We developed two main systems that were applied to the Spanish-Catalan language pair: a state-of-the-art phrase-based ...
    • Word's vector representations meet machine translation 

      Martínez Garcia, Eva; España Bonet, Cristina; Tiedemann, Jörg; Márquez Villodre, Luís (Association for Computational Linguistics, 2014)
      Conference lecture
      Restricted access - publisher's policy
      Distributed vector representations of words are useful in various NLP tasks. We briefly review the CBOW approach and propose a bilingual application of this architecture with the aim to improve consistency and coherence ...