Recent Submissions

  • An analysis of Twitter corpora and the differences between formal and colloquial tweets 

    González Bermúdez, Meritxell (CEUR-WS.org, 2015)
    Conference report
    Open Access
    This work reviews recent publications addressing the Twitter translation task and highlights the lack of appropriate corpora that represent the colloquial language used on Twitter. It also discusses the most well-know ...
  • AETAS: A system for semanticizing temporal expressions from unstructured contents 

    Ardalan, Zagros; Martín Escofet, Carme; Padró, Lluís (Springer, 2015)
    Conference report
    Restricted access - publisher's policy
    AETAS is an online tool for converting text into RDF linked data with resolution of temporal expressions. AETAS follows a fully service-oriented architecture (SOA) and is accessible via a web service. It implements a novel approach for semantic ...
  • The UPC TweetMT participation : translating formal tweets using context information 

    Martínez Garcia, Eva; España Bonet, Cristina; Márquez Villodre, Luís (2015)
    Conference report
    Open Access
    In this paper, we describe the UPC systems that participated in the TweetMT shared task. We developed two main systems that were applied to the Spanish-Catalan language pair: a state-of-the-art phrase-based ...
  • Overview of TweetMT : a shared task on machine translation of tweets at SEPLN 2015 

    Alegria, Iñaki; Aranberri, Nora; España Bonet, Cristina; Gamallo, Pablo; Gonçalo Oliveira, Hugo; Martínez Garcia, Eva; San Vicente Roncal, Iñaki; Toral, Antonio; Zubiaga, Arkaitz (2015)
    Conference report
    Open Access
    This article presents an overview of the shared task that took place as part of the TweetMT workshop held at SEPLN 2015. The task consisted of translating collections of tweets from and to several ...
  • A factory of comparable corpora from Wikipedia 

    Barrón-Cedeño, Alberto; España Bonet, Cristina; Boldoba Trapote, Josu; Márquez Villodre, Luís (Association for Computational Linguistics, 2015)
    Conference report
    Open Access
    Multiple approaches to gathering comparable data from the Web have been developed to date. Nevertheless, coming up with a high-quality comparable corpus on a specific topic is not straightforward. We present a model ...
  • Document-level machine translation with word vector models 

    Martínez Garcia, Eva; España Bonet, Cristina; Márquez Villodre, Luís (2015)
    Conference report
    Open Access
    In this paper we apply distributional semantic information to document-level machine translation. We train monolingual and bilingual word vector models on large corpora and we evaluate them first in a cross-lingual lexical ...
  • Integració i avaluació de la competència genèrica transversal actitud adequada davant el treball en assignatures de bases de dades 

    Martín Escofet, Carme; Urpí Tubella, Antoni; Burgués Illa, Xavier; Romero Moral, Óscar; Abelló Gamazo, Alberto; Casany Guerrero, María José; Quer Bosor, Maria Carme; Rodríguez González, M. Elena (Congrés Internacional de Docència Universitària i Innovació (CIDUI), 2014)
    Conference report
    Open Access
    The move to the new European Higher Education Area led the Facultat d’Informàtica de Barcelona of the Universitat Politècnica de Catalunya to incorporate transversal generic competences into its curricula. ...
  • Language processing infrastructure in the XLike project 

    Padró, Lluís; Agic, Zeljko; Carreras, Xavier; Fortuna, Blaz; García Cuesta, Esteban; Li, Zhixing; Stajner, Tadej; Tadic, Marko (European Language Resources Association (ELRA), 2014)
    Conference lecture
    Open Access
    This paper presents the linguistic analysis tools and their infrastructure developed within the XLike project. The main goal of the implemented tools is to provide a set of functionalities for supporting some of the main ...
  • XLike project language analysis services 

    Carreras, Xavier; Padró, Lluís; Zhang, Lei; Rettinger, Achim; Li, Zhixing; García Cuesta, Esteban; Agic, Zeljko; Bekavac, Bozo; Fortuna, Blaz; Stajner, Tadej (Association for Computational Linguistics, 2014)
    Conference lecture
    Open Access
    This paper presents the linguistic analysis infrastructure developed within the XLike project. The main goal of the implemented tools is to provide a set of functionalities supporting the XLike main objectives: Enabling ...
  • Word's vector representations meet machine translation 

    Martínez Garcia, Eva; España Bonet, Cristina; Tiedemann, Jörg; Márquez Villodre, Luís (Association for Computational Linguistics, 2014)
    Conference lecture
    Restricted access - publisher's policy
    Distributed vector representations of words are useful in various NLP tasks. We briefly review the CBOW approach and propose a bilingual application of this architecture with the aim of improving consistency and coherence ...
