Recent Submissions

  • Combining subword representations into word-level representations in the transformer architecture 

    Casas Manzanares, Noé; Ruiz Costa-Jussà, Marta; Rodríguez Fonollosa, José Adrián (Association for Computational Linguistics, 2020)
    Conference lecture
    Open Access
    In Neural Machine Translation, using word-level tokens leads to degradation in translation quality. The dominant approaches use subword-level tokens, but this increases the length of the sequences and makes it difficult ...
  • A Survey on multimodal data stream mining for e-learner’s emotion recognition 

    Nandi, Arijit; Xhafa Xhafa, Fatos; Subirats, Laia; Fort, Santi (Institute of Electrical and Electronics Engineers (IEEE), 2020)
    Conference report
    Restricted access - publisher's policy
    Emotions play a crucial role in learning. To improve and optimize electronic learning (e-Learning) outcomes, many researchers have investigated the role of emotions. Also, researchers have come up with many approaches to ...
  • Augmenting the power of (partial) MaxSat resolution with extension 

    Larrosa Bondia, Francisco Javier; Rollón Rico, Emma (AAAI Press, 2020)
    Conference lecture
    Restricted access - publisher's policy
    The refutation power of SAT and MaxSAT resolution is challenged by problems like the soft and hard Pigeon Hole Problem PHP for which short refutations do not exist. In this paper we augment the MaxSAT resolution proof ...
  • Textual visual semantic dataset for text spotting 

    Sabir, Ahmed; Moreno-Noguer, Francesc; Padró, Lluís (Institute of Electrical and Electronics Engineers (IEEE), 2020)
    Conference report
    Open Access
    Text Spotting in the wild consists of detecting and recognizing text appearing in images (e.g. signboards, traffic signals or brands in clothing or objects). This is a challenging problem due to the complexity of the context ...
  • Automatic Spanish translation of SQuAD dataset for multi-lingual question answering 

    Carrino, Casimiro Pio; Ruiz Costa-Jussà, Marta; Rodríguez Fonollosa, José Adrián (European Language Resources Association (ELRA), 2020)
    Conference lecture
    Open Access
    Recently, multilingual question answering became a crucial research topic, and it is receiving increased interest in the NLP community. However, the unavailability of large-scale datasets makes it challenging to train ...
  • Synthetic dataset generation with itemset-based generative models 

    Lezcano Ríos, Christian Gerardo; Arias Vicente, Marta (Institute of Electrical and Electronics Engineers (IEEE), 2019)
    Conference report
    Open Access
    This paper proposes three different data generators, tailored to transactional datasets, based on existing itemset-based generative models. All these generators are intuitive and easy to implement and show satisfactory ...
  • Characterizing transactional databases for frequent itemset mining 

    Lezcano Ríos, Christian Gerardo; Arias Vicente, Marta (CEUR-WS.org, 2019)
    Conference report
    Open Access
    This paper presents a study of the characteristics of transactional databases used in frequent itemset mining. Such characterizations have typically been used to benchmark and understand the data mining algorithms working ...
  • Building graph representations of deep vector embeddings 

    Garcia Gasulla, Dario; Vilalta Arias, Armand; Parés Pont, Ferran; Moreno Vázquez, Jonatan; Ayguadé Parra, Eduard; Labarta Mancho, Jesús José; Cortés García, Claudio Ulises; Suzumura, Toyotaro (Association for Computational Linguistics, 2017)
    Conference lecture
    Open Access
    Patterns stored within pre-trained deep neural networks compose large and powerful descriptive languages that can be used for many different purposes. Typically, deep network representations are implemented within vector ...
  • Full-network embedding in a multimodal embedding pipeline 

    Vilalta Arias, Armand; Garcia Gasulla, Dario; Parés Pont, Ferran; Moreno Vázquez, Jonatan; Ayguadé Parra, Eduard; Labarta Mancho, Jesús José; Cortés García, Claudio Ulises; Suzumura, Toyotaro (Association for Computational Linguistics, 2017)
    Conference lecture
    Open Access
    The current state-of-the-art for image annotation and image retrieval tasks is obtained through deep neural networks, which combine an image representation and a text representation into a shared embedding space. In this ...
  • Evaluating the underlying gender bias in contextualized word embeddings 

    Basta, Christine Raouf Saad; Ruiz Costa-Jussà, Marta; Casas Manzanares, Noé (Association for Computational Linguistics, 2019)
    Conference report
    Open Access
    Gender bias strongly affects natural language processing applications. Word embeddings have clearly been proven both to keep and amplify gender biases that are present in current data sources. Recently, contextualized ...
  • Improving accuracy and speeding up document image classification through parallel systems 

    Ferrando Monsonís, Javier; Domínguez, Juan Luis; Torres Viñals, Jordi; García Fuentes, Raul; García Doménech, David; Garrido Miñambres, Daniel; Cortada, Jordi; Valero Cortés, Mateo (Springer, 2020)
    Conference report
    Open Access
    This paper presents a study showing the benefits of the EfficientNet models compared with heavier Convolutional Neural Networks (CNNs) in the Document Classification task, an essential problem in the digitalization process ...
  • A Parser-based tool to assist instructors in grading computer graphics assignments 

    Andújar Gran, Carlos Antonio; Vijulie, Cristina Raluca; Vinacua Pla, Álvaro (European Association for Computer Graphics (Eurographics), 2019)
    Conference report
    Open Access
    Although online e-learning environments are increasingly used in university courses, manual assessment still dominates the way students are graded. Interactive judges providing a pass/fail verdict based on test sets are ...
