    • A latent variable ranking model for content-based retrieval 

      Quattoni, Ariadna Julieta; Carreras Pérez, Xavier; Torralba, Antonio (Springer, 2012)
      Conference report
      Restricted access - publisher's policy
      Since their introduction, ranking SVM models have become a powerful tool for training content-based retrieval systems. All we need for training a model are retrieval examples in the form of triplet constraints, i.e. examples ...
    • A Proposal for wide-coverage Spanish named entity recognition 

      Arévalo, M.; Carreras Pérez, Xavier; Màrquez Villodre, Lluís; Martí Antonin, Maria Antònia; Padró, Lluís; Simon, Maria José (2002-04)
      Research report
      Open Access
      This paper presents a proposal for wide-coverage Named Entity Recognition for Spanish. First, a linguistic description of the typology of Named Entities is proposed. Following this definition an architecture of sequential ...
    • A shortest-path method for arc-factored semantic role labeling 

      Lluis Martorell, Xavier; Carreras Pérez, Xavier; Màrquez Villodre, Lluís (2014)
      Conference lecture
      Restricted access - publisher's policy
      We introduce a Semantic Role Labeling (SRL) parser that finds semantic roles for a predicate together with the syntactic paths linking predicates and arguments. Our main contribution is to formulate SRL in terms of ...
    • An empirical study of semi-supervised structured conditional models for dependency parsing 

      Suzuki, Jun; Isozaki, Hideki; Carreras Pérez, Xavier; Collins, Michael (2009)
      Conference report
      Open Access
      This paper describes an empirical study of high-performance dependency parsers based on a semi-supervised learning approach. We describe an extension of semi-supervised structured conditional models (SS-SCMs) to the dependency ...
    • Boosting trees for anti-spam email filtering (Extended version) 

      Carreras Pérez, Xavier; Màrquez Villodre, Lluís (2001-10)
      Research report
      Open Access
      In this work, a set of comparative experiments for the problem of automatically filtering unwanted electronic mail messages are performed on two public corpora: PU1 and LingSpam. Several variants of the AdaBoost algorithm ...
    • Exploiting diversity of margin-based classifiers 

      Romero Merino, Enrique; Carreras Pérez, Xavier; Màrquez Villodre, Lluís (2003-12)
      Research report
      Open Access
      An experimental comparison among Support Vector Machines, AdaBoost and a recently proposed model for maximizing the margin with Feed-forward Neural Networks has been made on a real-world classification problem, namely ...
    • Exponentiated gradient algorithms for conditional random fields and max-margin Markov networks 

      Collins, Michael; Globerson, Amir; Koo, Terry; Carreras Pérez, Xavier; Bartlett, Peter (2008-08)
      Article
      Open Access
      Log-linear and maximum-margin models are two commonly-used methods in supervised machine learning, and are frequently used in structured prediction problems. Efficient learning of parameters in these models is therefore ...
    • Joint arc-factored parsing of syntactic and semantic dependencies 

      Lluis Martorell, Xavier; Carreras Pérez, Xavier; Màrquez Villodre, Lluís (2013-05)
      Article
      Restricted access - publisher's policy
      In this paper we introduce a joint arc-factored model for syntactic and semantic dependency parsing. The semantic role labeler predicts the full syntactic paths that connect predicates with their arguments. This process ...
    • Learning task-specific bilexical embeddings 

      Madhyastha, Pranava S.; Carreras Pérez, Xavier; Quattoni, Ariadna Julieta (2014)
      Conference report
      Open Access
      We present a method that learns bilexical operators over distributional representations of words and leverages supervised data for a linguistic relation. The learning algorithm exploits low-rank bilinear forms and induces ...
    • Margin maximization with feed-forward neural networks: a comparative study with support vector machines and AdaBoost 

      Romero Merino, Enrique; Màrquez Villodre, Lluís; Carreras Pérez, Xavier (2003-06)
      Research report
      Open Access
      Feed-forward Neural Networks (FNN) and Support Vector Machines (SVM) are two machine learning frameworks developed from very different starting points of view. In this work a new learning model for FNN is proposed such ...
    • Non-projective parsing for statistical machine translation 

      Carreras Pérez, Xavier; Collins, Michael (2009)
      Conference report
      Open Access
      We describe a novel approach for syntax-based statistical MT, which builds on a variant of tree adjoining grammar (TAG). Inspired by work in discriminative dependency parsing, the key idea in our approach is to allow highly ...
    • Projective dependency parsing with perceptron 

      Carreras Pérez, Xavier; Surdeanu, Mihai; Màrquez Villodre, Lluís (2010)
      Conference report
      Open Access
      We describe an online learning dependency parser for the CoNLL-X Shared Task, based on the bottom-up projective algorithm of Eisner (2000). We experiment with a large feature set that models: the tokens involved in ...
    • Simple semi-supervised dependency parsing 

      Koo, Terry; Carreras Pérez, Xavier; Collins, Michael (2008)
      Conference report
      Open Access
      We present a simple and effective semi-supervised method for training dependency parsers. We focus on the problem of lexical representation, introducing features that incorporate word clusters derived from a large unannotated ...
    • Spectral learning in non-deterministic dependency parsing 

      Luque, Franco M.; Quattoni, Ariadna Julieta; Balle Pigem, Borja de; Carreras Pérez, Xavier (2012)
      Conference report
      Restricted access - publisher's policy
      In this paper we study spectral learning methods for non-deterministic split head-automata grammars, a powerful hidden-state formalism for dependency parsing. We present a learning algorithm that, like other spectral ...
    • Spectral learning of weighted automata: a forward-backward perspective 

      Balle Pigem, Borja de; Carreras Pérez, Xavier; Luque, Franco M.; Quattoni, Ariadna Julieta (2013-10-07)
      Article
      Restricted access - publisher's policy
      In recent years we have seen the development of efficient provably correct algorithms for learning Weighted Finite Automata (WFA). Most of these algorithms avoid the known hardness results by defining parameters beyond the ...
    • Spectral regularization for max-margin sequence tagging 

      Quattoni, Ariadna Julieta; Balle Pigem, Borja de; Carreras Pérez, Xavier; Globerson, Amir (2014)
      Conference report
      Open Access
      We frame max-margin learning of latent variable structured prediction models as a convex optimization problem, making use of scoring functions computed by input-output observable operator models. This learning problem can ...
    • Structured prediction models via the matrix-tree theorem 

      Koo, Terry; Globerson, Amir; Carreras Pérez, Xavier; Collins, Michael (2007)
      Conference report
      Open Access
      This paper provides an algorithmic framework for learning statistical models involving directed spanning trees, or equivalently non-projective dependency structures. We show how partition functions and marginals for directed ...
    • TAG, dynamic programming, and the perceptron for efficient, feature-rich parsing 

      Carreras Pérez, Xavier; Collins, Michael; Koo, Terry (Coling 2008 Organizing Committee, 2008)
      Conference report
      Open Access
      We describe a parsing approach that makes use of the perceptron algorithm, in conjunction with dynamic programming methods, to recover full constituent-based parse trees. The formalism allows a rich set of parse-tree ...
    • Translate first reorder later: leveraging monotonicity in semantic parsing 

      Cazzaro, Francesco; Locatelli, Davide; Quattoni, Ariadna Julieta; Carreras Pérez, Xavier (Association for Computational Linguistics, 2023)
      Conference report
      Open Access
      Prior work in semantic parsing has shown that conventional seq2seq models fail at compositional generalization tasks. This limitation led to a resurgence of methods that model alignments between sentences and their ...
    • Unsupervised spectral learning of finite-state transducers 

      Bailly, Raphaël; Carreras Pérez, Xavier; Quattoni, Ariadna Julieta (2012)
      Conference report
      Open Access
      Finite-State Transducers (FST) are a standard tool for modeling paired input-output sequences and are used in numerous applications, ranging from computational biology to natural language processing. Recently Balle et al. ...