Research reports
http://hdl.handle.net/2117/3488
2016-05-28T04:14:14Z
http://hdl.handle.net/2117/87244
An optimal anytime estimation algorithm
Gavaldà Mestre, Ricard
In many applications a key step is estimating some unknown quantity $\mu$ from a sequence of trials, each having expected value $\mu$. Optimal algorithms are known when the task is to estimate $\mu$ within a multiplicative factor of $\epsilon$, for an $\epsilon$ given in advance. In this paper we consider anytime approximation algorithms, i.e., algorithms that must give a reliable approximation after each trial, and whose approximations have to be increasingly accurate as the number of trials grows. We give an anytime algorithm for this task when the only a priori known property of $\mu$ is its range, and show that it is asymptotically optimal in some cases, in the sense that no correct anytime algorithm can give asymptotically better approximations. The key ingredient is a new large deviation bound for the supremum of the deviations in an infinite sequence of trials, which can be seen as a non-limit analog of the classical Law of the Iterated Logarithm.
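To make the notion of an anytime estimate concrete, the sketch below reports a running mean together with a confidence radius that holds simultaneously for every number of trials. It is not the algorithm of the report: it swaps in the textbook combination of Hoeffding's inequality with a union bound over all steps, it assumes trials take values in [0, 1], and it bounds additive rather than multiplicative error. The names `anytime_radius` and `anytime_estimates`, the failure probability `delta`, and the value of `mu` in the demo are assumptions made for illustration.

```python
import math
import random


def anytime_radius(t, delta):
    """Hoeffding radius for trials in [0, 1], valid for all t >= 1 at once.

    Step t gets failure budget delta / (t * (t + 1)); these budgets sum to
    delta over t = 1, 2, ..., so the union bound covers every step.
    """
    return math.sqrt(math.log(2.0 * t * (t + 1) / delta) / (2.0 * t))


def anytime_estimates(trials, delta=0.05):
    """Yield (t, running_mean, radius) after each trial."""
    total = 0.0
    for t, x in enumerate(trials, start=1):
        total += x
        yield t, total / t, anytime_radius(t, delta)


if __name__ == "__main__":
    rng = random.Random(0)
    mu = 0.3  # hypothetical true mean, used only to generate demo data
    trials = (1.0 if rng.random() < mu else 0.0 for _ in range(10000))
    for t, mean, radius in anytime_estimates(trials):
        if t in (10, 100, 1000, 10000):
            print(t, round(mean, 4), round(radius, 4))
```

The radius here shrinks like the square root of (ln t)/t. The classical Law of the Iterated Logarithm has a log log t term in that position, which is the kind of improvement a bound of the type described in the abstract would be expected to give over a plain union bound.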
2016-05-23T14:19:52Z
http://hdl.handle.net/2117/85850
Tractable clones of polynomials over semigroups
Dalmau, Víctor; Gavaldà Mestre, Ricard; Tesson, Pascal; Thérien, Denis
We contribute to the algebraic study of the complexity of constraint satisfaction problems. We give a new sufficient condition on a set of relations R over a domain S for the tractability of CSP(R): if S is a block-group (a particular class of semigroups) of exponent w and R is a set of relations over S preserved by the operation defined by the polynomial f(x,y,z) = xy^(w-1)z over S, then CSP(R) is tractable. This theorem strictly improves on results of Feder and Vardi and Bulatov et al., and we demonstrate it by reproving an upper bound of Klima et al. We also investigate systematically the tractability of CSP(R) when R is a set of relations closed under operations that are all expressible as polynomials over a finite semigroup S. In particular, if S is a nilpotent group, we show that CSP(R) is tractable iff one of these polynomials defines a Mal'tsev operation, and conjecture that this holds for all groups.
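The polynomial f(x,y,z) = xy^(w-1)z can be checked by hand in the group case. When S is a finite group and w is taken to be the smallest positive integer with g^w equal to the identity for every g, y^(w-1) is the inverse of y, so f(x,y,y) = x and f(y,y,x) = x, which are the Mal'tsev identities. The sketch below verifies this by brute force on the symmetric group S3. The choice of S3, the reading of "exponent" just given, and all function names are assumptions for illustration; the block-group case from the abstract is not covered.

```python
from itertools import permutations, product


def compose(p, q):
    """Return the permutation 'apply q first, then p'."""
    return tuple(p[q[i]] for i in range(len(q)))


def power(p, k):
    """Return p composed with itself k times (identity for k == 0)."""
    result = tuple(range(len(p)))
    for _ in range(k):
        result = compose(result, p)
    return result


def exponent(group):
    """Smallest w >= 1 such that g^w is the identity for every g."""
    identity = tuple(range(len(group[0])))
    w = 1
    while any(power(g, w) != identity for g in group):
        w += 1
    return w


def f(x, y, z, w):
    """The polynomial x * y^(w-1) * z."""
    return compose(compose(x, power(y, w - 1)), z)


def is_maltsev(group, w):
    """Check f(x, y, y) == x and f(y, y, x) == x for all x, y."""
    return all(
        f(x, y, y, w) == x and f(y, y, x, w) == x
        for x, y in product(group, repeat=2)
    )


S3 = list(permutations(range(3)))
W = exponent(S3)

if __name__ == "__main__":
    print("exponent:", W, "Mal'tsev:", is_maltsev(S3, W))
```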
2016-04-19T07:00:57Z
http://hdl.handle.net/2117/20362
Spectral learning of transducers over continuous sequences
Recasens, Adria; Quattoni, Ariadna Julieta
In this paper we present a spectral algorithm for learning weighted finite state transducers (WFSTs) over paired input-output sequences, where the input is continuous and the output discrete. WFSTs are an important tool for modeling paired input-output sequences and have numerous applications in real-world problems. Recently, Balle et al. (2011) proposed a spectral method for learning WFSTs that overcomes some of the well-known limitations of gradient-based or EM optimizations, which can be computationally expensive and suffer from local optima issues. Their algorithm can model distributions where both inputs and outputs are sequences from a discrete alphabet.
However, many real-world problems require modeling paired sequences where the inputs are not discrete but continuous sequences. Modeling continuous sequences with spectral methods has been studied in the context of HMMs (Song et al., 2010), where a spectral algorithm for this case was derived. In this paper we follow that line of work and propose a spectral learning algorithm for modeling paired input-output sequences where the inputs are continuous and the outputs are discrete. Our approach is based on generalizing the class of weighted finite state transducers over discrete input-output sequences to a class where transitions are linear combinations of elementary transitions and the weights of these linear combinations are determined by dynamic features of the continuous input sequence.
At its core, the algorithm is simple and scalable to large data sets. We present experiments on a real task that validate the effectiveness of the proposed approach.
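The model class described above, where each transition is a linear combination of elementary transitions weighted by features of the current input, can be sketched as follows. This only illustrates how such a transducer would score an input-output pair once its parameters are known; the spectral learning step is not shown. The feature map `features`, the operator layout, and every numeric value are hypothetical and not taken from the report.

```python
def features(x):
    """Hypothetical feature map of one continuous input value: a bias and x."""
    return [1.0, x]


def transition(elementary, x):
    """Return sum_k phi_k(x) * elementary[k], a combination of n x n matrices."""
    phi = features(x)
    n = len(elementary[0])
    return [
        [sum(phi[k] * elementary[k][i][j] for k in range(len(phi)))
         for j in range(n)]
        for i in range(n)
    ]


def score(alpha, beta, operators, inputs, outputs):
    """Compute alpha^T A_{y_1}(x_1) ... A_{y_T}(x_T) beta from left to right."""
    state = list(alpha)
    for x, y in zip(inputs, outputs):
        a = transition(operators[y], x)
        state = [
            sum(state[i] * a[i][j] for i in range(len(state)))
            for j in range(len(state))
        ]
    return sum(s * b for s, b in zip(state, beta))


if __name__ == "__main__":
    # Two states, two output symbols, two elementary operators per symbol.
    operators = {
        "a": [[[0.3, 0.1], [0.0, 0.2]], [[0.05, 0.0], [0.0, 0.05]]],
        "b": [[[0.1, 0.2], [0.3, 0.1]], [[0.0, 0.05], [0.05, 0.0]]],
    }
    alpha, beta = [1.0, 0.0], [1.0, 1.0]
    print(score(alpha, beta, operators, [0.5, 1.5, -0.2], ["a", "b", "a"]))
```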
2013-10-11T10:15:39Z
http://hdl.handle.net/2117/14824
Frequent sets, sequences and taxonomies: new efficient algorithmic proposals
Baixeries i Juvillà, Jaume; Casas Garriga, Gemma; Balcázar Navarro, José Luis
We describe efficient algorithmic proposals to approach three fundamental problems in data mining: association rules, episodes in sequences, and generalized association rules over hierarchical taxonomies. The association rule discovery problem aims at identifying frequent itemsets in a database and then forming conditional implication rules among them. For this association task, we introduce a new algorithmic proposal that substantially reduces the number of processed transactions. The resulting algorithm, called Ready-and-Go, is used to discover frequent sets efficiently. Then, for the discovery of patterns in sequences of events in ordered collections of data, we propose to apply the appropriate variant of that algorithm, and additionally we introduce a new framework for the formalization of the concept of interesting episodes. Finally, we adapt our algorithm to the generalization of the frequent sets problem where data comes organized in taxonomic hierarchies, and here we additionally contribute a new heuristic that, under certain natural conditions, improves the performance.
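The two-phase task described above (find frequent itemsets, then form implication rules among them) can be sketched with the classic level-wise Apriori scheme. This is a baseline shown for orientation only, not the Ready-and-Go algorithm, whose details are not given in the abstract; in particular it scans every transaction at every level, which is the cost the report aims to reduce. Function names and the thresholds `min_support` and `min_confidence` are assumptions.

```python
from itertools import combinations


def frequent_itemsets(transactions, min_support):
    """Return {itemset: support count} for every itemset with count >= min_support."""
    transactions = [frozenset(t) for t in transactions]
    items = sorted({item for t in transactions for item in t})
    candidates = [frozenset([item]) for item in items]
    frequent = {}
    size = 1
    while candidates:
        # Count each candidate with a full pass over the transactions.
        counts = {c: sum(1 for t in transactions if c <= t) for c in candidates}
        level = {c: n for c, n in counts.items() if n >= min_support}
        frequent.update(level)
        size += 1
        # Join frequent sets, then prune candidates with an infrequent subset.
        joined = {a | b for a in level for b in level if len(a | b) == size}
        candidates = [
            c for c in joined
            if all(frozenset(s) in level for s in combinations(c, size - 1))
        ]
    return frequent


def rules(frequent, min_confidence):
    """Yield (antecedent, consequent, confidence) for rules within frequent itemsets."""
    for itemset, support in frequent.items():
        for r in range(1, len(itemset)):
            for antecedent in combinations(sorted(itemset), r):
                antecedent = frozenset(antecedent)
                # Every subset of a frequent itemset is frequent, so the lookup succeeds.
                confidence = support / frequent[antecedent]
                if confidence >= min_confidence:
                    yield antecedent, itemset - antecedent, confidence


if __name__ == "__main__":
    data = [{"a", "b", "c"}, {"a", "b"}, {"a", "c"}, {"b"}]
    found = frequent_itemsets(data, min_support=2)
    for antecedent, consequent, confidence in rules(found, min_confidence=0.6):
        print(sorted(antecedent), "->", sorted(consequent), round(confidence, 2))
```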
2012-01-26T10:47:30Z
http://hdl.handle.net/2117/11061
An integer linear programming representation for data-center power-aware management
Berral García, Josep Lluís; Gavaldà Mestre, Ricard; Torres Viñals, Jordi
This work shows how to represent a grid data-center scheduling problem, taking advantage of virtualization and consolidation techniques, as an integer linear programming problem that includes all three of these factors. Although integer linear programming (ILP) is a computationally hard problem, specifying its constraints and optimization function correctly can help find optimal integer solutions in a relatively short time. ILP solutions can therefore help designers and system managers not only to apply them in schedulers but also to create new heuristics and holistic functions that approximate the optimal solutions well and more quickly.
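As a rough illustration of the kind of model involved, consider binary variables that assign each job to exactly one host, capacity constraints per host, and an objective that sums the power of every host left switched on. The sketch below solves a toy instance of that 0-1 program by exhaustive enumeration instead of an ILP solver, which is only feasible for a handful of jobs. The formulation, the function name `best_placement`, and all numbers are assumptions for illustration and are not the model from the report.

```python
from itertools import product


def best_placement(demands, capacities, host_power):
    """Enumerate all job-to-host assignments and return (min power, assignment).

    Constraints: each job runs on exactly one host, and the total demand on a
    host may not exceed its capacity. Objective: total power of hosts that
    run at least one job. Demands are assumed to be positive.
    """
    best_cost, best_assignment = None, None
    for assignment in product(range(len(capacities)), repeat=len(demands)):
        load = [0] * len(capacities)
        for job, host in enumerate(assignment):
            load[host] += demands[job]
        if any(used > cap for used, cap in zip(load, capacities)):
            continue  # capacity constraint violated
        cost = sum(p for p, used in zip(host_power, load) if used > 0)
        if best_cost is None or cost < best_cost:
            best_cost, best_assignment = cost, assignment
    return best_cost, best_assignment


if __name__ == "__main__":
    # Three jobs, three identical hosts: consolidation needs only two hosts.
    print(best_placement([2, 2, 3], [4, 4, 4], [10, 10, 10]))
```

Enumeration grows as (number of hosts) to the power (number of jobs), so a real instance would be handed to an ILP solver; the point here is only the shape of the variables, constraints, and objective.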
2011-01-17T11:21:12Z