ALBCOM - Algorismia, Bioinformàtica, Complexitat i Mètodes Formals
http://hdl.handle.net/2117/3092
2017-03-30T14:42:10ZComputing alignments with constraint programming : the acyclic case
http://hdl.handle.net/2117/103063
Computing alignments with constraint programming : the acyclic case
Borrego, Diana; Gómez López, María Teresa; Carmona Vargas, Josep; Martínez Gasca, Rafael
Conformance checking confronts process models with real process executions to detect and measure deviations between modelled and observed behaviour. The core technique for conformance checking is the computation of an alignment. Current approaches for alignment computation rely on a shortest-path technique over the product of the state-space of a model and the observed trace, thus suffering from the well-known state explosion problem. This paper presents a fresh alternative for alignment computation of acyclic process models, that encodes the alignment problem as a Constraint Satisfaction Problem. Since modern solvers for this framework are capable of dealing with large instances, this contribution has a clear potential. Remarkably, our prototype implementation can handle instances that represent a real challenge for current techniques. Main advantages of using Constraint Programming paradigm lie in the possibility to adapt parameters such as the maximum search time, or the maximum misalignment allowed. Moreover, using search and propagation algorithms incorporated in Constraint Programming Solvers permits to find solutions for problems unsolvable with other techniques.
2017-03-30T06:45:05ZBorrego, DianaGómez López, María TeresaCarmona Vargas, JosepMartínez Gasca, RafaelConformance checking confronts process models with real process executions to detect and measure deviations between modelled and observed behaviour. The core technique for conformance checking is the computation of an alignment. Current approaches for alignment computation rely on a shortest-path technique over the product of the state-space of a model and the observed trace, thus suffering from the well-known state explosion problem. This paper presents a fresh alternative for alignment computation of acyclic process models, that encodes the alignment problem as a Constraint Satisfaction Problem. Since modern solvers for this framework are capable of dealing with large instances, this contribution has a clear potential. Remarkably, our prototype implementation can handle instances that represent a real challenge for current techniques. Main advantages of using Constraint Programming paradigm lie in the possibility to adapt parameters such as the maximum search time, or the maximum misalignment allowed. Moreover, using search and propagation algorithms incorporated in Constraint Programming Solvers permits to find solutions for problems unsolvable with other techniques.The HOM problem is EXPTIME-complete
http://hdl.handle.net/2117/102817
The HOM problem is EXPTIME-complete
Creus López, Carles; Gascon Caro, Adrian; Godoy Balil, Guillem; Ramos Garrido, Lander
We define a new class of tree automata with constraints and prove decidability of the emptiness problem for this class in exponential time. As a consequence, we obtain several EXPTIME-completeness results for problems on images of regular tree languages under tree homomorphisms, like set inclusion, regularity (HOM problem), and finiteness of set difference. Our result also has implications in term rewriting, since the set of reducible terms of a term rewrite system can be described as the image of a tree homomorphism. In particular, we prove that inclusion of sets of normal forms of term rewrite systems can be decided in exponential time. Analogous consequences arise in the context of XML typechecking, since types are defined by tree automata and some type transformations are homomorphic.
2017-03-23T09:26:34ZCreus López, CarlesGascon Caro, AdrianGodoy Balil, GuillemRamos Garrido, LanderWe define a new class of tree automata with constraints and prove decidability of the emptiness problem for this class in exponential time. As a consequence, we obtain several EXPTIME-completeness results for problems on images of regular tree languages under tree homomorphisms, like set inclusion, regularity (HOM problem), and finiteness of set difference. Our result also has implications in term rewriting, since the set of reducible terms of a term rewrite system can be described as the image of a tree homomorphism. In particular, we prove that inclusion of sets of normal forms of term rewrite systems can be decided in exponential time. Analogous consequences arise in the context of XML typechecking, since types are defined by tree automata and some type transformations are homomorphic.Construct, Merge, Solve and Adapt: Application to the repetition-free longest common subsequence problem
http://hdl.handle.net/2117/102814
Construct, Merge, Solve and Adapt: Application to the repetition-free longest common subsequence problem
Blum, Christian; Blesa Aguilera, Maria Josep
In this paper we present the application of a recently proposed, general, algorithm for combinatorial optimization to the repetition-free longest common subsequence problem. The applied algorithm, which is labelled Construct, Merge, Solve & Adapt, generates sub-instances based on merging the solution components found in randomly constructed solutions. These sub-instances are subsequently solved by means of an exact solver. Moreover, the considered sub-instances are dynamically changing due to adding new solution components at each iteration, and removing existing solution components on the basis of indicators about their usefulness. The results of applying this algorithm to the repetition-free longest common subsequence problem show that the algorithm generally outperforms competing approaches from the literature. Moreover, they show that the algorithm is competitive with CPLEX for small and medium size problem instances, whereas it outperforms CPLEX for larger problem instances.
2017-03-23T07:48:23ZBlum, ChristianBlesa Aguilera, Maria JosepIn this paper we present the application of a recently proposed, general, algorithm for combinatorial optimization to the repetition-free longest common subsequence problem. The applied algorithm, which is labelled Construct, Merge, Solve & Adapt, generates sub-instances based on merging the solution components found in randomly constructed solutions. These sub-instances are subsequently solved by means of an exact solver. Moreover, the considered sub-instances are dynamically changing due to adding new solution components at each iteration, and removing existing solution components on the basis of indicators about their usefulness. The results of applying this algorithm to the repetition-free longest common subsequence problem show that the algorithm generally outperforms competing approaches from the literature. Moreover, they show that the algorithm is competitive with CPLEX for small and medium size problem instances, whereas it outperforms CPLEX for larger problem instances.On the stability of generalized second price auctions with budgets
http://hdl.handle.net/2117/101923
On the stability of generalized second price auctions with budgets
Díaz Cort, Josep; Giotis, Ioannis; Kirousis, Lefteris; Markakis, Evangelos; Serna Iglesias, María José
The Generalized Second Price (GSP) auction used typically to model sponsored search auctions does not include the notion of budget constraints, which is present in practice. Motivated by this, we introduce the different variants of GSP auctions that take budgets into account in natural ways. We examine their stability by focusing on the existence of Nash equilibria and envy-free assignments. We highlight the differences between these mechanisms and find that only some of them exhibit both notions of stability. This shows the importance of carefully picking the right mechanism to ensure stable outcomes in the presence of budgets.
2017-03-04T12:43:43ZDíaz Cort, JosepGiotis, IoannisKirousis, LefterisMarkakis, EvangelosSerna Iglesias, María JoséThe Generalized Second Price (GSP) auction used typically to model sponsored search auctions does not include the notion of budget constraints, which is present in practice. Motivated by this, we introduce the different variants of GSP auctions that take budgets into account in natural ways. We examine their stability by focusing on the existence of Nash equilibria and envy-free assignments. We highlight the differences between these mechanisms and find that only some of them exhibit both notions of stability. This shows the importance of carefully picking the right mechanism to ensure stable outcomes in the presence of budgets.Fraud detection in energy consumption: a supervised approach
http://hdl.handle.net/2117/101913
Fraud detection in energy consumption: a supervised approach
Coma Puig, Bernat; Carmona Vargas, Josep; Gavaldà Mestre, Ricard; Alcoverro, Santiago; Martín, Victor
Data from utility meters (gas, electricity, water) is a rich source of information for distribution companies, beyond billing. In this paper we present a supervised technique, which primarily but not only feeds on meter information, to detect meter anomalies and customer fraudulent behavior (meter tampering). Our system detects anomalous meter readings on the basis of models built using machine learning techniques on past data. Unlike most previous work, it can incrementally incorporate the result of field checks to grow the database of fraud and non-fraud patterns, therefore increasing model precision over time and potentially adapting to emerging fraud patterns. The full system has been developed with a company providing electricity and gas and already used to carry out several field checks, with large improvements in fraud detection over the previous checks which used simpler techniques.
2017-03-03T12:09:34ZComa Puig, BernatCarmona Vargas, JosepGavaldà Mestre, RicardAlcoverro, SantiagoMartín, VictorData from utility meters (gas, electricity, water) is a rich source of information for distribution companies, beyond billing. In this paper we present a supervised technique, which primarily but not only feeds on meter information, to detect meter anomalies and customer fraudulent behavior (meter tampering). Our system detects anomalous meter readings on the basis of models built using machine learning techniques on past data. Unlike most previous work, it can incrementally incorporate the result of field checks to grow the database of fraud and non-fraud patterns, therefore increasing model precision over time and potentially adapting to emerging fraud patterns. The full system has been developed with a company providing electricity and gas and already used to carry out several field checks, with large improvements in fraud detection over the previous checks which used simpler techniques.Measuring satisfaction in societies with opinion leaders and mediators
http://hdl.handle.net/2117/101810
Measuring satisfaction in societies with opinion leaders and mediators
Molinero Albareda, Xavier; Riquelme Csori, F.; Serna Iglesias, María José
An opinion leader-follower model (OLF) is a two-action collective decision-making model for societies, in which three kinds of actors are considered:
2017-03-01T16:19:14ZMolinero Albareda, XavierRiquelme Csori, F.Serna Iglesias, María JoséAn opinion leader-follower model (OLF) is a two-action collective decision-making model for societies, in which three kinds of actors are considered:Partial match queries in relaxed K-dt trees
http://hdl.handle.net/2117/101520
Partial match queries in relaxed K-dt trees
Duch Brown, Amalia; Lau Laynes-Lozada, Gustavo Salvador
The study of partial match queries on random hierarchical multidimensional data structures dates back to Ph. Flajolet and C. Puech’s 1986 seminal paper on partial match retrieval. It was not until recently that fixed (as opposed to random) partial match queries were studied for random relaxed K-d trees, random standard K-d trees, and random 2-dimensional quad trees. Based on those results it seemed
natural to classify the general form of the cost of fixed partial match queries into two families: that of either random hierarchical structures or perfectly balanced structures, as conjectured by Duch, Lau and Martínez (On the Cost of Fixed Partial Queries in K-d trees Algorithmica, 75(4):684–723, 2016). Here we show that the conjecture just mentioned does not hold by introducing relaxed K-dt trees and providing the average-case analysis for random partial match queries as well as some advances on the average-case analysis for fixed partial match queries on them. In fact this cost –for fixed partial match queries– does not follow the conjectured forms.
2017-02-24T10:27:25ZDuch Brown, AmaliaLau Laynes-Lozada, Gustavo SalvadorThe study of partial match queries on random hierarchical multidimensional data structures dates back to Ph. Flajolet and C. Puech’s 1986 seminal paper on partial match retrieval. It was not until recently that fixed (as opposed to random) partial match queries were studied for random relaxed K-d trees, random standard K-d trees, and random 2-dimensional quad trees. Based on those results it seemed
natural to classify the general form of the cost of fixed partial match queries into two families: that of either random hierarchical structures or perfectly balanced structures, as conjectured by Duch, Lau and Martínez (On the Cost of Fixed Partial Queries in K-d trees Algorithmica, 75(4):684–723, 2016). Here we show that the conjecture just mentioned does not hold by introducing relaxed K-dt trees and providing the average-case analysis for random partial match queries as well as some advances on the average-case analysis for fixed partial match queries on them. In fact this cost –for fixed partial match queries– does not follow the conjectured forms.Mining structured Petri nets for the visualization of process behavior
http://hdl.handle.net/2117/100797
Mining structured Petri nets for the visualization of process behavior
San Pedro Martín, Javier de; Cortadella Fortuny, Jordi
Visualization is essential for understanding the models obtained by process mining. Clear and efficient visual representations make the embedded information more accessible and analyzable. This work presents a novel approach for generating process models with structural properties that induce visually friendly layouts. Rather than generating a single model that captures all behaviors, a set of Petri net models is delivered, each one covering a subset of traces of the log. The models are mined by extracting slices of labelled transition systems with specific properties from the complete state space produced by the process logs. In most cases, few Petri nets are sufficient to cover a significant part of the behavior produced by the log.
2017-02-10T08:35:24ZSan Pedro Martín, Javier deCortadella Fortuny, JordiVisualization is essential for understanding the models obtained by process mining. Clear and efficient visual representations make the embedded information more accessible and analyzable. This work presents a novel approach for generating process models with structural properties that induce visually friendly layouts. Rather than generating a single model that captures all behaviors, a set of Petri net models is delivered, each one covering a subset of traces of the log. The models are mined by extracting slices of labelled transition systems with specific properties from the complete state space produced by the process logs. In most cases, few Petri nets are sufficient to cover a significant part of the behavior produced by the log.Amalgamation of domain specific languages with behaviour
http://hdl.handle.net/2117/100652
Amalgamation of domain specific languages with behaviour
Duran, Francisco; Moreno Delgado, Antonio; Orejas Valdés, Fernando; Zschaler, Steffen
Domain-specific languages (DSLs) become more useful the more specific they are to a particular domain. The resulting need for developing a substantial number of DSLs can only be satisfied if DSL development can be made as efficient as possible. One way in which to address this challenge is by enabling the reuse of (partial) DSLs in the construction of new DSLs. Reuse of DSLs builds on two foundations: a notion of DSL composition and theoretical results ensuring the safeness of composing DSLs with respect to the semantics of the component DSLs.
Given a graph-grammar formalisation of DSLs, in this paper, we build on graph transformation system morphisms to define parameterised DSLs and their instantiation by an amalgamation construction. Results on the protection of the behaviour along the induced morphisms allow us to safely reuse and combine definitions of DSLs to build more complex ones. We illustrate our proposal in e-Motions for a DSL for production-line systems and three independent DSLs for describing non-functional properties, namely response time, throughput, and failure rate.
2017-02-08T08:20:14ZDuran, FranciscoMoreno Delgado, AntonioOrejas Valdés, FernandoZschaler, SteffenDomain-specific languages (DSLs) become more useful the more specific they are to a particular domain. The resulting need for developing a substantial number of DSLs can only be satisfied if DSL development can be made as efficient as possible. One way in which to address this challenge is by enabling the reuse of (partial) DSLs in the construction of new DSLs. Reuse of DSLs builds on two foundations: a notion of DSL composition and theoretical results ensuring the safeness of composing DSLs with respect to the semantics of the component DSLs.
Given a graph-grammar formalisation of DSLs, in this paper, we build on graph transformation system morphisms to define parameterised DSLs and their instantiation by an amalgamation construction. Results on the protection of the behaviour along the induced morphisms allow us to safely reuse and combine definitions of DSLs to build more complex ones. We illustrate our proposal in e-Motions for a DSL for production-line systems and three independent DSLs for describing non-functional properties, namely response time, throughput, and failure rate.Comparing MapReduce and pipeline implementations for counting triangles
http://hdl.handle.net/2117/100579
Comparing MapReduce and pipeline implementations for counting triangles
Pasarella Sánchez, Ana Edelmira; Vidal Serodio, Maria Esther; Zoltan, Cristina
A generalized method to define the Divide & Conquer paradigm in order to have processors acting on its own data and scheduled in a
parallel fashion. MapReduce is a programming model that follows this paradigm, and allows for the definition of efficient solutions by both decomposing a problem into steps on subsets of the input data
and combining the results of each step to produce final results. Albeit used for the implementation of a wide variety of computational problems, MapReduce performance can be negatively affected
whenever the replication factor grows or the size of the input is larger than the resources available at each processor. In this paper we show an alternative approach to implement the Divide & Conquer
paradigm, named pipeline. The main features of pipeline are illustrated on a parallel implementation of the well-known problem of counting triangles in a graph. This problem is especially interesting either when the input graph does not fit in memory or is dynamically generated. To evaluate the properties of pipeline, a dynamic pipeline of processes and an ad-hoc version of MapReduce are implemented in the language Go, exploiting its ability to deal with channels and spawned processes.
An empirical evaluation is conducted on graphs of different sizes and densities. Observed results suggest that pipeline allows for the implementation of an efficient solution of the problem of counting
triangles in a graph, particularly, in dense and large graphs, drastically reducing the execution time with respect to the MapReduce implementation.
2017-02-06T08:48:11ZPasarella Sánchez, Ana EdelmiraVidal Serodio, Maria EstherZoltan, CristinaA generalized method to define the Divide & Conquer paradigm in order to have processors acting on its own data and scheduled in a
parallel fashion. MapReduce is a programming model that follows this paradigm, and allows for the definition of efficient solutions by both decomposing a problem into steps on subsets of the input data
and combining the results of each step to produce final results. Albeit used for the implementation of a wide variety of computational problems, MapReduce performance can be negatively affected
whenever the replication factor grows or the size of the input is larger than the resources available at each processor. In this paper we show an alternative approach to implement the Divide & Conquer
paradigm, named pipeline. The main features of pipeline are illustrated on a parallel implementation of the well-known problem of counting triangles in a graph. This problem is especially interesting either when the input graph does not fit in memory or is dynamically generated. To evaluate the properties of pipeline, a dynamic pipeline of processes and an ad-hoc version of MapReduce are implemented in the language Go, exploiting its ability to deal with channels and spawned processes.
An empirical evaluation is conducted on graphs of different sizes and densities. Observed results suggest that pipeline allows for the implementation of an efficient solution of the problem of counting
triangles in a graph, particularly, in dense and large graphs, drastically reducing the execution time with respect to the MapReduce implementation.