ALBCOM - Algorismia, Bioinformàtica, Complexitat i Mètodes Formals
http://hdl.handle.net/2117/3092
2017-08-23T14:11:40ZMetaShot: an accurate workflow for taxon classification of host-associated microbiome from shotgun metagenomic data
http://hdl.handle.net/2117/106077
MetaShot: an accurate workflow for taxon classification of host-associated microbiome from shotgun metagenomic data
Fosso, Bruno; Santamaria, Monica; D'Antonio, M.; Lovero, D.; Corrado, G.; Vizza, E.; Passaro, N.; Garbuglia, A.R.; Capobianchi, M.R.; Crescenzi, M.; Valiente Feruglio, Gabriel Alejandro; Pesole, Graziano
Shotgun metagenomics by high-throughput sequencing may allow deep and accurate characterization of host-associated total microbiomes, including bacteria, viruses, protists and fungi. However, the analysis of such sequencing data is still extremely challenging in terms of both overall accuracy and computational efficiency, and current methodologies show substantial variability in misclassification rate and resolution at lower taxonomic ranks or are limited to specific life domains (e.g. only bacteria). We present here MetaShot, a workflow for assessing the total microbiome composition from host-associated shotgun sequence data, and show its overall optimal accuracy performance by analyzing both simulated and real datasets.
2017-07-03T07:48:29ZFosso, BrunoSantamaria, MonicaD'Antonio, M.Lovero, D.Corrado, G.Vizza, E.Passaro, N.Garbuglia, A.R.Capobianchi, M.R.Crescenzi, M.Valiente Feruglio, Gabriel AlejandroPesole, GrazianoShotgun metagenomics by high-throughput sequencing may allow deep and accurate characterization of host-associated total microbiomes, including bacteria, viruses, protists and fungi. However, the analysis of such sequencing data is still extremely challenging in terms of both overall accuracy and computational efficiency, and current methodologies show substantial variability in misclassification rate and resolution at lower taxonomic ranks or are limited to specific life domains (e.g. only bacteria). We present here MetaShot, a workflow for assessing the total microbiome composition from host-associated shotgun sequence data, and show its overall optimal accuracy performance by analyzing both simulated and real datasets.Parallel algorithms for two processors precedence constraint scheduling
http://hdl.handle.net/2117/104935
Parallel algorithms for two processors precedence constraint scheduling
Serna Iglesias, María José
The final publication is available at link.springer.com
2017-05-26T14:27:35ZSerna Iglesias, María JoséRandomized parallel approximations to max flow
http://hdl.handle.net/2117/104934
Randomized parallel approximations to max flow
Serna Iglesias, María José
The final publication is available at link.springer.com
2017-05-26T14:21:33ZSerna Iglesias, María JoséLarge neighborhood search for the most strings with few bad columns problem
http://hdl.handle.net/2117/104908
Large neighborhood search for the most strings with few bad columns problem
Lizárraga Olivas, Evelia; Blesa Aguilera, Maria Josep; Blum, Christian; Raidl, Günther
In this work, we consider the following NP-hard combinatorial optimization problem from computational biology. Given a set of input strings of equal length, the goal is to identify a maximum cardinality subset of strings that differ maximally in a pre-defined number of positions. First of all, we introduce an integer linear programming model for this problem. Second, two variants of a rather simple greedy strategy are proposed. Finally, a large neighborhood search algorithm is presented. A comprehensive experimental comparison among the proposed techniques shows, first, that larger neighborhood search generally outperforms both greedy strategies. Second, while large neighborhood search shows to be competitive with the stand-alone application of CPLEX for small- and medium-sized problem instances, it outperforms CPLEX in the context of larger instances.
2017-05-26T10:29:18ZLizárraga Olivas, EveliaBlesa Aguilera, Maria JosepBlum, ChristianRaidl, GüntherIn this work, we consider the following NP-hard combinatorial optimization problem from computational biology. Given a set of input strings of equal length, the goal is to identify a maximum cardinality subset of strings that differ maximally in a pre-defined number of positions. First of all, we introduce an integer linear programming model for this problem. Second, two variants of a rather simple greedy strategy are proposed. Finally, a large neighborhood search algorithm is presented. A comprehensive experimental comparison among the proposed techniques shows, first, that larger neighborhood search generally outperforms both greedy strategies. Second, while large neighborhood search shows to be competitive with the stand-alone application of CPLEX for small- and medium-sized problem instances, it outperforms CPLEX in the context of larger instances.Non recursive functions have transcendental generating functions
http://hdl.handle.net/2117/104671
Non recursive functions have transcendental generating functions
Cucker Farkas, Juan Felipe; Gabarró Vallès, Joaquim
Proves that nonprimitive recursive functions have transcendental generating series. This result translates a certain measure of the complexity of a function, the fact of not being primitive recursive, into another measure of the complexity of the generating series associated to the function, the fact of being transcendental.; On démontre que les fonctions qui ne sont pas recursives primitives ont des séries génératrices transcendantes. Ce résultat traduit une certaine mesure de complexité d'une fonction, le fait de ne pas être recursive primitive, dans une autre mesure de la complexité de la série génératrice associée à cette fonction, le fait d'être transcendante.
2017-05-22T09:14:24ZCucker Farkas, Juan FelipeGabarró Vallès, JoaquimProves that nonprimitive recursive functions have transcendental generating series. This result translates a certain measure of the complexity of a function, the fact of not being primitive recursive, into another measure of the complexity of the generating series associated to the function, the fact of being transcendental.
On démontre que les fonctions qui ne sont pas recursives primitives ont des séries génératrices transcendantes. Ce résultat traduit une certaine mesure de complexité d'une fonction, le fait de ne pas être recursive primitive, dans une autre mesure de la complexité de la série génératrice associée à cette fonction, le fait d'être transcendante.Nonuniform complexity classes specified by lower and upper bounds
http://hdl.handle.net/2117/104347
Nonuniform complexity classes specified by lower and upper bounds
Balcázar Navarro, José Luis; Gabarró Vallès, Joaquim
We characterize in terms of oracle Turing machines the classes defined by exponential lower bounds on some nonuniform complexity measures. After, we use the same methods to giue a new characterization of classes defined by polynomial and polylog upper bounds, obtaining an unified approach to deal with upper and lower bounds, The main measures are the initial index, the context-free cosU ond the boolean circuits size. We interpret our results by discussing a trade- off between oracle information and computed information for oracle Turing machines.; NOMS caractérisons en termes de machines de Turing avec oracles les classes définies par des bornes inférieures exponentielles pour des mesures de complexité non uniformes. Nous utilisons ensuite les mêmes méthodes pour donner une nouvelle caractérisation des classes définies par des bornes supérieures polynomiales et polylogarithmiques, obtenanrainsi une approche unifiée pour les bornes inférieures et supérieures. Les mesures principales sont F index initial, le coût grammatical et la taille des circuits booléens. Nous interprétons nos résultats en étudiant, pour les machines de Turing avec oracle, la relation entre l'information due à Voracle et l'information calculée par la machine.
2017-05-12T08:15:24ZBalcázar Navarro, José LuisGabarró Vallès, JoaquimWe characterize in terms of oracle Turing machines the classes defined by exponential lower bounds on some nonuniform complexity measures. After, we use the same methods to giue a new characterization of classes defined by polynomial and polylog upper bounds, obtaining an unified approach to deal with upper and lower bounds, The main measures are the initial index, the context-free cosU ond the boolean circuits size. We interpret our results by discussing a trade- off between oracle information and computed information for oracle Turing machines.
NOMS caractérisons en termes de machines de Turing avec oracles les classes définies par des bornes inférieures exponentielles pour des mesures de complexité non uniformes. Nous utilisons ensuite les mêmes méthodes pour donner une nouvelle caractérisation des classes définies par des bornes supérieures polynomiales et polylogarithmiques, obtenanrainsi une approche unifiée pour les bornes inférieures et supérieures. Les mesures principales sont F index initial, le coût grammatical et la taille des circuits booléens. Nous interprétons nos résultats en étudiant, pour les machines de Turing avec oracle, la relation entre l'information due à Voracle et l'information calculée par la machine.Anti-alignments in conformance checking: the dark side of process models
http://hdl.handle.net/2117/103958
Anti-alignments in conformance checking: the dark side of process models
Chatain, Thomas; Carmona Vargas, Josep
Conformance checking techniques asses the suitability of a process model in representing an underlying process, observed through a collection of real executions. These techniques suffer from the wellknown state space explosion problem, hence handling process models exhibiting large or even infinite state spaces remains a challenge. One important metric in conformance checking is to asses the precision of the model with respect to the observed executions, i.e., characterize the ability of the model to produce behavior unrelated to the one observed. By avoiding the computation of the full state space of a model, current techniques only provide estimations of the precision metric, which in some situations tend to be very optimistic, thus hiding real problems a process model may have. In this paper we present the notion of antialignment as a concept to help unveiling traces in the model that may deviate significantly from the observed behavior. Using anti-alignments, current estimations can be improved, e.g., in precision checking. We show how to express the problem of finding anti-alignments as the satisfiability of a Boolean formula, and provide a tool which can deal with large models efficiently.
2017-05-03T11:38:28ZChatain, ThomasCarmona Vargas, JosepConformance checking techniques asses the suitability of a process model in representing an underlying process, observed through a collection of real executions. These techniques suffer from the wellknown state space explosion problem, hence handling process models exhibiting large or even infinite state spaces remains a challenge. One important metric in conformance checking is to asses the precision of the model with respect to the observed executions, i.e., characterize the ability of the model to produce behavior unrelated to the one observed. By avoiding the computation of the full state space of a model, current techniques only provide estimations of the precision metric, which in some situations tend to be very optimistic, thus hiding real problems a process model may have. In this paper we present the notion of antialignment as a concept to help unveiling traces in the model that may deviate significantly from the observed behavior. Using anti-alignments, current estimations can be improved, e.g., in precision checking. We show how to express the problem of finding anti-alignments as the satisfiability of a Boolean formula, and provide a tool which can deal with large models efficiently.Trade-offs between time and memory in a tighter model of CDCL SAT solvers
http://hdl.handle.net/2117/103872
Trade-offs between time and memory in a tighter model of CDCL SAT solvers
Elffers, J.; Johannsen, Jan; Lauria, Massimo; Magnard, Thomas; Nordström, Jakob; Vinyals, Marc
A long line of research has studied the power of conflict- driven clause learning (CDCL) and how it compares to the resolution proof system in which it searches for proofs. It has been shown that CDCL can polynomially simulate resolution even with an adversarially chosen learning scheme as long as it is asserting. However, the simulation only works under the assumption that no learned clauses are ever forgot- ten, and the polynomial blow-up is significant. Moreover, the simulation requires very frequent restarts, whereas the power of CDCL with less frequent or entirely without restarts remains poorly understood. With a view towards obtaining results with tighter relations between CDCL and resolution, we introduce a more fine-grained model of CDCL that cap- tures not only time but also memory usage and number of restarts. We show how previously established strong size-space trade-offs for resolution can be transformed into equally strong trade-offs between time and memory usage for CDCL, where the upper bounds hold for CDCL with- out any restarts using the standard 1UIP clause learning scheme, and the (in some cases tightly matching) lower bounds hold for arbitrarily frequent restarts and arbitrary clause learning schemes.
"The final publication is available at http://link.springer.com/chapter/10.1007/978-3-319-40970-2_11
2017-04-28T20:33:22ZElffers, J.Johannsen, JanLauria, MassimoMagnard, ThomasNordström, JakobVinyals, MarcA long line of research has studied the power of conflict- driven clause learning (CDCL) and how it compares to the resolution proof system in which it searches for proofs. It has been shown that CDCL can polynomially simulate resolution even with an adversarially chosen learning scheme as long as it is asserting. However, the simulation only works under the assumption that no learned clauses are ever forgot- ten, and the polynomial blow-up is significant. Moreover, the simulation requires very frequent restarts, whereas the power of CDCL with less frequent or entirely without restarts remains poorly understood. With a view towards obtaining results with tighter relations between CDCL and resolution, we introduce a more fine-grained model of CDCL that cap- tures not only time but also memory usage and number of restarts. We show how previously established strong size-space trade-offs for resolution can be transformed into equally strong trade-offs between time and memory usage for CDCL, where the upper bounds hold for CDCL with- out any restarts using the standard 1UIP clause learning scheme, and the (in some cases tightly matching) lower bounds hold for arbitrarily frequent restarts and arbitrary clause learning schemes.The complexity of testing properties of simple games
http://hdl.handle.net/2117/103171
The complexity of testing properties of simple games
Freixas Bosch, Josep; Molinero Albareda, Xavier; Olsen, Martin; Serna Iglesias, María José
Simple games cover voting systems in which a single alternative, such as a bill or an amendment, is pitted against the status quo. A simple game or a yes-no voting system is a set of rules that specifies exactly which collections of ``yea'' votes yield passage of the issue at hand. A collection of ``yea'' voters forms a winning coalition.
We are interested on performing a complexity analysis of problems on such games depending on the game representation. We consider four natural explicit representations, winning, loosing, minimal winning, and maximal loosing. We first analyze the computational complexity of obtaining a particular representation of a simple game from a different one. We show that some cases this transformation can be done in polynomial time while the others require exponential time. The second question is classifying the complexity for testing whether a game is simple or weighted. We show that for the four types of representation both problem can be solved in polynomial time. Finally, we provide results on the complexity of testing whether a simple game or a weighted game is of a special type. In this way, we analyze strongness, properness, decisiveness and homogeneity, which are desirable properties to be fulfilled for a simple game.
2017-03-31T15:48:07ZFreixas Bosch, JosepMolinero Albareda, XavierOlsen, MartinSerna Iglesias, María JoséSimple games cover voting systems in which a single alternative, such as a bill or an amendment, is pitted against the status quo. A simple game or a yes-no voting system is a set of rules that specifies exactly which collections of ``yea'' votes yield passage of the issue at hand. A collection of ``yea'' voters forms a winning coalition.
We are interested on performing a complexity analysis of problems on such games depending on the game representation. We consider four natural explicit representations, winning, loosing, minimal winning, and maximal loosing. We first analyze the computational complexity of obtaining a particular representation of a simple game from a different one. We show that some cases this transformation can be done in polynomial time while the others require exponential time. The second question is classifying the complexity for testing whether a game is simple or weighted. We show that for the four types of representation both problem can be solved in polynomial time. Finally, we provide results on the complexity of testing whether a simple game or a weighted game is of a special type. In this way, we analyze strongness, properness, decisiveness and homogeneity, which are desirable properties to be fulfilled for a simple game.Computing alignments with constraint programming : the acyclic case
http://hdl.handle.net/2117/103063
Computing alignments with constraint programming : the acyclic case
Borrego, Diana; Gómez López, María Teresa; Carmona Vargas, Josep; Martínez Gasca, Rafael
Conformance checking confronts process models with real process executions to detect and measure deviations between modelled and observed behaviour. The core technique for conformance checking is the computation of an alignment. Current approaches for alignment computation rely on a shortest-path technique over the product of the state-space of a model and the observed trace, thus suffering from the well-known state explosion problem. This paper presents a fresh alternative for alignment computation of acyclic process models, that encodes the alignment problem as a Constraint Satisfaction Problem. Since modern solvers for this framework are capable of dealing with large instances, this contribution has a clear potential. Remarkably, our prototype implementation can handle instances that represent a real challenge for current techniques. Main advantages of using Constraint Programming paradigm lie in the possibility to adapt parameters such as the maximum search time, or the maximum misalignment allowed. Moreover, using search and propagation algorithms incorporated in Constraint Programming Solvers permits to find solutions for problems unsolvable with other techniques.
2017-03-30T06:45:05ZBorrego, DianaGómez López, María TeresaCarmona Vargas, JosepMartínez Gasca, RafaelConformance checking confronts process models with real process executions to detect and measure deviations between modelled and observed behaviour. The core technique for conformance checking is the computation of an alignment. Current approaches for alignment computation rely on a shortest-path technique over the product of the state-space of a model and the observed trace, thus suffering from the well-known state explosion problem. This paper presents a fresh alternative for alignment computation of acyclic process models, that encodes the alignment problem as a Constraint Satisfaction Problem. Since modern solvers for this framework are capable of dealing with large instances, this contribution has a clear potential. Remarkably, our prototype implementation can handle instances that represent a real challenge for current techniques. Main advantages of using Constraint Programming paradigm lie in the possibility to adapt parameters such as the maximum search time, or the maximum misalignment allowed. Moreover, using search and propagation algorithms incorporated in Constraint Programming Solvers permits to find solutions for problems unsolvable with other techniques.