Journal articles
http://hdl.handle.net/2117/3093
2016-10-26T21:46:38Z

Self-tracking reloaded: Applying process mining to personalized health care from labeled sensor data
http://hdl.handle.net/2117/91090
Sztyler, Timo; Carmona Vargas, Josep; Völker, Johanna; Stuckenschmidt, Heiner
Currently, there is a trend to promote personalized health care in order to prevent diseases or to lead a healthier life. Using current devices such as smartphones and smartwatches, an individual can easily record detailed data from her daily life. Yet, this data has been mainly used for self-tracking in order to enable personalized health care. In this paper, we provide ideas on how process mining can be used as a fine-grained evolution of traditional self-tracking. We have applied the ideas of the paper to recorded data from a set of individuals, and present conclusions and challenges.
2016-10-26T09:21:42Z

Mining conditional partial order graphs from event logs
http://hdl.handle.net/2117/91088
Mokhov, Andrey; Carmona Vargas, Josep; Beaumont, Jonathan
Process mining techniques rely on event logs: the extraction of a process model (discovery) takes an event log as the input, the adequacy of a process model (conformance) is checked against an event log, and the enhancement of a process model is performed by using available data in the log. Several notations and formalisms for event log representation have been proposed in recent years to enable efficient algorithms for the aforementioned process mining problems. In this paper we show how Conditional Partial Order Graphs (CPOGs), a recently introduced formalism for compact representation of families of partial orders, can be used in the process mining field, in particular for addressing the problem of compact and easy-to-comprehend representation of event logs with data. We present algorithms for extracting both the control flow and the relevant data parameters from a given event log and show how CPOGs can be used for efficient and effective visualisation of the obtained results. We demonstrate that the resulting representation can be used to reveal the hidden interplay between the control and data flows of a process, thereby opening the way for new process mining techniques capable of exploiting this interplay. Finally, we present open-source software support and discuss current limitations of the proposed approach.
2016-10-26T09:05:38Z

A fast and retargetable framework for logic-IP-internal electromigration assessment comprehending advanced waveform effects
http://hdl.handle.net/2117/90714
Jain, Palkesh; Cortadella Fortuny, Jordi; Sapatnekar, Sachin S.
This paper presents a new methodology for system-on-chip-level logic-IP-internal electromigration verification, which significantly improves accuracy by comprehending the impact of the parasitic RC loading and voltage-dependent pin capacitance in the library model. It additionally provides an on-the-fly retargeting capability for reliability constraints by allowing arbitrary specifications of lifetimes, temperatures, voltages, and failure rates, as well as interoperability of the IPs across foundries. The characterization part of the methodology is expedited through intelligent IP-response modeling. The ultimate benefit of the proposed approach is demonstrated on a 28-nm design by providing an on-the-fly specification of retargeted reliability constraints. The results show a high correlation with SPICE and were obtained with an order of magnitude reduction in the verification runtime.
2016-10-13T07:40:06Z

Complexity and dynamics of the winemaking bacterial communities in berries, musts, and wines from Apulian grape cultivars through time and space
http://hdl.handle.net/2117/90157
Marzano, Marinella; Fosso, Bruno; Manzari, Caterina; Grieco, Francesco; Intranuovo, Marianna; Cozzi, Giuseppe; Mulè, Giuseppina; Scioscia, Gaetano; Valiente Feruglio, Gabriel Alejandro; Tullo, Apollonia; Sbisa, Elisabetta; Pesole, Graziano; Santamaria, Monica
Currently, there is very little information available regarding the microbiome associated with the wine production chain. Here, we used an amplicon sequencing approach based on high-throughput sequencing (HTS) to obtain a comprehensive assessment of the bacterial community associated with the production of three Apulian red wines, from grape to final product. The relationships among grape variety, the microbial community, and fermentation were investigated. Moreover, the winery microbiota was evaluated in comparison with the autochthonous species in vineyards that persist until the end of the winemaking process. The analysis highlighted the remarkable dynamics within the microbial communities during fermentation. A common microbial core shared among the examined wine varieties was observed, and the unique taxonomic signature of each wine appellation was revealed. New species belonging to the genus Halomonas were also reported. This study demonstrates the potential of this metagenomic approach, supported by optimized protocols, for identifying the biodiversity of the wine supply chain. The developed experimental pipeline offers new prospects for other research fields in which a comprehensive view of microbial community complexity and dynamics is desirable.
2016-09-23T10:40:20Z

Analysis of pivot sampling in dual-pivot Quicksort: A holistic analysis of Yaroslavskiy's partitioning scheme
http://hdl.handle.net/2117/89895
Nebel, Markus E.; Wild, Sebastian; Martínez Parra, Conrado
The new dual-pivot Quicksort by Vladimir Yaroslavskiy, used in Oracle's Java runtime library since version 7, features intriguing asymmetries. They make a basic variant of this algorithm use fewer comparisons than classic single-pivot Quicksort. In this paper, we extend the analysis to the case where the two pivots are chosen as fixed order statistics of a random sample. Surprisingly, dual-pivot Quicksort then needs more comparisons than a corresponding version of classic Quicksort, so it is clear that counting comparisons is not sufficient to explain the running time advantages observed for Yaroslavskiy's algorithm in practice. Consequently, we take a more holistic approach and also give the precise leading term of the average number of swaps, the number of executed Java Bytecode instructions and the number of scanned elements, a new simple cost measure that approximates I/O costs in the memory hierarchy. We determine optimal order statistics for each of the cost measures. It turns out that the asymmetries in Yaroslavskiy's algorithm render pivots with a systematic skew more efficient than the symmetric choice. Moreover, we finally have a convincing explanation for the success of Yaroslavskiy's algorithm in practice: compared with corresponding versions of classic single-pivot Quicksort, dual-pivot Quicksort needs significantly fewer I/Os, both with and without pivot sampling.
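For readers unfamiliar with the partitioning scheme the abstract refers to, the following is a minimal Python sketch of a basic dual-pivot Quicksort in the style of Yaroslavskiy's algorithm. It assumes the two pivots are simply the outermost elements of the current range, so it does not implement the pivot sampling analysed in the paper, and the function name `dual_pivot_quicksort` is chosen here for illustration only.

```python
def dual_pivot_quicksort(a, lo=0, hi=None):
    """Sort the list a in place between indices lo and hi (inclusive)."""
    if hi is None:
        hi = len(a) - 1
    if lo >= hi:
        return
    # Assumption: pivots are the two outermost elements, ordered so p <= q.
    if a[lo] > a[hi]:
        a[lo], a[hi] = a[hi], a[lo]
    p, q = a[lo], a[hi]
    l = lo + 1  # a[lo+1 .. l-1] holds elements smaller than p
    g = hi - 1  # a[g+1 .. hi-1] holds elements larger than q
    k = l       # a[l .. k-1] holds elements between p and q
    while k <= g:
        if a[k] < p:
            a[k], a[l] = a[l], a[k]
            l += 1
        elif a[k] > q:
            # Skip over elements that already sit in the "large" region.
            while a[g] > q and k < g:
                g -= 1
            a[k], a[g] = a[g], a[k]
            g -= 1
            # The element swapped in from the right may be small.
            if a[k] < p:
                a[k], a[l] = a[l], a[k]
                l += 1
        k += 1
    # Move both pivots to their final positions.
    l -= 1
    g += 1
    a[lo], a[l] = a[l], a[lo]
    a[hi], a[g] = a[g], a[hi]
    # Recurse on the three parts: < p, between p and q, > q.
    dual_pivot_quicksort(a, lo, l - 1)
    dual_pivot_quicksort(a, l + 1, g - 1)
    dual_pivot_quicksort(a, g + 1, hi)
```

The asymmetry mentioned in the abstract is visible in the loop: each element is compared with `p` first and with `q` only if that comparison fails, while elements reached through the inner `while` loop are compared with `q` first.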
The final publication is available at Springer via http://dx.doi.org/10.1007/s00453-015-0041-7
2016-09-14T07:49:06Z

On the cost of fixed partial match queries in K-d trees
http://hdl.handle.net/2117/89860
Duch Brown, Amalia; Lau Laynes-Lozada, Gustavo Salvador; Martínez Parra, Conrado
Partial match queries constitute the most basic type of associative queries in multidimensional data structures such as K-d trees or quadtrees. Given a query $q = (q_0, \ldots, q_{K-1})$ where $s$ of the coordinates are specified and $K-s$ are left unspecified ($q_i = *$), a partial match search returns the subset of data points $x = (x_0, \ldots, x_{K-1})$ in the data structure that match the given query, that is, the data points such that $x_i = q_i$ whenever $q_i \neq *$. There exists a wealth of results about the cost of partial match searches in many different multidimensional data structures, but most of these results deal with random queries. Only recently have a few papers begun to investigate the cost of partial match queries with a fixed query $q$. This paper represents a new contribution in this direction, giving a detailed asymptotic estimate of the expected cost $P_{n,q}$ for a given fixed query $q$. From previous results on the cost of partial matches with a fixed query and the ones presented here, a deeper understanding is emerging, uncovering the following functional shape for $P_{n,q}$:
$$P_{n,q} = \nu \cdot \Bigl( \prod_{i \,:\, q_i \text{ is specified}} q_i (1 - q_i) \Bigr)^{\alpha/2} \cdot n^{\alpha} + \text{l.o.t.}$$
(l.o.t. denotes lower order terms, throughout this work) in many multidimensional data structures, which differ only in the exponent $\alpha$ and the constant $\nu$, both dependent on $s$ and $K$, and, for some data structures, on the whole pattern of specified and unspecified coordinates in $q$ as well. Although it is tempting to conjecture that this functional shape is “universal”, we have shown experimentally that it seems not to be true for a variant of K-d trees called squarish K-d trees.
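To make the notion of a partial match search concrete, here is a minimal Python sketch on a standard K-d tree whose discriminating coordinate cycles with depth. It is an illustration under stated assumptions rather than the construction studied in the paper: the names `Node`, `insert` and `partial_match` are hypothetical, unspecified coordinates are written as `None` instead of `*`, and the number of visited nodes is returned as a simple stand-in for the cost of the search.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass
class Node:
    point: Tuple[float, ...]
    left: Optional["Node"] = None
    right: Optional["Node"] = None


def insert(root, point, depth=0):
    """Insert a point; the discriminating coordinate cycles with depth."""
    if root is None:
        return Node(tuple(point))
    axis = depth % len(point)
    if point[axis] < root.point[axis]:
        root.left = insert(root.left, point, depth + 1)
    else:
        root.right = insert(root.right, point, depth + 1)
    return root


def partial_match(node, query, depth=0):
    """Return (matches, visited) for a query with None as unspecified."""
    if node is None:
        return [], 0
    matches = []
    if all(q is None or q == x for q, x in zip(query, node.point)):
        matches.append(node.point)
    visited = 1
    axis = depth % len(query)
    q = query[axis]
    if q is None:
        # Unspecified coordinate: both subtrees may contain matches.
        children = [node.left, node.right]
    elif q < node.point[axis]:
        children = [node.left]
    else:
        # Equal keys were inserted to the right, so search there.
        children = [node.right]
    for child in children:
        sub_matches, sub_visited = partial_match(child, query, depth + 1)
        matches += sub_matches
        visited += sub_visited
    return matches, visited


root = None
for pt in [(0.5, 0.5), (0.2, 0.8), (0.7, 0.3), (0.2, 0.1), (0.9, 0.3)]:
    root = insert(root, pt)
found, cost = partial_match(root, (0.2, None))
```

The search follows a single branch at levels whose discriminating coordinate is specified and both branches where it is unspecified, which is why the cost depends on the pattern of specified coordinates in the query.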
The final publication is available at Springer via http://dx.doi.org/10.1007/s00453-015-0097-4
2016-09-13T10:21:11Z

Absorption time of the Moran process
http://hdl.handle.net/2117/89367
Díaz Cort, Josep; Goldberg, Leslie Ann; Richerby, David; Serna Iglesias, María José
© 2016 Wiley Periodicals, Inc.
The Moran process models the spread of mutations in populations on graphs. We investigate the absorption time of the process, which is the time taken for a mutation introduced at a randomly chosen vertex to either spread to the whole population, or to become extinct. It is known that the expected absorption time for an advantageous mutation is $O(n^4)$ on an $n$-vertex undirected graph, which allows the behaviour of the process on undirected graphs to be analysed using the Markov chain Monte Carlo method. We show that this does not extend to directed graphs by exhibiting an infinite family of directed graphs for which the expected absorption time is exponential in the number of vertices. However, for regular directed graphs, we show that the expected absorption time is $\Omega(n \log n)$ and $O(n^2)$. We exhibit families of graphs matching these bounds and give improved bounds for other families of graphs, based on isoperimetric number. Our results are obtained via stochastic dominations which we demonstrate by establishing a coupling in a related continuous-time model. The coupling also implies several natural domination results regarding the fixation probability of the original (discrete-time) process, resolving a conjecture of Shakarian, Roos and Johnson.
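The process described above can be simulated directly. The following Python sketch is an assumption-laden illustration, not the analysis carried out in the paper: it uses the usual discrete-time formulation in which a vertex is chosen to reproduce with probability proportional to its fitness (`r` for mutants, 1 otherwise) and copies its state onto a uniformly random out-neighbour. The function name `moran_absorption_time` and its parameters are hypothetical, and every vertex is assumed to have at least one out-neighbour whenever the graph has more than one vertex.

```python
import random


def moran_absorption_time(out_neighbours, r=2.0, rng=None):
    """Simulate one run; return (steps until absorption, fixation flag).

    out_neighbours[u] lists the vertices that u can reproduce onto.
    """
    rng = rng if rng is not None else random.Random()
    n = len(out_neighbours)
    mutant = [False] * n
    mutant[rng.randrange(n)] = True  # mutation at a random vertex
    count = 1                        # current number of mutants
    steps = 0
    while 0 < count < n:
        # Pick the reproducing vertex proportionally to fitness.
        weights = [r if m else 1.0 for m in mutant]
        u = rng.choices(range(n), weights=weights)[0]
        v = rng.choice(out_neighbours[u])
        if mutant[v] != mutant[u]:
            count += 1 if mutant[u] else -1
            mutant[v] = mutant[u]
        steps += 1
    return steps, count == n


# Example: a directed cycle on 6 vertices (a regular directed graph).
cycle = [[(i + 1) % 6] for i in range(6)]
steps, fixed = moran_absorption_time(cycle, r=2.0, rng=random.Random(1))
```

Averaging `steps` over many runs gives an empirical estimate of the expected absorption time for a given graph; on graphs that are not strongly connected a run may never be absorbed, so the sketch is only meaningful for strongly connected inputs.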
2016-07-29T13:05:23Z

On the complexity of exchanging
http://hdl.handle.net/2117/86068
Molinero Albareda, Xavier; Olsen, Martin; Serna Iglesias, María José
We analyze the computational complexity of the problem of deciding whether, for a given simple game, there exists the possibility of rearranging the participants in a set of j given losing coalitions into a set of j winning coalitions. We also look at the problem of turning winning coalitions into losing coalitions. We analyze the problem when the simple game is represented by a list of winning, losing, minimal winning or maximal losing coalitions.
2016-04-21T13:57:53Z

Area-efficient snoopy-aware NoC design for high-performance chip multiprocessor systems
http://hdl.handle.net/2117/85389
Roca Pérez, Antoni; Hernández Gañán, Carlos; Lodde, Mario; Flich Cardo, José
Manycore CMP systems are expected to grow to tens or even hundreds of cores. In this paper we show that the effective co-design of both the network-on-chip and the coherence protocol improves performance and power while total area resources remain bounded. We propose a snoopy-aware network-on-chip topology made of two mesh-of-tree topologies. Reducing the complexity of the coherence protocol (and hence its resources) and moving this complexity to the network leads to a global decrease in power consumption while area is barely affected. The benefits of our proposal are due to the high throughput and low delay of the network, but also to the simplicity of the coherence protocol. The proposed network and protocol minimize communication among cores when compared to traditional solutions based either on 2D-mesh topologies or on directory-based protocols.
2016-04-08T07:51:16Z

Clustering media items stemming from multiple social networks
http://hdl.handle.net/2117/84996
Steiner, Thomas; Verborgh, Ruben; Gabarró Vallès, Joaquim; Mannens, Erik; Van de Walle, Rik
We have created and evaluated an algorithm capable of deduplicating and clustering exact and near-duplicate media items of type photo and video that get shared on multiple social networks in the context of events. This algorithm works in an entirely ad hoc manner without requiring any pre-calculation. When people attend events, they increasingly share event-related media items publicly on social networks to let their social network contacts relive and witness the attended events. In the past, we have worked on methods to accumulate such public user-generated multimedia content in order to summarize events visually, for example, in the form of media galleries or slideshows. In this paper, first, we introduce social-network-specific reasons and challenges that cause near-duplicate media items. Second, we detail an algorithm for the task of deduplicating and clustering exact and near-duplicate media items stemming from multiple social networks. Finally, we evaluate the algorithm's strengths and weaknesses and thoroughly compare its performance with the state-of-the-art feature detection algorithms SIFT, ASIFT and SURF, and show that for the given use case it performs almost equally well accuracy-wise, but strongly outperforms them speed-wise.
2016-03-31T14:07:07Z