LARCA - Laboratori d'Algorísmia Relacional, Complexitat i Aprenentatge
http://hdl.handle.net/2117/3486
2016-10-28T19:44:52Z

Adaptive scheduling on power-aware managed datacenters using machine learning
http://hdl.handle.net/2117/91201
Berral García, Josep Lluís; Gavaldà Mestre, Ricard; Torres Viñals, Jordi
Energy-related costs have become one of the major economic factors in IT datacenters, and companies and the research community are currently working on new, efficient power-aware resource management strategies, also known as “Green IT”. Here we propose a framework for autonomic scheduling of tasks and web services on cloud environments, optimizing profit, defined as revenue for task execution minus penalties for service-level agreement violations, minus power consumption cost. The principal contribution is the combination of consolidation and virtualization technologies, mathematical optimization methods, and machine learning techniques. The datacenter infrastructure, tasks to execute, and desired profit are cast as a mathematical programming model, which can then be solved in different ways to find good task schedules. We use an exact solver based on mixed linear programming as a proof of concept but, since the problem is NP-complete, we show that approximate solvers provide valid alternatives for finding approximately optimal schedules. Machine learning is used to estimate the initially unknown parameters of the mathematical model. In particular, we need to predict a priori the resource usage (such as CPU consumption) of different tasks under current workloads, and to estimate task service-level-agreement metrics (such as response time) given workload features, host characteristics, and contention among tasks on the same host. Experiments show that machine learning algorithms can predict system behavior with acceptable accuracy, and that their combination with the exact or approximate schedulers manages to allocate tasks to hosts while striking a balance between revenue for executed tasks, quality of service, and power consumption.
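The profit objective described in this abstract (revenue minus SLA penalties minus power cost) can be illustrated with a small sketch. Everything below is an assumption made for illustration: the task and host figures, the linear penalty `SLA_PENALTY_PER_CPU`, and the exhaustive search, which stands in for the exact and approximate solvers the authors mention and is not their model.

```python
from itertools import product

# Hypothetical toy instance; all numbers are illustrative, not from the paper.
TASKS = {
    "t1": {"cpu": 60, "revenue": 10.0},
    "t2": {"cpu": 50, "revenue": 8.0},
    "t3": {"cpu": 30, "revenue": 5.0},
}
HOSTS = {
    "h1": {"cpu": 100, "power_cost": 4.0},
    "h2": {"cpu": 100, "power_cost": 4.0},
}
SLA_PENALTY_PER_CPU = 0.2  # assumed penalty per unit of overcommitted CPU


def profit(assignment):
    """Revenue minus SLA penalties minus power cost for one task->host mapping."""
    load = {h: 0 for h in HOSTS}
    for task, host in assignment.items():
        load[host] += TASKS[task]["cpu"]
    revenue = sum(TASKS[t]["revenue"] for t in assignment)
    penalty = sum(
        SLA_PENALTY_PER_CPU * max(0, load[h] - HOSTS[h]["cpu"]) for h in HOSTS
    )
    # Only hosts that run at least one task are powered on.
    power = sum(HOSTS[h]["power_cost"] for h in HOSTS if load[h] > 0)
    return revenue - penalty - power


def best_schedule():
    """Try every mapping (exponential; only viable for toy sizes)."""
    tasks = list(TASKS)
    candidates = (
        dict(zip(tasks, hosts)) for hosts in product(HOSTS, repeat=len(tasks))
    )
    return max(candidates, key=profit)
```

In this toy instance, consolidating all three tasks on one host saves power but overcommits CPU, so the search prefers a split across both hosts. That is the kind of trade-off between consolidation and service quality the abstract refers to.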
2016-10-28T09:00:09Z

"Living in Barcelona" LiBCN workload 2010
http://hdl.handle.net/2117/91188
Berral García, Josep Lluís; Gavaldà Mestre, Ricard; Torres Viñals, Jordi
Nowadays many Internet users are clients of web hosting companies, wishing to offer their web services, store their content, or simply publish their web sites on the network. This has led hosting companies to use large datacenters, or the Cloud, to provide web servers, domain names, disk space and bandwidth to meet this demand. Customers of hosting companies range from big companies to private users and small businesses wanting to offer a web service or publish a website. Here we present and detail workloads from a set of real web sites with different owners and different kinds of content or offered web services. They include personal and professional weblog sites, small e-commerce sites, file storage/support sites, and information panel sites. The presented workload brings pieces of loads, that compared with
2016-10-28T07:17:08Z

Overtly anaphoric control in type logical grammar
http://hdl.handle.net/2117/90900
Corbalán, María Inés; Morrill, Glyn
In this paper we analyse anaphoric pronouns in control sentences and investigate the implications of these kinds of sentences for the Propositional Theory versus Property Theory question. For these purposes, we invoke the categorial calculus with limited contraction, a conservative extension of Lambek calculus that builds contraction into the logical rules for a customized slash type-constructor.
2016-10-20T06:36:32Z

Characterization of order-like dependencies with formal concept analysis
http://hdl.handle.net/2117/90752
Baixeries i Juvillà, Jaume; Napoli, Amedeo; Kaytoue, Mehdi; Codecedo, Victor
Functional Dependencies (FDs) play a key role in many fields of the relational database model, one of the most widely used database systems. FDs have also been applied in data analysis, data quality, knowledge discovery and the like, but in a very limited scope, because of their fixed semantics. To overcome this limitation, many generalizations have been defined to relax the crisp definition of FDs. FDs and a few of their generalizations have been characterized with Formal Concept Analysis, which reveals itself to be an interesting unified framework for characterizing dependencies, that is, understanding and computing them in a formal way. In this paper, we extend this work by taking into account order-like dependencies. Such dependencies, well defined in the database field, consider an ordering on the domain of each attribute, and not simply an equality relation as with standard FDs.
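The contrast between equality-based FDs and order-based dependencies can be made concrete with a minimal sketch. The table contents and the function names `fd_holds` and `order_dep_holds` are hypothetical. The order check is a simplified single-attribute reading of "order-like" and is not the paper's formal definition or its Formal Concept Analysis characterization.

```python
def fd_holds(rows, lhs, rhs):
    """Return True if the functional dependency lhs -> rhs holds in rows.

    rows is a list of dicts; lhs and rhs are lists of attribute names.
    The FD holds when any two rows that agree on lhs also agree on rhs.
    """
    seen = {}
    for row in rows:
        key = tuple(row[a] for a in lhs)
        value = tuple(row[a] for a in rhs)
        if seen.setdefault(key, value) != value:
            return False
    return True


def order_dep_holds(rows, x, y):
    """Simplified order-like check: whenever r[x] <= s[x], also r[y] <= s[y]."""
    return all(r[y] <= s[y] for r in rows for s in rows if r[x] <= s[x])


# Hypothetical tables, invented for illustration.
PLACES = [
    {"city": "Barcelona", "zip": "08034", "country": "ES"},
    {"city": "Barcelona", "zip": "08028", "country": "ES"},
    {"city": "Nancy", "zip": "54000", "country": "FR"},
]
CALENDAR = [
    {"month": 1, "quarter": 1},
    {"month": 4, "quarter": 2},
    {"month": 5, "quarter": 2},
]
```

Here `fd_holds(PLACES, ["city"], ["country"])` is true while `fd_holds(PLACES, ["city"], ["zip"])` is false. `order_dep_holds(CALENDAR, "month", "quarter")` is true because a later month never falls in an earlier quarter.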
2016-10-13T14:42:16Z

On graph combinatorics to improve eigenvector-based measures of centrality in directed networks
http://hdl.handle.net/2117/90335
Arratia Quesada, Argimiro Alejandro; Marijuan López, Carlos
We present a combinatorial study on the rearrangement of links in the structure of directed networks for the purpose of improving the valuation of a vertex or group of vertices as established by an eigenvector-based centrality measure. We build our topological classification starting from unidirectional rooted trees and up to more complex hierarchical structures such as acyclic digraphs, bidirectional and cyclical rooted trees (obtained by closing cycles on unidirectional trees). We analyze different modifications on the structure of these networks and study their effect on the valuation given by the eigenvector-based scoring functions, with particular focus on alpha-centrality and PageRank.
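To make the idea of "rearranging links to change a vertex's valuation" tangible, here is a minimal PageRank power iteration on two tiny graphs. The graphs, the damping value 0.85, and the uniform treatment of vertices without out-links are assumptions chosen for illustration; the paper's own constructions and results are not reproduced here.

```python
def pagerank(out_links, damping=0.85, iterations=100):
    """Plain power iteration for PageRank on a small directed graph.

    out_links maps each vertex to the list of vertices it points to.
    Vertices without out-links spread their score uniformly, which is
    one common convention and is assumed here.
    """
    nodes = list(out_links)
    n = len(nodes)
    rank = {v: 1.0 / n for v in nodes}
    for _ in range(iterations):
        dangling = sum(rank[v] for v in nodes if not out_links[v])
        new_rank = {
            v: (1.0 - damping) / n + damping * dangling / n for v in nodes
        }
        for v in nodes:
            for w in out_links[v]:
                new_rank[w] += damping * rank[v] / len(out_links[v])
        rank = new_rank
    return rank


# Hypothetical unidirectional rooted tree: links point towards the root "r".
TREE = {"r": [], "a": ["r"], "b": ["r"], "c": ["a"]}
# The same tree with a cycle closed by one extra link from the root to "c".
CYCLIC = {"r": ["c"], "a": ["r"], "b": ["r"], "c": ["a"]}
```

Comparing `pagerank(TREE)["c"]` with `pagerank(CYCLIC)["c"]` shows how a single added link raises the score of the leaf `c` in this example.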
© 2016. This version is made available under the CC-BY-NC-ND 4.0 license http://creativecommons.org/licenses/by-nc-nd/4.0/
2016-09-29T14:12:42Z

De Menos a Distinto: Estudio de la Implantación de R en las asignaturas del grado de estadística (From Less to Different: A Study of the Introduction of R in the Courses of the Statistics Degree)
http://hdl.handle.net/2117/89582
Baixeries i Juvillà, Jaume; Fairén González, Marta; Gabarró Vallès, Joaquim; Pasarella Sánchez, Ana Edelmira
Teaching computer science in degrees that are not computer-science-related presents an important challenge: motivating the students and achieving good average grades. Students' complaints typically stem from a lack of motivation ("What is this subject useful for?"), and this is especially relevant when the subject is not easy for them to learn. In this paper we present the case of the computing courses in the Statistics degree taught at the Universitat de Barcelona (UB) and the Universitat Politècnica de Catalunya (UPC), two Catalan universities. We initially tried to reduce the complexity of the course contents in order to obtain better average grades, but this did not work out as expected. We therefore changed our strategy: instead of making the contents easier (less complex), we changed the tools used for teaching and tried to adapt them to the students' interests. In this particular case, we decided to use the R programming language, which is widely used by statisticians, to explain the basics of programming. In short, we changed our strategy from less (simpler contents) to different (more elaborate and non-trivial contents adapted to meet the students' expectations).
2016-09-05T15:19:46Z

Gelada vocal sequences follow Menzerath's linguistic law
http://hdl.handle.net/2117/89435
Gustison, Morgan; Semple, Stuart; Ferrer Cancho, Ramon; Bergman, Thore
Identifying universal principles underpinning diverse natural systems is a key goal of the life sciences. A powerful approach in addressing this goal has been to test whether patterns consistent with linguistic laws are found in non-human animals. Menzerath's law is a linguistic law that states that the larger the construct, the smaller the size of its constituents. Here we present what is, to our knowledge, the first evidence that Menzerath's law holds in the vocal communication of a non-human species. We show that, in vocal sequences of wild male geladas (Theropithecus gelada), construct size (sequence size in number of calls) is negatively correlated with constituent size (duration of calls). Call duration does not vary significantly with position in the sequence, but call sequence composition does change with sequence size, and most call types are abbreviated in larger sequences. We also find that inter-call intervals follow the same relationship with sequence size as do calls. Finally, we provide formal mathematical support for the idea that Menzerath's law reflects compression, the principle of minimizing the expected length of a code. Our findings suggest that a common principle underpins human and gelada vocal communication, highlighting the value of exploring the applicability of linguistic laws in vocal systems outside the realm of language.
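The negative correlation between construct size and constituent size can be checked on any set of sequences with a rank correlation. The sketch below uses Spearman's coefficient, which is an assumption since the abstract does not name the statistic used. The durations in `SEQUENCES` are made up for illustration and are not data from the study.

```python
def ranks(values):
    """Average ranks (1-based); tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        for k in range(i, j + 1):
            result[order[k]] = (i + j) / 2 + 1
        i = j + 1
    return result


def pearson(xs, ys):
    """Pearson correlation coefficient of two equally long number lists."""
    mx, my = sum(xs) / len(xs), sum(ys) / len(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var_x = sum((x - mx) ** 2 for x in xs)
    var_y = sum((y - my) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5


def spearman(xs, ys):
    """Spearman rank correlation: Pearson correlation of the ranks."""
    return pearson(ranks(xs), ranks(ys))


# Made-up call durations in seconds, one inner list per vocal sequence.
SEQUENCES = [[0.9], [0.8, 0.7], [0.6, 0.7, 0.5], [0.5, 0.4, 0.5, 0.4]]
sizes = [len(s) for s in SEQUENCES]                     # construct size
mean_durations = [sum(s) / len(s) for s in SEQUENCES]   # constituent size
rho = spearman(sizes, mean_durations)
```

A value of `rho` below zero is the pattern Menzerath's law predicts: longer sequences made of shorter calls.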
2016-09-01T06:54:58Z

The scaling of the minimum sum of edge lengths in uniformly random trees
http://hdl.handle.net/2117/88535
Esteban Ángeles, Juan Luis; Ferrer Cancho, Ramon; Gómez Rodríguez, Carlos
The minimum linear arrangement problem on a network consists of finding the minimum sum of edge lengths that can be achieved when the vertices are arranged linearly. Although there are algorithms to solve this problem on trees in polynomial time, they have remained theoretical and have not been implemented in practical contexts to our knowledge. Here we use one of those algorithms to investigate the growth of this sum as a function of the size of the tree in uniformly random trees. We show that this sum is bounded above by its value in a star tree. We also show that the mean edge length grows logarithmically in optimal linear arrangements, in stark contrast to the linear growth that is expected on optimal arrangements of star trees or on random linear arrangements.
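The quantity being minimized can be written down in a few lines. The sketch below computes the sum of edge lengths for a given ordering and finds the minimum by brute force over all orderings. This exhaustive search is an assumption made for brevity and is not one of the polynomial-time tree algorithms the abstract refers to. The two example trees are illustrative.

```python
from itertools import permutations


def sum_edge_lengths(edges, order):
    """Sum of |pos(u) - pos(v)| over edges when vertices are placed in `order`."""
    pos = {v: i for i, v in enumerate(order)}
    return sum(abs(pos[u] - pos[v]) for u, v in edges)


def min_linear_arrangement(edges):
    """Brute-force minimum over all orderings (factorial time; tiny inputs only)."""
    vertices = sorted({v for e in edges for v in e})
    return min(sum_edge_lengths(edges, p) for p in permutations(vertices))


# Two trees on 5 vertices: a star with hub 0, and a path.
STAR = [(0, 1), (0, 2), (0, 3), (0, 4)]
PATH = [(0, 1), (1, 2), (2, 3), (3, 4)]
```

On these inputs the star needs a total length of 6 (hub in the middle) while the path needs only 4, consistent with the statement that the star tree gives an upper bound.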
2016-07-06T09:32:34Z

Canonical Horn representations and query learning
http://hdl.handle.net/2117/87970
Arias Vicente, Marta; Balcázar Navarro, José Luis
We describe an alternative construction of an existing canonical representation for definite Horn theories, the Guigues-Duquenne basis (or GD basis), which minimizes a natural notion of implicational size. We extend the canonical representation to general Horn, by providing a reduction from definite to general Horn CNF. We show how this representation relates to two topics in query learning theory: first, we show that a well-known algorithm by Angluin, Frazier and Pitt that learns Horn CNF always outputs the GD basis independently of the counterexamples it receives; second, we build strong polynomial certificates for Horn CNF directly from the GD basis.
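As background for readers unfamiliar with definite Horn theories, the basic operation behind implicational bases is the closure of a set of variables under a set of implications, computed by forward chaining. The sketch below shows only that operation. It is not the GD basis construction or the Angluin-Frazier-Pitt algorithm, and the function name and example implications are hypothetical.

```python
def closure(attributes, implications):
    """Smallest superset of `attributes` closed under the implications.

    implications is a list of (premise, conclusion) pairs of sets; each pair
    stands for the definite Horn rule "premise implies conclusion".
    """
    closed = set(attributes)
    changed = True
    while changed:
        changed = False
        for premise, conclusion in implications:
            # Fire the rule if its premise is satisfied and it adds something.
            if premise <= closed and not conclusion <= closed:
                closed |= conclusion
                changed = True
    return closed


# Hypothetical theory: a -> b, and (b and c) -> d.
RULES = [({"a"}, {"b"}), ({"b", "c"}, {"d"})]
```

For example, `closure({"a", "c"}, RULES)` yields all four variables, while `closure({"c"}, RULES)` leaves the set unchanged.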
2016-06-14T10:49:34Z

Note: More efficient conversion of equivalence-query algorithms to PAC algorithms
http://hdl.handle.net/2117/87921
Gavaldà Mestre, Ricard
We present a method for transforming an equivalence-query algorithm using Q queries into a PAC algorithm using Q/epsilon + O((Q^(2/3)/epsilon) * log(Q/delta)) examples in expectation. The method is a variation of that by Schuurmans and Greiner (1995), which provides, for each gamma > 0, an algorithm using (1+gamma)Q/epsilon + O((1/epsilon) * log(Q/delta)) examples in expectation. In other words, we show that the constant in front of the dominating term Q/epsilon can be made 1+o(1).
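The final claim follows from a one-line calculation. The sketch below assumes the first bound is read as Q/epsilon plus a term of order (Q^(2/3)/epsilon) log(Q/delta), and writes the constant hidden in the O(·) as a hypothetical C > 0. Dividing by the dominating term Q/epsilon gives:

```latex
\frac{Q/\epsilon + C\,\bigl(Q^{2/3}/\epsilon\bigr)\log(Q/\delta)}{Q/\epsilon}
  \;=\; 1 + C\,\frac{\log(Q/\delta)}{Q^{1/3}}
  \;\longrightarrow\; 1
  \quad\text{as } Q \to \infty \text{ with } \delta \text{ fixed.}
```

So the multiplicative factor on Q/epsilon is 1 + o(1), whereas in the earlier bound it stays at 1 + gamma for the chosen gamma > 0.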
2016-06-13T12:21:07Z