LARCA - Laboratori d'Algorísmia Relacional, Complexitat i Aprenentatge
http://hdl.handle.net/2117/3486
Thu, 19 Apr 2018 23:32:34 GMT
http://hdl.handle.net/2117/114970
Count-invariance including exponentials
Kuznetsov, Stepan; Morrill, Glyn; Valentín, Oriol
We define infinitary count-invariance for categorial logic, extending count-invariance for multiplicatives (van Benthem, 1991) and additives and bracket modalities (Valentín et al., 2013) to include exponentials. This provides an effective tool for pruning proof search in categorial parsing/theorem-proving.
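To illustrate the kind of pruning described above, here is a minimal sketch of the classical count check for the multiplicative connectives only. It does not cover the additives, bracket modalities, or the infinitary exponential case that the paper addresses. The tuple encoding of types and the names `atom_counts` and `count_balanced` are assumptions made for this illustration.

```python
# Types are encoded as nested tuples (an assumed encoding for this sketch):
#   "N"              an atom
#   ("*", A, B)      the product A * B
#   ("/", B, A)      B/A  (looks for an A to its right, yields B)
#   ("\\", A, B)     A\B  (looks for an A to its left, yields B)

def atom_counts(t):
    """Return {atom: count}: results count positively, arguments negatively."""
    if isinstance(t, str):
        return {t: 1}
    op, left, right = t
    if op == "*":
        sign_left, sign_right = 1, 1
    elif op == "/":
        sign_left, sign_right = 1, -1
    elif op == "\\":
        sign_left, sign_right = -1, 1
    else:
        raise ValueError(f"unknown connective: {op!r}")
    out = {}
    for sub, sign in ((left, sign_left), (right, sign_right)):
        for atom, n in atom_counts(sub).items():
            out[atom] = out.get(atom, 0) + sign * n
    return out

def count_balanced(antecedent, succedent):
    """Necessary (not sufficient) condition for provability:
    every atom has the same count on both sides of the sequent."""
    total = {}
    for t in antecedent:
        for atom, n in atom_counts(t).items():
            total[atom] = total.get(atom, 0) + n
    goal = atom_counts(succedent)
    return all(total.get(a, 0) == goal.get(a, 0) for a in set(total) | set(goal))

# N, N\S => S passes the check; N, N => S fails and can be pruned.
print(count_balanced(["N", ("\\", "N", "S")], "S"))
print(count_balanced(["N", "N"], "S"))
```

A sequent that fails the check cannot be a theorem, so a prover can discard it before searching. A sequent that passes may still be unprovable.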
Fri, 09 Mar 2018 09:16:19 GMT
http://hdl.handle.net/2117/114969
Identifiability and transportability in dynamic causal networks
Blondel, Gilles; Arias Vicente, Marta; Gavaldà Mestre, Ricard
In this paper, we propose a causal analog to the purely observational dynamic Bayesian networks, which we call dynamic causal networks. We provide a sound and complete algorithm for the identification of causal effects in dynamic causal networks, namely for computing the effect of an intervention or experiment given a dynamic causal network and probability distributions of passive observations of its variables, whenever possible. We note the existence of two types of hidden confounder variables that affect the identification procedures in substantially different ways, a distinction with no analog in either dynamic Bayesian networks or standard causal graphs. We further propose a procedure for the transportability of causal effects in dynamic causal network settings, where the result of causal experiments in a source domain may be used for the identification of causal effects in a target domain.
Fri, 09 Mar 2018 08:58:34 GMT
http://hdl.handle.net/2117/114857
Clustering patients with tensor decomposition
Ruffini, Matteo; Gavaldà Mestre, Ricard; Limón, Esther
In this paper we present a method for the unsupervised clustering of high-dimensional binary data, with a special focus on electronic healthcare records. We present a robust and efficient heuristic to address this problem using tensor decomposition. We explain why, for tasks such as clustering patient records, this approach is preferable to more commonly used distance-based methods. We run the algorithm on two datasets of healthcare records, obtaining clinically meaningful results.
Tue, 06 Mar 2018 13:18:28 GMT
http://hdl.handle.net/2117/114721
Thoughts about disordered thinking: measuring and quantifying the laws of order and disorder
Elvevaag, Brita; Foltz, Peter W.; Rosenstein, Mark; Ferrer Cancho, Ramon; Deyne, Simon De; Mizraji, Eduardo; Cohen, Alex
Fri, 02 Mar 2018 07:58:32 GMT
http://hdl.handle.net/2117/114678
Classifier selection with permutation tests
Arias Vicente, Marta; Arratia Quesada, Argimiro Alejandro; Duarte Lopez, Ariel
This work presents a content-based recommender system for machine learning classifier algorithms. Given a new data set, the system recommends the classifier that is likely to perform best, based on classifier performance over similar known data sets. This similarity is measured according to a data set characterization that includes several state-of-the-art metrics taking into account physical structure, statistics, and information theory. A novelty with respect to prior work is the use of a robust approach based on permutation tests to directly assess whether a given learning algorithm is able to exploit the attributes in a data set to predict class labels; we compare this approach to the more commonly used F-score metric for evaluating classifier performance. To evaluate our approach, we conducted extensive experimentation including 8 of the main machine learning classification methods with varying configurations and 65 binary data sets, leading to over 2331 experiments. Our results show that using the information from the permutation test clearly improves the quality of the recommendations.
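As an illustration of the general idea of a label-permutation test, the sketch below estimates how often a classifier scores at least as well on shuffled labels as on the true ones. It is a generic version of the technique, not the procedure used in the paper. The function names, the toy 1-nearest-neighbour scorer, and the data are all assumptions made for this example.

```python
import random

def permutation_p_value(score_fn, X, y, n_permutations=200, seed=0):
    """Estimate P(score on shuffled labels >= score on true labels).

    A small value suggests the learner really exploits the attributes in X.
    """
    rng = random.Random(seed)
    observed = score_fn(X, y)
    hits = 0
    for _ in range(n_permutations):
        shuffled = list(y)
        rng.shuffle(shuffled)  # break any link between attributes and labels
        if score_fn(X, shuffled) >= observed:
            hits += 1
    # The +1 terms count the observed labelling as one of the permutations.
    return (hits + 1) / (n_permutations + 1)

def loo_1nn_accuracy(X, y):
    """Leave-one-out accuracy of 1-nearest-neighbour on 1-D features (toy scorer)."""
    correct = 0
    for i, xi in enumerate(X):
        others = (k for k in range(len(X)) if k != i)
        j = min(others, key=lambda k: abs(X[k] - xi))
        correct += int(y[j] == y[i])
    return correct / len(X)

# Hypothetical data: two well-separated clusters with matching labels.
X = [0.0, 0.3, 0.5, 0.9, 1.4, 2.0, 10.0, 10.3, 10.5, 10.9, 11.4, 12.0]
y = [0] * 6 + [1] * 6
print(permutation_p_value(loo_1nn_accuracy, X, y))
```

Any scoring function with the signature `score_fn(X, y)` can be plugged in, for example a cross-validated accuracy or F-score.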
Thu, 01 Mar 2018 11:38:59 GMT
http://hdl.handle.net/2117/114367
Adarules: Learning rules for real-time road-traffic prediction
Mena Yedra, Rafael; Gavaldà Mestre, Ricard; Casas Vilaró, Jordi
Traffic management is more important than ever, especially in overcrowded big cities with pollution problems and unprecedented changes in mobility. In this scenario, road-traffic prediction plays a key role within Intelligent Transportation Systems, allowing traffic managers to anticipate events and make appropriate decisions. This paper analyzes the situation of a commercial real-time prediction system, with its current problems and limitations. We analyze issues related to the use of spatiotemporal information to reconstruct the traffic state. The analysis unveils the trade-off between simple parsimonious models and more complex models. Finally, we propose an enriched machine learning framework, Adarules, for real-time traffic state prediction, which treats the problem as continuously incoming data streams with all the problems that commonly occur in such a volatile scenario, namely changes in the network infrastructure and demand, and new or failing detection stations, among others. The framework is also able to automatically infer the features most relevant to the end task, including the relationships within the road network, which we call “structure learning”. Although the proposed framework is intended to evolve and grow with new incoming big data, it can also be used without any prior knowledge, as it can start learning the structure and parameters automatically from data.
(Part of special issue: 20th EURO Working Group on Transportation Meeting, EWGT 2017, 4-6 September 2017, Budapest, Hungary)
Thu, 22 Feb 2018 12:08:49 GMT
http://hdl.handle.net/2117/114362
Towards a theory of word order: comment on "Dependency distance: a new perspective on syntactic patterns in natural language" by Haitao Liu et al.
Ferrer Cancho, Ramon
Thu, 22 Feb 2018 10:49:21 GMT
http://hdl.handle.net/2117/113964
Parsing logical grammar: CatLog3
Morrill, Glyn
CatLog3 is a Prolog parser/theorem-prover for (type) logical (categorial) grammar. In such logical grammar, grammar is reduced to logic: a string of words is grammatical if and only if an associated logical statement is a theorem. CatLog3 implements a logic extending displacement calculus, a sublinear fragment including as primitive connectives the continuous (Lambek) and discontinuous wrapping connectives of the displacement calculus, additives, 1st order quantifiers, normal modalities, bracket modalities and subexponentials. In this paper we survey how CatLog3 is implemented on the principles of Andreoli’s focusing and a generalisation of van Benthem’s count-invariance.
Fri, 09 Feb 2018 10:11:49 GMT
http://hdl.handle.net/2117/113963
A polynomial-time algorithm for the Lambek calculus with brackets of bounded order
Kanovich, Max; Kuznetsov, Stepan; Morrill, Glyn; Scedrov, Andre
Lambek calculus is a logical foundation of categorial grammar, a linguistic paradigm of grammar as logic and parsing as deduction. Pentus (2010) gave a polynomial-time algorithm for determining provability of bounded depth formulas in L*, the Lambek calculus with empty antecedents allowed. Pentus’ algorithm is based on tabularisation of proof nets. Lambek calculus with brackets is a conservative extension of Lambek calculus with bracket modalities, suitable for the modeling of syntactical domains. In this paper we give an algorithm for provability in Lb*, the Lambek calculus with brackets allowing empty antecedents. Our algorithm runs in polynomial time when both the formula depth and the bracket nesting depth are bounded. It combines a Pentus-style tabularisation of proof nets with an automata-theoretic treatment of bracketing.
Fri, 09 Feb 2018 09:57:17 GMT
http://hdl.handle.net/2117/112480
Does like seek like? The formation of working groups in a programming project
Sanou Gozalo, Eduard; Hernández Fernández, Antonio; Arias Vicente, Marta; Ferrer Cancho, Ramon
In a course of the computer science degree, the programming project has changed from individual work to teamwork, tentatively in pairs (pair programming). Students have full freedom to team up, with minimal intervention from teachers. An analysis of the resulting working groups indicates that students do not tend to associate with students of similar academic performance, perhaps because general cognitive parameters do not drive the choice of academic partners. Pair programming seems to give great results, so future research in this field should focus precisely on how these pairs are formed, underpinning the mechanisms of human social interactions.
Tue, 09 Jan 2018 11:08:50 GMT