DSpace Collection:
http://hdl.handle.net/2117/3487
Date: 2015-07-03
Contact: webmaster.bupc@upc.edu
Publisher: Universitat Politècnica de Catalunya. Servei de Biblioteques i Documentació
http://hdl.handle.net/2117/28349
Title: Displacement logic for anaphora
Authors: Morrill, Glyn; Valentín Fernández Gallart, José Oriol
Abstract: The displacement calculus of Morrill, Valentín and Fadda (2011) [25] aspires to replace the calculus of Lambek (1958) [13] as the foundation of categorial grammar by accommodating intercalation as well as concatenation, while remaining free of structural rules and enjoying Cut-elimination and its good corollaries. Jäger (2005) [11] proposes a type logical treatment of anaphora with syntactic duplication using limited contraction. Morrill and Valentín (2010) [24] apply (modal) displacement calculus to anaphora with lexical duplication and propose an extension with negation as failure, in conjunction with additives, to capture binding conditions. In this paper we present an account of anaphora that develops characteristics and employs machinery from both of these proposals.
Date: 2015-06-19
Keywords: Anaphora, Binding principles, Categorial logic, Cut-elimination, Displacement calculus, Negation as failure
http://hdl.handle.net/2117/28306
Title: The placement of the head that minimizes online memory: a complex systems approach
Authors: Ferrer Cancho, Ramon
Abstract: It is well known that the length of a syntactic dependency determines its online memory cost. Thus, the problem of placing a head and its dependents (complements or modifiers) so as to minimize online memory is equivalent to the problem of the minimum linear arrangement of a star tree. However, how that length is translated into cognitive cost is not known. This study shows that the online memory cost is minimized when the head is placed at the center, regardless of the function that transforms length into cost, provided only that this function is strictly monotonically increasing. Online memory defines a quasi-convex adaptive landscape with a single central minimum if the number of elements is odd and two central minima if that number is even. We discuss various aspects of the dynamics of the word order of subject (S), verb (V) and object (O) from a complex systems perspective and suggest that word orders tend to evolve by swapping adjacent constituents from an initial or early SOV configuration that is attracted towards a central word order by online memory minimization. We also suggest that the stability of SVO is due to at least two factors: the quasi-convex shape of the adaptive landscape in the online memory dimension, and online memory adaptations that avoid regression to SOV. Although OVS is also optimal for placing the verb at the center, its low frequency is explained by its long distance from the seminal SOV in the permutation space.
Date: 2015-06-15
Keywords: Language dynamics, Neutrality, Adaptive landscape, Head placement, Language evolution, Word order
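A minimal sketch of the star-tree placement problem (our own illustration; the function names are not from the paper): the head at position h in a sequence of n words pays g(d) for each dependency of length d, and the minimizing positions turn out to be central for any strictly increasing g.

```python
# Sketch (illustrative): total online-memory cost of placing the head of a
# star tree at position h in a sequence of n words, for an arbitrary
# strictly increasing cost function g of dependency length.

def placement_cost(n, h, g):
    """Sum of g(length) over dependencies from the head at position h
    to its n - 1 dependents at the remaining positions (1-indexed)."""
    return sum(g(abs(h - i)) for i in range(1, n + 1) if i != h)

def optimal_positions(n, g):
    """All head positions achieving the minimum total cost."""
    costs = {h: placement_cost(n, h, g) for h in range(1, n + 1)}
    best = min(costs.values())
    return sorted(h for h, c in costs.items() if c == best)

# The minimum is central regardless of the increasing function chosen:
print(optimal_positions(5, lambda d: d))       # odd n: one central minimum
print(optimal_positions(6, lambda d: d ** 2))  # even n: two central minima
```

Swapping in any other strictly increasing g (e.g. an exponential) leaves the set of minimizers unchanged, which is the abstract's central claim.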
http://hdl.handle.net/2117/28305
Title: Reply to the commentary "Be careful when assuming the obvious", by P. Alday
Authors: Ferrer Cancho, Ramon
Abstract: Here we respond to some comments by Alday concerning headedness in linguistic theory and the validity of the assumptions of a mathematical model of word order. For brevity, we focus on only two assumptions: the unit of measurement of dependency length and the monotonicity of the cost of a dependency as a function of its length. We also review the implicit psychological bias in Alday's comments. Nonetheless, with his unusual concerns about parsimony along multiple dimensions, Alday is indicating a path for linguistic research.
Date: 2015-06-15
Keywords: Clitics, Units of measurement, Headedness, Language evolution, Word order, Principles and parameters theory
http://hdl.handle.net/2117/28279
Title: The risks of mixing dependency lengths from sequences of different length
Authors: Ferrer Cancho, Ramon; Liu, Haitao
Abstract: Mixing dependency lengths from sequences of different length is a common practice in language research. However, the empirical distribution of dependency lengths of sentences of the same length differs from that of sentences of varying length. The distribution of dependency lengths depends on sentence length both for real sentences and under the null hypothesis that dependencies connect vertices located at random positions in the sequence. This suggests that certain results, such as the distribution of syntactic dependency lengths obtained by mixing dependencies from sentences of varying length, could be a mere consequence of that mixing. Furthermore, differences between two languages in the global average dependency length (mixing lengths from sentences of varying length) do not by themselves imply that one language optimizes dependency lengths better than the other, because those differences could be due to differences in the distribution of sentence lengths and other factors.
Date: 2015-06-11
Keywords: Syntactic dependency, Syntax, Dependency length
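As a quick illustration of the confound (our own simulation, not the paper's analysis): even under the null model in which a dependency connects two uniformly random positions of a sequence, the mean dependency length grows with sequence length, so a mixture over sentence lengths is a mixture of different distributions.

```python
# Sketch (illustrative): mean dependency length under the null hypothesis
# that a dependency links two distinct, uniformly random positions of a
# sequence of length n. The exact expectation is (n + 1) / 3.
import random

def random_dependency_length(n, rng):
    i, j = rng.sample(range(n), 2)
    return abs(i - j)

def mean_length(n, trials=100_000, seed=0):
    rng = random.Random(seed)
    return sum(random_dependency_length(n, rng) for _ in range(trials)) / trials

for n in (5, 10, 20):
    print(n, round(mean_length(n), 2), round((n + 1) / 3, 2))
```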
http://hdl.handle.net/2117/28273
Title: Beyond description: Comment on “Approaching human language with complex networks” by Cong and Liu
Authors: Ferrer Cancho, Ramon
Date: 2015-06-11
http://hdl.handle.net/2117/28269
Title: A categorial type logic
Authors: Morrill, Glyn
Abstract: In logical categorial grammar [23,11] syntactic structures are categorial proofs and semantic structures are intuitionistic proofs, and the syntax-semantics interface comprises a homomorphism from syntactic proofs to semantic proofs. Thereby, logical categorial grammar embodies in a pure logical form the principles of compositionality, lexicalism, and parsing as deduction. Interest has focused on multimodal versions, but the advent of the (dis)placement calculus of Morrill, Valentín and Fadda [21] suggests that the role of structural rules can be reduced, and this facilitates computational implementation. In this paper we specify a comprehensive formalism of (dis)placement logic for the parser/theorem prover CatLog, integrating the categorial logic connectives proposed to date, and illustrate it with a cover grammar of the Montague fragment.
Date: 2015-06-11
http://hdl.handle.net/2117/28256
Title: Adaptively learning probabilistic deterministic automata from data streams
Authors: Balle Pigem, Borja de; Castro Rabal, Jorge; Gavaldà Mestre, Ricard
Abstract: Markovian models with hidden state are widely used formalisms for modeling sequential phenomena. Learnability of these models has been well studied when the sample is given in batch mode, and algorithms with PAC-like learning guarantees exist for specific classes of models such as Probabilistic Deterministic Finite Automata (PDFA). Here we focus on PDFA and give an algorithm for inferring models in this class in the restrictive data stream scenario: unlike existing methods, our algorithm works incrementally and in one pass, uses memory sublinear in the stream length, and processes input items in amortized constant time. We also present extensions of the algorithm that (1) reduce to a minimum the need for guessing parameters of the target distribution and (2) are able to adapt to changes in the input distribution, relearning new models when needed. We provide rigorous PAC-like bounds for all of the above. Our algorithm makes key use of stream sketching techniques to reduce memory and processing time, and is modular in that it can use different tests for state equivalence and for change detection in the stream.
Date: 2015-06-10
Keywords: PAC learning, Data streams, Probabilistic automata, PDFA, Stream sketches
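One of the sketching ingredients can be illustrated in a few lines. Below is a generic Space-Saving frequency sketch (an example of the kind of technique meant, not the paper's actual sketch): it tracks approximate counts of the heaviest symbols in bounded memory with constant-time updates.

```python
# Sketch (illustrative): a Space-Saving counter, a classic stream sketch
# that keeps at most `capacity` counters and never underestimates a count.

class SpaceSaving:
    def __init__(self, capacity):
        self.capacity = capacity
        self.counts = {}

    def update(self, item):
        if item in self.counts:
            self.counts[item] += 1
        elif len(self.counts) < self.capacity:
            self.counts[item] = 1
        else:
            # Evict the minimum and inherit its count; the new counter
            # overestimates by at most the evicted count (the standard
            # Space-Saving error guarantee).
            victim = min(self.counts, key=self.counts.get)
            self.counts[item] = self.counts.pop(victim) + 1

sketch = SpaceSaving(capacity=3)
for symbol in "abababcabdaab":
    sketch.update(symbol)
print(sketch.counts)  # the heavy symbols 'a' and 'b' are retained
```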
http://hdl.handle.net/2117/28159
Title: Learning read-constant polynomials of constant degree modulo composites
Authors: Chattopadhyay, Arkadev; Gavaldà Mestre, Ricard; Arnsfelt Hansen, Kristoffer; Thérien, Denis
Abstract: Boolean functions that have a constant-degree polynomial representation over a fixed finite ring form a natural and strict subclass of the complexity class ACC0. They are also precisely the functions computable efficiently by programs over fixed finite nilpotent groups. This class is not known to be learnable in any reasonable learning model. In this paper, we provide a deterministic polynomial-time algorithm for learning Boolean functions represented by polynomials of constant degree over arbitrary finite rings from membership queries, under the additional constraint that each variable in the target polynomial appears in a constant number of monomials. Our algorithm extends to superconstant but low-degree polynomials and still runs in quasipolynomial time.
Date: 2015-06-03
Keywords: Polynomials over finite rings, Exact learning, Membership queries, Modular gates
http://hdl.handle.net/2117/28038
Title: Semantically inactive multiplicatives and words as types
Authors: Morrill, Glyn; Valentín Fernández Gallart, José Oriol
Abstract: The literature on categorial type logic includes proposals for semantically inactive additives, quantifiers, and modalities (Morrill 1994 [17]; Hepple 1990 [2]; Moortgat 1997 [9]), but to our knowledge there has been no proposal for semantically inactive multiplicatives. In this paper we formulate such a proposal (thus filling a gap in the typology of categorial connectives) in the context of the displacement calculus of Morrill et al. (2011 [16]), and we give a formulation of words as types whereby for every expression w there is a corresponding type W(w). We show how this machinery can treat the syntax and semantics of collocations involving apparently contentless words such as expletives, particle verbs, and (discontinuous) idioms. In addition, we give an account in these terms of the only known examples treated by Hybrid Type Logical Grammar (HTLG henceforth; Kubota and Levine 2012 [4]) that are beyond the scope of unaugmented displacement calculus: gapping of particle verbs and discontinuous idioms.
Date: 2015-05-25
Keywords: Calculations, Semantics
http://hdl.handle.net/2117/27512
Title: Isometries on L²(X) and monotone functions
Authors: Boza Rocho, Santiago; Soria, Javier
Abstract: We study necessary and sufficient conditions for a bounded operator T defined on the Hilbert space L²(X) to be an isometry, and show that, under suitable hypotheses, it suffices to restrict T to a smaller class of functions (e.g., if X = R⁺, to the cone of positive decreasing functions). We also consider the problem of characterizing the sets Y ⊂ X for which the orthogonal projection of the operator T on L²(Y) is also an isometry. Finally, we illustrate our results with several examples involving classical operators in different settings. (C) 2013 WILEY-VCH Verlag GmbH & Co. KGaA, Weinheim
Date: 2015-04-22
Keywords: Isometries, Hardy operator, Monotone functions, Decreasing functions, Measure spaces, Inequalities, Identity, Cone
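As a minimal worked example in the spirit of the abstract (ours, not taken from the paper), a dilation is an isometry on L²(R⁺) whose isometry property is already visible on the cone of positive decreasing functions, which it preserves:

```latex
% Illustration (not from the paper): for a > 0, the dilation
% (T_a f)(x) = \sqrt{a}\, f(ax) is an isometry on L^2(\mathbb{R}^+):
\[
\|T_a f\|_2^2
  = \int_0^\infty a\,|f(ax)|^2\,dx
  = \int_0^\infty |f(y)|^2\,dy
  = \|f\|_2^2 ,
\qquad \text{substituting } y = ax .
\]
% If f is positive and decreasing, so is T_a f, so checking the isometry
% identity on that smaller cone already determines it.
```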
http://hdl.handle.net/2117/27198
Title: When is Menzerath-Altmann law mathematically trivial? A new approach
Authors: Ferrer Cancho, Ramon; Hernández Fernández, Antonio; Baixeries i Juvillà, Jaume; Dębowski, Łukasz; Mačutek, Ján
Abstract: Menzerath’s law, the tendency of Z (the mean size of the parts) to decrease as X (the number of parts) increases, is found in language, music and genomes. Recently, it has been argued that the presence of the law in genomes is an inevitable consequence of the fact that Z = Y/X, which would imply that Z scales with X as Z ~ 1/X. That scaling is a very particular case of the Menzerath-Altmann law, and it has been rejected by means of a correlation test between X and Y in genomes, where X is the number of chromosomes of a species, Y is its genome size in bases and Z is the mean chromosome size. Here we review the statistical foundations of that test and consider three non-parametric tests based upon different correlation metrics, and one parametric test, to evaluate whether Z ~ 1/X in genomes. The most powerful test is a new non-parametric one based upon the correlation ratio, which is able to reject Z ~ 1/X in nine out of 11 taxonomic groups and detects a borderline group. Rather than a fact, Z ~ 1/X is a baseline that real genomes do not meet. The view of the Menzerath-Altmann law as inevitable is seriously flawed.
Date: 2015-04-09
Keywords: Menzerath-Altmann law, Power laws
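A simplified stand-in for the key statistic can be sketched as follows (the variable names and toy data are ours, and the paper's test is more careful). Since Z = Y/X, the baseline Z ~ 1/X amounts to Y being unrelated to X, which a permutation test on the correlation ratio can reject:

```python
# Sketch (illustrative): permutation test on the correlation ratio eta^2
# of Y (e.g. genome size) grouped by X (e.g. chromosome number).
import random
from collections import defaultdict

def correlation_ratio(xs, ys):
    """eta^2: fraction of the variance of ys explained by the groups in xs."""
    groups = defaultdict(list)
    for x, y in zip(xs, ys):
        groups[x].append(y)
    mean = sum(ys) / len(ys)
    ss_total = sum((y - mean) ** 2 for y in ys)
    ss_between = sum(len(g) * (sum(g) / len(g) - mean) ** 2
                     for g in groups.values())
    return ss_between / ss_total if ss_total else 0.0

def permutation_pvalue(xs, ys, trials=2000, seed=1):
    observed = correlation_ratio(xs, ys)
    rng = random.Random(seed)
    ys = list(ys)
    hits = 0
    for _ in range(trials):
        rng.shuffle(ys)
        if correlation_ratio(xs, ys) >= observed:
            hits += 1
    return (hits + 1) / (trials + 1)

# Toy data where Y clearly grows with X, so Z ~ 1/X should be rejected:
xs = [1, 1, 2, 2, 3, 3, 4, 4] * 4
ys = [10 * x + 0.1 * i for i, x in enumerate(xs)]
print(permutation_pvalue(xs, ys))  # small p-value: Y depends on X
```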
http://hdl.handle.net/2117/24202
Title: El origen de la física moderna: el papel de Fermi (The origin of modern physics: the role of Fermi)
Authors: Hernández Fernández, Antonio
Abstract: The first half of the twentieth century was crucial for modern science to become established as we know it. Humankind extended its understanding of the world by exploring the atomic and the astronomical realms. Understanding the immense and the minute made it possible to develop nuclear technology and to invent new materials, key to the electronic revolution of the second half of the twentieth century.
The history of science in this period was a genuinely multidisciplinary encounter in which physics played a central, unifying role, with one paradigmatic figure: Enrico Fermi.
Fermi's life exemplifies and embodies the transition to twenty-first-century science: teamwork, the connection between theory and experiment, pedagogy and technology transfer were fundamental elements of Fermi's science.
Date: 2014-10-02
Keywords: Fermi, Enrico (1901-1954), Nuclear physics, History of physics, Manhattan Project, Instituto Físico di Via Panisperna
http://hdl.handle.net/2117/24008
Title: Compression as a universal principle of animal behavior
Authors: Ferrer Cancho, Ramon; Hernández Fernández, Antonio; Lusseau, David; Agoramoorthy, Govindasamy; Hsu, Minna J.; Semple, Stuart
Abstract: A key aim in biology and psychology is to identify fundamental principles underpinning the behavior of animals, including humans. Analyses of human language and of the behavior of a range of non-human animal species have provided evidence for a common pattern underlying diverse behavioral phenomena: words follow Zipf’s law of brevity (the tendency of more frequently used words to be shorter), and conformity to this general pattern has been seen in the behavior of a number of other animals. It has been argued that the presence of this law is a sign of efficient coding in the information-theoretic sense. However, no strong direct connection has been demonstrated between the law and compression, the information-theoretic principle of minimizing the expected length of a code. Here, we show that minimizing the expected code length implies that the length of a word cannot increase as its frequency increases. Furthermore, we show that the mean code length or duration is significantly small in human language, and also in the behavior of other species in all cases where agreement with the law of brevity has been found. We argue that compression is a general principle of animal behavior that reflects selection for efficiency of coding.
Date: 2014-09-09
Keywords: Law of brevity, Compression, Language, Animal behavior, Linguistic universals
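The core implication can be checked by brute force (a toy illustration of ours, essentially the rearrangement inequality): over all assignments of a fixed set of code lengths to word types, the expected length is minimized exactly when lengths are non-increasing in frequency.

```python
# Sketch (illustrative): minimizing E[L] = sum_i p_i * l_i over all
# assignments of a fixed multiset of code lengths forces the shortest
# codes onto the most frequent words.
from itertools import permutations

def expected_length(probs, lengths):
    return sum(p * l for p, l in zip(probs, lengths))

probs = [0.5, 0.25, 0.15, 0.1]   # word frequencies, most frequent first
lengths = [4, 1, 3, 2]           # available code lengths, in no order

best = min(permutations(lengths), key=lambda ls: expected_length(probs, ls))
print(best)  # (1, 2, 3, 4): shortest codes go to the most frequent words
```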
http://hdl.handle.net/2117/21485
Title: Characterizing functional dependencies in formal concept analysis with pattern structures
Authors: Baixeries i Juvillà, Jaume; Kaytoue, Mehdi; Napoli, Amedeo
Abstract: Computing functional dependencies from a relation is an important database topic, with many applications in database management, reverse engineering and query optimization. Whereas it has been deeply investigated in those fields, strong links exist with the mathematical framework of Formal Concept Analysis. Considering the discovery of functional dependencies, it is indeed known that a relation can be expressed as the binary relation of a formal context, whose implications are equivalent to those dependencies. However, this leads to a new data representation that is quadratic in the number of objects w.r.t. the original data. Here, we present an alternative that avoids such a data representation and show how to characterize functional dependencies using the formalism of pattern structures, an extension of classical FCA to handle complex data. We also show how another class of dependencies, namely degenerated multivalued dependencies, can be characterized within that framework. Finally, we discuss and compare the performance of our new approach in a series of experiments on classical benchmark datasets.
Date: 2014-02-07
Keywords: Association rules, Attribute implications, Data dependencies, Pattern structures, Formal concept analysis
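For reference, the object the paper starts from can be checked naively (a baseline sketch of ours, not the pattern-structure characterization): a functional dependency X → Y holds in a relation iff no two tuples agree on X but differ on Y, which one grouping pass verifies.

```python
# Sketch (illustrative): naive check of a functional dependency X -> Y.
def holds(relation, lhs, rhs):
    """relation: list of dicts; lhs, rhs: tuples of attribute names."""
    seen = {}
    for row in relation:
        key = tuple(row[a] for a in lhs)
        value = tuple(row[a] for a in rhs)
        # setdefault stores the first Y-value seen for this X-value;
        # any later mismatch violates the dependency.
        if seen.setdefault(key, value) != value:
            return False
    return True

r = [
    {"emp": 1, "dept": "A", "city": "Lyon"},
    {"emp": 2, "dept": "A", "city": "Lyon"},
    {"emp": 3, "dept": "B", "city": "Nancy"},
]
print(holds(r, ("dept",), ("city",)))  # True: dept -> city
print(holds(r, ("city",), ("emp",)))   # False: city does not determine emp
```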
http://hdl.handle.net/2117/21075
Title: Spectral learning of weighted automata: a forward-backward perspective
Authors: Balle Pigem, Borja de; Carreras Pérez, Xavier; Luque, Franco M.; Quattoni, Ariadna Julieta
Abstract: In recent years we have seen the development of efficient, provably correct algorithms for learning Weighted Finite Automata (WFA). Most of these algorithms avoid the known hardness results by defining parameters, beyond the number of states, that can be used to quantify the complexity of learning automata under a particular distribution. One such class of methods are the so-called spectral algorithms, which measure learning complexity in terms of the smallest singular value of some Hankel matrix. However, despite their simplicity and wide applicability to real problems, their impact in application domains remains marginal to date. One of the goals of this paper is to remedy this situation by presenting a derivation of the spectral method for learning WFA that, without sacrificing rigor and mathematical elegance, puts emphasis on providing intuitions about the inner workings of the method and does not assume a strong background in formal algebraic methods. In addition, our algorithm overcomes some of the shortcomings of previous work and is able to learn from statistics of substrings. To illustrate the approach we present experiments on a real application of the method to natural language parsing.
Date: 2013-12-20
Keywords: Spectral learning, Weighted finite automata, Dependency parsing
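The spectral recipe can be seen end to end on a toy case (our sketch, with an assumed rank-1 target and a hand-picked basis; the paper works from estimated substring statistics and addresses basis choice and sampling): build a Hankel block, truncate its SVD, and read off the automaton's operators.

```python
# Sketch (illustrative): spectral recovery of a WFA from an exact Hankel
# block of a rank-1 string function, via truncated SVD.
import numpy as np

def f(x, stop=0.25, cont=0.75, q={"a": 0.6, "b": 0.4}):
    """Probability of string x under a simple 1-state stopping process."""
    p = stop
    for s in x:
        p *= cont * q[s]
    return p

basis = ["", "a", "b"]                       # prefixes = suffixes
H  = np.array([[f(p + s) for s in basis] for p in basis])
Hs = {c: np.array([[f(p + c + s) for s in basis] for p in basis])
      for c in "ab"}
hP = np.array([f(p) for p in basis])         # column for the empty suffix
hS = np.array([f(s) for s in basis])         # row for the empty prefix

k = 1                                        # numerical rank of H
U, S, Vt = np.linalg.svd(H)
V = Vt[:k].T
pinv = np.linalg.pinv(H @ V)
A = {c: pinv @ Hs[c] @ V for c in "ab"}      # transition operators
b0, binf = hS @ V, pinv @ hP                 # initial / final weights

def f_hat(x):
    v = b0
    for c in x:
        v = v @ A[c]
    return float(v @ binf)

print(round(f_hat("ab"), 5), round(f("ab"), 5))  # the two agree
```

Because the toy target has Hankel rank 1 and the basis is complete for it, the recovered automaton reproduces f exactly; with estimated statistics the same computation yields only an approximation, whose quality the spectral analysis ties to the smallest singular value of H.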