Journal articles
http://hdl.handle.net/2117/3487
2017-07-27T06:50:50Z

The entropy of words: learnability and expressivity across more than 1000 languages
http://hdl.handle.net/2117/106703
Bentz, Chris; Alikaniotis, Dimitrios; Cysouw, Michael; Ferrer Cancho, Ramon
The choice associated with words is a fundamental property of natural languages. It lies at the heart of quantitative linguistics, computational linguistics and language sciences more generally. Information theory gives us tools to measure precisely the average amount of choice associated with words: the word entropy. Here, we use three parallel corpora, encompassing ca. 450 million words in 1916 texts and 1259 languages, to tackle some of the major conceptual and practical problems of word entropy estimation: dependence on text size, register, style and estimation method, as well as non-independence of words in co-text. We present two main findings. Firstly, word entropies display relatively narrow, unimodal distributions. There is no language in our sample with a unigram entropy of less than six bits/word. We argue that this is in line with information-theoretic models of communication. Languages are held in a narrow range by two fundamental pressures: word learnability and word expressivity, with a potential bias towards expressivity. Secondly, there is a strong linear relationship between unigram entropies and entropy rates. The entropy difference between words with and without co-textual information is narrowly distributed around ca. three bits/word. In other words, knowing the preceding text reduces the uncertainty of words by roughly the same amount across languages of the world.
2017-07-21T10:48:54Z
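As an illustration of the unigram word entropy discussed in this abstract, the following is a minimal sketch of the plug-in (maximum likelihood) estimate, H = -Σ p(w) log2 p(w), with p(w) taken as the relative frequency of each word type. The function name `unigram_entropy` and the whitespace tokenisation are assumptions made for the example. The plug-in estimate is known to be biased for small texts, and nothing here should be read as the estimator the authors actually used.

```python
import math
from collections import Counter


def unigram_entropy(tokens):
    """Plug-in estimate of unigram entropy in bits per word.

    Hypothetical helper: probabilities are plain relative frequencies,
    so the estimate is biased downwards for short texts.
    """
    counts = Counter(tokens)
    total = sum(counts.values())
    return -sum(
        (count / total) * math.log2(count / total)
        for count in counts.values()
    )


if __name__ == "__main__":
    # Toy text, tokenised on whitespace (an assumption of this sketch).
    text = "the cat saw the dog and the dog saw the cat"
    print(unigram_entropy(text.split()))
```

A text in which every token is the same word type has zero entropy, while a text that uses eight word types equally often has an entropy of three bits/word.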

Disclosure day on relativity: a science activity beyond the classroom
http://hdl.handle.net/2117/106559
Aragoneses, Andrés; Salán Ballesteros, Maria Núria; Hernández Fernández, Antonio
An important goal for students in engineering education is the ability to present and defend a project in front of a technical audience. We have designed an activity that helps students develop independent learning and communication skills while introducing them to the dynamics of a conference. In this activity, students prepare and present a poster at a popular physics conference on relativity. The activity is shown to provide them with communication skills that are among the generic skills at the core of Universitat Politècnica de Catalunya (UPC) degrees and that are relevant to most of the duties of an engineer.
2017-07-18T08:45:53Z

Random crossings in dependency trees
http://hdl.handle.net/2117/106079
Ferrer Cancho, Ramon
It has been hypothesized that the rather small number of crossings in real syntactic dependency trees is a side effect of pressure for dependency length minimization. Here we answer a related important research question: what would be the expected number of crossings if the natural order of a sentence was lost and replaced by a random ordering? We show that this number depends only on the number of vertices of the dependency tree (the sentence length) and the second moment about zero of vertex degrees. The expected number of crossings is minimum for a star tree (crossings are impossible) and maximum for a linear tree (the number of crossings is of the order of the square of the sequence length).
2017-07-03T08:13:04Z
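The dependence on sentence length and on the second moment of degrees can be made concrete with a small sketch. It rests on two standard counting facts that are not spelled out in the abstract and are stated here as assumptions: two edges can cross only if they share no vertex, and two such edges cross with probability 1/3 when the vertices are ordered uniformly at random. A tree with n vertices then has n(n - 1 - <k^2>)/2 pairs of vertex-disjoint edges, where <k^2> is the mean squared degree, so the expected number of crossings is one third of that. The function names are hypothetical, and the brute-force check is only feasible for very small trees.

```python
from fractions import Fraction
from itertools import combinations, permutations


def expected_crossings(edges):
    """Expected crossings of a tree under a uniformly random ordering."""
    n = len(edges) + 1  # a tree with n vertices has n - 1 edges
    degree = {}
    for u, v in edges:
        degree[u] = degree.get(u, 0) + 1
        degree[v] = degree.get(v, 0) + 1
    sum_k2 = sum(k * k for k in degree.values())  # equals n * <k^2>
    # Pairs of edges with no common vertex: n * (n - 1 - <k^2>) / 2.
    disjoint_pairs = Fraction(n * (n - 1) - sum_k2, 2)
    return disjoint_pairs / 3


def count_crossings(edges, position):
    """Count crossing edge pairs for one linear ordering of the vertices."""
    crossings = 0
    for (a, b), (c, d) in combinations(edges, 2):
        if len({a, b, c, d}) < 4:
            continue  # edges sharing a vertex cannot cross
        lo1, hi1 = sorted((position[a], position[b]))
        lo2, hi2 = sorted((position[c], position[d]))
        if lo1 < lo2 < hi1 < hi2 or lo2 < lo1 < hi2 < hi1:
            crossings += 1
    return crossings


def brute_force_expected(edges):
    """Average the crossing count over all orderings (tiny trees only)."""
    vertices = sorted({v for edge in edges for v in edge})
    total, orderings = 0, 0
    for order in permutations(vertices):
        position = {v: i for i, v in enumerate(order)}
        total += count_crossings(edges, position)
        orderings += 1
    return Fraction(total, orderings)


if __name__ == "__main__":
    star = [(0, i) for i in range(1, 5)]     # star tree, 5 vertices
    path = [(i, i + 1) for i in range(4)]    # linear tree, 5 vertices
    for name, tree in (("star", star), ("path", path)):
        print(name, expected_crossings(tree), brute_force_expected(tree))
```

For the star tree every pair of edges shares the hub, so the count is zero. For the linear tree the number of vertex-disjoint edge pairs is (n - 2)(n - 3)/2, which is quadratic in n, consistent with the two extremes named in the abstract.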

A correction on Shiloach's algorithm for minimum linear arrangement of trees
http://hdl.handle.net/2117/106035
Esteban Ángeles, Juan Luis; Ferrer Cancho, Ramon
More than 30 years ago, Shiloach published an algorithm to solve the minimum linear arrangement problem for undirected trees. Here we fix a small error in the original version of the algorithm and discuss its effect on subsequent literature. We also improve some aspects of the notation.
2017-06-30T11:25:05Z
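For readers unfamiliar with the problem named in this abstract, the sketch below illustrates what a minimum linear arrangement is: the vertices of a tree are placed at distinct integer positions, and the cost is the sum of the distances between the endpoints of each edge. The code is an exhaustive search over all orderings, intended only to make the objective concrete on tiny trees. It is not Shiloach's algorithm and does not reproduce the correction described in the paper; the function names are hypothetical.

```python
from itertools import permutations


def arrangement_cost(edges, position):
    """Sum of edge lengths for a given assignment of vertices to positions."""
    return sum(abs(position[u] - position[v]) for u, v in edges)


def brute_force_min_cost(edges):
    """Minimum arrangement cost by trying every ordering (tiny trees only)."""
    vertices = sorted({v for edge in edges for v in edge})
    return min(
        arrangement_cost(edges, {v: i for i, v in enumerate(order)})
        for order in permutations(vertices)
    )


if __name__ == "__main__":
    star = [(0, i) for i in range(1, 5)]   # hub 0 with four leaves
    path = [(i, i + 1) for i in range(3)]  # path on four vertices
    print(brute_force_min_cost(star))
    print(brute_force_min_cost(path))
```

The exhaustive search takes factorial time in the number of vertices, which is why a dedicated algorithm for trees, such as the one the paper corrects, matters in practice.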

Grammar logicised: relativisation
http://hdl.handle.net/2117/105058
Morrill, Glyn
Many variants of categorial grammar assume an underlying logic which is associative and linear. In relation to left extraction, the former property is challenged by island domains, which involve nonassociativity, and the latter property is challenged by parasitic gaps, which involve nonlinearity. We present a version of type logical grammar including ‘structural inhibition’ for nonassociativity and ‘structural facilitation’ for nonlinearity and we give an account of relativisation including islands and parasitic gaps and their interaction.
2017-05-31T08:49:12Z

Positive isometric averaging operators on l2(Z,µ)
http://hdl.handle.net/2117/104392
Boza Rocho, Santiago; Soria de Diego, Javier
© 2016 Springer International Publishing. We show that positive isometric averaging operators on the sequence space l2(Z,µ) are determined by very subtle arithmetic conditions on µ (even for very simple examples), contrary to what happens in the continuous case (Formula presented.), where any possible average value is realized by a suitable positive isometry.
2017-05-15T07:32:34Z

Nonuniform complexity classes specified by lower and upper bounds
http://hdl.handle.net/2117/104347
Balcázar Navarro, José Luis; Gabarró Vallès, Joaquim
We characterize, in terms of oracle Turing machines, the classes defined by exponential lower bounds on some nonuniform complexity measures. Afterwards, we use the same methods to give a new characterization of classes defined by polynomial and polylog upper bounds, obtaining a unified approach to deal with upper and lower bounds. The main measures are the initial index, the context-free cost and the Boolean circuit size. We interpret our results by discussing a trade-off between oracle information and computed information for oracle Turing machines.
2017-05-12T08:15:24Z

Immunity and simplicity in relativizations of probabilistic complexity classes
http://hdl.handle.net/2117/103554
Balcázar Navarro, José Luis; Russo, David A.
The existence of immune and simple sets in relativizations of the probabilistic polynomial time bounded classes is studied. Some techniques previously used to show similar results for relativizations of P and NP are adapted to the probabilistic classes. Using these results, an exhaustive settling of all possible strong separations among these relativized classes is obtained.
2017-04-19T14:00:20Z

Sparse sets, lowness, and highness
http://hdl.handle.net/2117/103245
Balcázar Navarro, José Luis; Book, R; Schoening, U
We develop the notions of “generalized lowness” for sets in PH (the union of the polynomial-time hierarchy) and of “generalized highness” for arbitrary sets. Also, we develop the notions of “extended lowness” and “extended highness” for arbitrary sets. These notions extend the decomposition of NP into low sets and high sets developed by Schöning [15] and studied by Ko and Schöning [9].
We show that either every sparse set in PH is generalized high or no sparse set in PH is generalized high. Further, either every sparse set is extended high or no sparse set is extended high. In both situations, the former case corresponds to the polynomial-time hierarchy having only finitely many levels while the latter case corresponds to the polynomial-time hierarchy extending infinitely many levels.
2017-04-04T07:57:03Z

Acoustic sequences in nonhuman animals: a tutorial review and prospectus
http://hdl.handle.net/2117/102816
Kershenbaum, Arik; Blumstein, Daniel T.; Roch, Marie A.; Ferrer Cancho, Ramon
Animal acoustic communication often takes the form of complex sequences, made up of multiple distinct acoustic units. Apart from the well-known example of birdsong, other animals such as insects, amphibians, and mammals (including bats, rodents, primates, and cetaceans) also generate complex acoustic sequences. Occasionally, such as with birdsong, the adaptive role of these sequences seems clear (e.g. mate attraction and territorial defence). More often, however, researchers have only begun to characterise, let alone understand, the significance and meaning of acoustic sequences. Hypotheses abound, but there is little agreement as to how sequences should be defined and analysed. Our review aims to outline suitable methods for testing these hypotheses, and to describe the major limitations to our current and near-future knowledge on questions of acoustic sequences. This review and prospectus is the result of a collaborative effort between 43 scientists from the fields of animal behaviour, ecology and evolution, signal processing, machine learning, quantitative linguistics, and information theory, who gathered for a 2013 workshop entitled 'Analysing vocal sequences in animals'. Our goal is to present not just a review of the state of the art, but to propose a methodological framework that summarises what we suggest are the best practices for research in this field, across taxa and across disciplines. We also provide a tutorial-style introduction to some of the most promising algorithmic approaches for analysing sequences. We divide our review into three sections: identifying the distinct units of an acoustic sequence, describing the different ways that information can be contained within a sequence, and analysing the structure of that sequence. Each of these sections is further subdivided to address the key questions and approaches in that area.
We propose a uniform, systematic, and comprehensive approach to studying sequences, with the goal of clarifying research terms used in different fields, and facilitating collaboration and comparative studies. Allowing greater interdisciplinary collaboration will facilitate the investigation of many important questions in the evolution of communication and sociality.
2017-03-23T08:55:32Z