Correlations in the organization of large-scale syntactic dependency networks
Document type: Conference report
Publisher: Association for Computational Linguistics
Rights access: Open Access
We study the correlations in the connectivity patterns of large-scale syntactic dependency networks. These networks are induced from treebanks: their vertices denote word forms which occur as nuclei of dependency trees. Their edges connect pairs of vertices if at least two instance nuclei of these vertices are linked in the dependency structure of a sentence. We examine the syntactic dependency networks of seven languages. In all these cases, we consistently obtain three findings. Firstly, clustering, i.e., the probability that two vertices which are linked to a common vertex are themselves linked, is much higher than expected by chance. Secondly, the mean clustering of vertices decreases with their degree; this finding suggests the presence of a hierarchical network organization. Thirdly, the mean degree of the nearest neighbors of a vertex x tends to decrease as the degree of x grows; this finding indicates disassortative mixing, in the sense that links tend to connect vertices of dissimilar degrees. Our results indicate the existence of common patterns in the large-scale organization of syntactic dependency networks.
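The three quantities named in the abstract (the clustering of a vertex, mean clustering as a function of degree, and the mean degree of nearest neighbors as a function of degree) can be illustrated with a minimal Python sketch. This is not the authors' code: the toy edge list, the word forms in it, and the function names `build_graph`, `local_clustering`, `mean_neighbor_degree` and `mean_by_degree` are all assumptions made for illustration, and the graph is treated as undirected and unweighted.

```python
from collections import defaultdict
from itertools import combinations


def build_graph(edges):
    """Build an undirected adjacency map from (word, word) pairs, skipping self-loops."""
    adj = defaultdict(set)
    for u, v in edges:
        if u != v:
            adj[u].add(v)
            adj[v].add(u)
    return adj


def local_clustering(adj, v):
    """Fraction of pairs of neighbors of v that are themselves linked."""
    nbrs = adj[v]
    k = len(nbrs)
    if k < 2:
        return 0.0  # convention assumed here: undefined clustering is counted as 0
    linked = sum(1 for a, b in combinations(nbrs, 2) if b in adj[a])
    return 2.0 * linked / (k * (k - 1))


def mean_neighbor_degree(adj, v):
    """Mean degree of the nearest neighbors of v."""
    return sum(len(adj[u]) for u in adj[v]) / len(adj[v])


def mean_by_degree(adj, value):
    """Average a per-vertex quantity over all vertices sharing the same degree."""
    buckets = defaultdict(list)
    for v in adj:
        buckets[len(adj[v])].append(value(adj, v))
    return {k: sum(vals) / len(vals) for k, vals in sorted(buckets.items())}


# Hypothetical toy network, not taken from any treebank.
toy_edges = [
    ("the", "dog"), ("the", "cat"), ("the", "house"), ("the", "garden"),
    ("dog", "cat"), ("dog", "barks"),
]
adj = build_graph(toy_edges)

clustering_by_degree = mean_by_degree(adj, local_clustering)
knn_by_degree = mean_by_degree(adj, mean_neighbor_degree)

print(clustering_by_degree)
print(knn_by_degree)
```

In this toy graph, the mean neighbor degree falls as degree grows, and among vertices of degree 2 or more the mean clustering falls as well, which is the qualitative shape the abstract describes. A graph this small says nothing about real treebanks; it only shows how the quantities are computed.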
Citation: Ferrer-i-Cancho, R. [et al.]. Correlations in the organization of large-scale syntactic dependency networks. A: Workshop on Graph-Based Algorithms for Natural Language Processing. "HLT-NAACL 2007 - TextGraphs 2007: Graph-Based Algorithms for Natural Language Processing: proceedings of the Workshop 2007". Association for Computational Linguistics, 2007, p. 65-72.