Simple semi-supervised dependency parsing
Document type: Conference paper
Publication date: 2008
Access conditions: Open access
All rights reserved. This work is protected by the corresponding intellectual and industrial
property rights. Without prejudice to existing legal exemptions, its reproduction, distribution,
public communication, or transformation is prohibited without the authorization of the rights holder.
Abstract
We present a simple and effective semi-supervised method for training dependency parsers. We focus on the problem of lexical representation, introducing features that incorporate word clusters derived from a large unannotated corpus. We demonstrate the effectiveness of the approach in a series of dependency parsing experiments on the Penn Treebank and Prague Dependency Treebank, and we show that the cluster-based features yield substantial gains in performance across a wide range of conditions. For example, in the case of English unlabeled second-order parsing, we improve from a baseline accuracy of 92.02% to 93.16%, and in the case of Czech unlabeled second-order parsing, we improve from a baseline accuracy of 86.13% to 87.13%. In addition, we demonstrate that our method also improves performance when small amounts of training data are available, and can roughly halve the amount of supervised data required to reach a desired level of performance.
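To make the idea of cluster-based features concrete, the following is a minimal sketch of how word clusters might be folded into the features of a single candidate dependency arc. Everything in it is an illustrative assumption rather than the authors' implementation: the `CLUSTERS` table, the bit-string cluster IDs, the prefix length of 4, and the feature templates in `arc_features` are all hypothetical.

```python
from typing import Dict, List

# Hypothetical cluster table: word -> bit-string cluster ID. The IDs are made
# up; in practice they would come from clustering a large unannotated corpus.
CLUSTERS: Dict[str, str] = {
    "apple": "101100",
    "pear": "101101",
    "eats": "0111",
}


def cluster_prefix(word: str, length: int) -> str:
    """Return the first `length` bits of the word's cluster ID, or "UNK"."""
    bits = CLUSTERS.get(word.lower())
    return bits[:length] if bits is not None else "UNK"


def arc_features(head: str, head_pos: str, mod: str, mod_pos: str) -> List[str]:
    """Build string features for one candidate head -> modifier arc."""
    hc, mc = cluster_prefix(head, 4), cluster_prefix(mod, 4)
    return [
        # Baseline-style features over word forms and part-of-speech tags.
        f"hw={head}|mw={mod}",
        f"hp={head_pos}|mp={mod_pos}",
        # Cluster-based features: cluster prefixes stand in for the words.
        f"hc={hc}|mc={mc}",
        f"hp={head_pos}|mc={mc}",
        f"hc={hc}|mp={mod_pos}",
    ]


if __name__ == "__main__":
    for feature in arc_features("eats", "VBZ", "apple", "NN"):
        print(feature)
```

In this sketch, "apple" and "pear" share the same 4-bit cluster prefix, so an arc seen in training with one of them produces the same cluster features as an arc with the other, even though their word-level features differ. That sharing is the intuition behind using clusters to address sparse lexical statistics.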
Citation: Koo, T.; Carreras, X.; Collins, M. Simple semi-supervised dependency parsing. In: Annual Meeting of the Association for Computational Linguistics. "46th Annual Meeting of the Association for Computational Linguistics". Columbus, Ohio: 2008, p. 595-603.
Publisher's version: http://aclweb.org/anthology-new/P/P08/P08-1068.pdf
Files | Description | Size | Format | View |
---|---|---|---|---|
koo08acl.pdf | | 154.5 KB | | View/Open |