
dc.contributor.author: Màrquez Villodre, Lluís
dc.contributor.author: Padró, Lluís
dc.contributor.author: Rodríguez Hontoria, Horacio
dc.contributor.other: Universitat Politècnica de Catalunya. Departament de Ciències de la Computació
dc.date.accessioned: 2016-11-11T09:37:06Z
dc.date.available: 2016-11-11T09:37:06Z
dc.date.issued: 1997-12
dc.identifier.citation: Marquez, L., Padro, L., Rodriguez, H. "A Machine learning approach to POS tagging". 1997.
dc.identifier.uri: http://hdl.handle.net/2117/96517
dc.description.abstract: We have applied inductive learning of statistical decision trees and relaxation labelling to the Natural Language Processing (NLP) task of morphosyntactic disambiguation (Part Of Speech Tagging). The learning process is supervised and obtains a language model oriented to resolve POS ambiguities. This model consists of a set of statistical decision trees expressing distribution of tags and words in some relevant contexts. The acquired language models are complete enough to be directly used as sets of POS disambiguation rules, and include more complex contextual information than simple collections of n-grams usually used in statistical taggers. We have implemented a quite simple and fast tagger that has been tested and evaluated on the Wall Street Journal (WSJ) corpus with a remarkable accuracy. However, better results can be obtained by translating the trees into rules to feed a flexible relaxation labelling based tagger. In this direction we describe a tagger which is able to use information of any kind (n-grams, automatically acquired constraints, linguistically motivated manually written constraints, etc.), and in particular to incorporate the machine learned decision trees. Simultaneously, we address the problem of tagging when only small training material is available, which is crucial in any process of constructing, from scratch, an annotated corpus. We show that quite high accuracy can be achieved with our system in this situation.
dc.format.extent: 29 p.
dc.language.iso: eng
dc.relation.ispartofseries: LSI-97-57-R
dc.subject: UPC subject areas::Computer science::Theoretical computer science
dc.subject.other: Inductive learning
dc.subject.other: Statistical decision trees
dc.subject.other: Relaxation labelling
dc.subject.other: Natural language processing
dc.subject.other: NLP
dc.subject.other: Morphosyntactic disambiguation
dc.subject.other: Part of speech tagging
dc.subject.other: POS
dc.title: A Machine learning approach to POS tagging
dc.type: External research report
dc.contributor.group: Universitat Politècnica de Catalunya. GPLN - Grup de Processament del Llenguatge Natural
dc.rights.access: Open Access
local.identifier.drac: 629954
dc.description.version: Postprint (published version)
local.citation.author: Marquez, L.; Padro, L.; Rodriguez, H.

