dc.contributor.author: Catala Roig, Neus
dc.contributor.other: Universitat Politècnica de Catalunya. Departament de Ciències de la Computació
dc.identifier.citation: Català Roig, N. "ESSENCE: a portable methodology for building information extraction systems". 1998.
dc.description.abstract: One of the most important issues when constructing an Information Extraction System is how to obtain the knowledge needed to identify relevant information in a document. A manual approach is not only expensive but also harms the portability of the system across domains. Automating the knowledge acquisition process may partially solve this problem, even if a human expert still takes part in it for specific tasks. This work presents a methodology (ESSENCE) to automatically learn information extraction patterns from an unrestricted text corpus representative of the domain. The methodology comprises several steps, among which we stress the specific pattern generalization process. Generalization reduces the pattern base and therefore the amount of information an expert has to validate. As we will see, lexical knowledge, together with the lexico-semantic relations from WordNet, is our basic knowledge source, especially for the generalization process.
dc.format.extent: 15 p.
dc.subject: UPC subject areas::Computer science::Information systems
dc.subject.other: Information extraction
dc.subject.other: Information extraction systems portability
dc.subject.other: Learning information extraction patterns
dc.title: ESSENCE: a portable methodology for building information extraction systems
dc.type: External research report
dc.contributor.group: Universitat Politècnica de Catalunya. GPLN - Grup de Processament del Llenguatge Natural
dc.rights.access: Open Access
dc.description.version: Postprint (published version)
local.citation.author: Català Roig, N.

All rights reserved. This work is protected by the corresponding intellectual and industrial property rights. Without prejudice to any existing legal exemptions, reproduction, distribution, public communication or transformation of this work is prohibited without permission of the copyright holder.