
dc.contributor.author: Koci, Elvis
dc.contributor.author: Thiele, Maik
dc.contributor.author: Lehner, Wolfgang
dc.contributor.author: Romero Moral, Óscar
dc.contributor.other: Universitat Politècnica de Catalunya. Departament d'Enginyeria de Serveis i Sistemes d'Informació
dc.identifier.citation: Koci, E. [et al.]. Table recognition in spreadsheets via a graph representation. In: IAPR International Workshop on Document Analysis Systems. "Proceedings - 13th IAPR International Workshop on Document Analysis Systems, DAS 2018". 2018, p. 139-144.
dc.description.abstract: Spreadsheet applications are very popular data management tools. Their ease of use and abundant functionality equip novices and professionals alike with the means to generate, transform, analyze, and visualize data. As a result, spreadsheets are a rich source of factual and structured information, which accentuates the need to automatically understand and extract their contents. In this paper, we present a novel approach for recognizing tables in spreadsheets. Having inferred the layout role of the individual cells, we build layout regions. We encode the spatial interrelations between these regions using a graph representation. Based on this, we propose Remove and Conquer (RAC), an algorithm for table recognition that implements a list of carefully curated rules. An extensive experimental evaluation shows that our approach is viable: we achieve significant accuracy on a dataset of real spreadsheets from various domains. © 2018 IEEE.
dc.format.extent: 6 p.
dc.subject: Àrees temàtiques de la UPC::Informàtica::Sistemes d'informació
dc.subject.lcsh: Spreadsheet software
dc.subject.other: Table Identification
dc.subject.other: Table Recognition
dc.subject.other: Information management
dc.subject.other: Data management tools
dc.subject.other: Experimental evaluation
dc.subject.other: Graph representation
dc.subject.other: Rule based
dc.subject.other: Spreadsheet software
dc.subject.other: Structured information
dc.title: Table recognition in spreadsheets via a graph representation
dc.type: Conference report
dc.subject.lemac: Full de càlcul
dc.contributor.group: Universitat Politècnica de Catalunya. IMP - Information Modeling and Processing
dc.description.peerreviewed: Peer Reviewed
dc.rights.access: Open Access
dc.description.version: Postprint (author's final draft)
upcommons.citation.author: Koci, E.; Thiele, M.; Lehner, W.; Romero, O.
upcommons.citation.contributor: IAPR International Workshop on Document Analysis Systems
upcommons.citation.publicationName: Proceedings - 13th IAPR International Workshop on Document Analysis Systems, DAS 2018


All rights reserved. This work is protected by the corresponding intellectual and industrial property rights. Without prejudice to any existing legal exemptions, reproduction, distribution, public communication or transformation of this work is prohibited without permission of the copyright holder.