dc.contributor.author: Sabir, Ahmed
dc.contributor.author: Moreno-Noguer, Francesc
dc.contributor.author: Padró, Lluís
dc.contributor.other: Institut de Robòtica i Informàtica Industrial
dc.contributor.other: Universitat Politècnica de Catalunya. Departament de Ciències de la Computació
dc.date.accessioned: 2020-04-03T11:22:33Z
dc.date.available: 2020-04-03T11:22:33Z
dc.date.issued: 2018
dc.identifier.citation: Sabir, A.; Moreno-Noguer, F.; Padro, L. Enhancing text spotting with a language model and visual context information. In: Congrés Internacional de l’Associació Catalana d’Intel·ligència Artificial. "Artificial intelligence research and development : proceedings of the 21st International Conference of the Catalan Association for Artificial Intelligence". IOS Press, 2018, p. 271-280.
dc.identifier.isbn: 978-1-61499-917-1
dc.identifier.uri: http://hdl.handle.net/2117/183084
dc.description.abstract: This paper addresses the problem of detecting and recognizing text in images acquired ‘in the wild’. This is a severely under-constrained problem that must contend with a number of challenges, including large occlusions, changing lighting conditions, cluttered backgrounds, and different font types and sizes. To address this problem, we leverage recent and successful developments at the intersection of machine learning and natural language understanding. In particular, we initially rely on off-the-shelf deep networks that are already trained on large amounts of data and that provide a series of text hypotheses per input image. The outputs of this network are then combined with different priors obtained from both the semantic interpretation of the image and a scene-based language model. As a result of this combination, the performance of the original network is consistently boosted. We validate our approach on the ICDAR’17 shared task dataset.
dc.format.extent: 10 p.
dc.language.iso: eng
dc.publisher: IOS Press
dc.rights: Attribution-NonCommercial-NoDerivs 3.0 Spain
dc.rights.uri: http://creativecommons.org/licenses/by-nc-nd/3.0/es/
dc.subject: UPC subject areas::Computer science::Automation and control
dc.subject.other: Computer vision
dc.title: Enhancing text spotting with a language model and visual context information
dc.type: Conference report
dc.contributor.group: Universitat Politècnica de Catalunya. ROBiri - Grup de Robòtica de l'IRI
dc.contributor.group: Universitat Politècnica de Catalunya. GPLN - Grup de Processament del Llenguatge Natural
dc.identifier.doi: 10.3233/978-1-61499-918-8-271
dc.description.peerreviewed: Peer Reviewed
dc.subject.inspec: INSPEC classification::Pattern recognition::Computer vision
dc.relation.publisherversion: http://ebooks.iospress.nl/publication/50417
dc.rights.access: Open Access
local.identifier.drac: 23845320
dc.description.version: Postprint (author's final draft)
local.citation.author: Sabir, A.; Moreno-Noguer, F.; Padro, L.
local.citation.contributor: Congrés Internacional de l’Associació Catalana d’Intel·ligència Artificial
local.citation.publicationName: Artificial intelligence research and development : proceedings of the 21st International Conference of the Catalan Association for Artificial Intelligence
local.citation.startingPage: 271
local.citation.endingPage: 280


Except where otherwise noted, content on this work is licensed under a Creative Commons license: Attribution-NonCommercial-NoDerivs 3.0 Spain