dc.contributor.author | Sabir, Ahmed |
dc.contributor.author | Moreno-Noguer, Francesc |
dc.contributor.author | Padró, Lluís |
dc.contributor.other | Institut de Robòtica i Informàtica Industrial |
dc.contributor.other | Universitat Politècnica de Catalunya. Departament de Ciències de la Computació |
dc.date.accessioned | 2020-04-03T11:22:33Z |
dc.date.available | 2020-04-03T11:22:33Z |
dc.date.issued | 2018 |
dc.identifier.citation | Sabir, A.; Moreno-Noguer, F.; Padro, L. Enhancing text spotting with a language model and visual context information. In: Congrés Internacional de l’Associació Catalana d’Intel·ligència Artificial. "Artificial intelligence research and development : proceedings of the 21th International Conference of the Catalan Association for Artificial Intelligence". IOS Press, 2018, p. 271-280. |
dc.identifier.isbn | 978-1-61499-917-1 |
dc.identifier.uri | http://hdl.handle.net/2117/183084 |
dc.description.abstract | This paper addresses the problem of detecting and recognizing text in images acquired ‘in the wild’. This is a severely under-constrained problem that must handle a number of challenges, including large occlusions, changing lighting conditions, cluttered backgrounds, and different font types and sizes. To address this problem, we leverage recent and successful developments at the intersection of machine learning and natural language understanding. In particular, we initially rely on off-the-shelf deep networks already trained on large amounts of data, which provide a series of text hypotheses per input image. The outputs of this network are then combined with different priors obtained from both the semantic interpretation of the image and a scene-based language model. As a result of this combination, the performance of the original network is consistently boosted. We validate our approach on the ICDAR’17 shared task dataset. |
dc.format.extent | 10 p. |
dc.language.iso | eng |
dc.publisher | IOS Press |
dc.rights | Attribution-NonCommercial-NoDerivs 3.0 Spain |
dc.rights.uri | http://creativecommons.org/licenses/by-nc-nd/3.0/es/ |
dc.subject | UPC subject areas::Computer science::Automation and control |
dc.subject.other | Computer vision |
dc.title | Enhancing text spotting with a language model and visual context information |
dc.type | Conference report |
dc.contributor.group | Universitat Politècnica de Catalunya. ROBiri - Grup de Robòtica de l'IRI |
dc.contributor.group | Universitat Politècnica de Catalunya. GPLN - Grup de Processament del Llenguatge Natural |
dc.identifier.doi | 10.3233/978-1-61499-918-8-271 |
dc.description.peerreviewed | Peer Reviewed |
dc.subject.inspec | INSPEC classification::Pattern recognition::Computer vision |
dc.relation.publisherversion | http://ebooks.iospress.nl/publication/50417 |
dc.rights.access | Open Access |
local.identifier.drac | 23845320 |
dc.description.version | Postprint (author's final draft) |
local.citation.author | Sabir, A.; Moreno-Noguer, F.; Padro, L. |
local.citation.contributor | Congrés Internacional de l’Associació Catalana d’Intel·ligència Artificial |
local.citation.publicationName | Artificial intelligence research and development : proceedings of the 21th International Conference of the Catalan Association for Artificial Intelligence |
local.citation.startingPage | 271 |
local.citation.endingPage | 280 |