WordFences: Text localization and recognition

Polzounov, Andrei

Tutor / director: Escalera, Sergio; Lu, Shijian

Carried out at/with: Institute for Infocomm Research

Document type: Official master's final project

Date: 2017-01

Access conditions: Open access

All rights reserved. This work is protected by the corresponding intellectual and industrial property rights. Without prejudice to existing legal exemptions, its reproduction, distribution, public communication, or transformation is prohibited without the authorization of the rights holder.

Abstract

In recent years, text recognition has achieved remarkable success on scanned document text. However, word recognition in natural images remains an open problem and generally requires time-consuming post-processing steps. We present a novel architecture for individual word detection in scene images based on semantic segmentation. Our contributions are twofold: the concept of WordFence, which detects the border areas surrounding each individual word, and a unique pixelwise weighted softmax loss function that penalizes background and emphasizes small text regions. WordFence ensures that each word is detected individually, and the new loss function provides a strong training signal for both text and word border localization. The proposed technique avoids intensive post-processing by combining semantic word segmentation with a voting scheme for merging segmentations at multiple scales, producing an end-to-end word detection system. We achieve superior localization recall on common benchmark datasets: 92% recall on ICDAR11 and ICDAR13, and 63% recall on SVT. Furthermore, end-to-end word recognition achieves a state-of-the-art 86% F-score on ICDAR13.
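The pixelwise weighted softmax loss mentioned in the abstract can be illustrated with a small NumPy sketch. This is not the thesis's formulation, which the abstract does not give. The label encoding (0 = background, 1 = text, 2 = word border) and the inverse-class-frequency weighting in `inverse_frequency_weights` are assumptions chosen only to show how per-pixel weights can down-weight abundant background and up-weight rare text or border pixels.

```python
import numpy as np


def weighted_pixelwise_softmax_loss(logits, labels, weights):
    """Weighted mean cross-entropy over pixels.

    logits:  (H, W, C) float array of per-pixel class scores
    labels:  (H, W) int array of class indices in [0, C)
    weights: (H, W) float array of non-negative per-pixel weights
    """
    # Numerically stable log-softmax over the class axis.
    shifted = logits - logits.max(axis=-1, keepdims=True)
    log_probs = shifted - np.log(np.exp(shifted).sum(axis=-1, keepdims=True))
    # Log-probability assigned to the true class at each pixel.
    true_log_probs = np.take_along_axis(
        log_probs, labels[..., None], axis=-1
    )[..., 0]
    # Normalize by the weight total so the loss scale stays comparable.
    return float(-(weights * true_log_probs).sum() / weights.sum())


def inverse_frequency_weights(labels, num_classes):
    """Assumed weighting scheme (not from the thesis): each pixel gets
    weight = total pixels / number of pixels sharing its class."""
    counts = np.bincount(labels.ravel(), minlength=num_classes).astype(float)
    class_weights = np.where(
        counts > 0, labels.size / np.maximum(counts, 1.0), 0.0
    )
    return class_weights[labels]


if __name__ == "__main__":
    # Toy 1x4 label map: three background pixels and one text pixel.
    labels = np.array([[0, 0, 0, 1]])
    logits = np.zeros((1, 4, 3))  # uniform scores over 3 assumed classes
    weights = inverse_frequency_weights(labels, num_classes=3)
    print(weights)
    print(weighted_pixelwise_softmax_loss(logits, labels, weights))
```

In the toy example the single text pixel receives three times the weight of each background pixel, so a misclassified text pixel contributes more to the loss than a misclassified background pixel. Any other weight map, for instance one that also depends on the size of the text region, can be passed in the same way.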

Description

In collaboration with the Universitat de Barcelona (UB) and the Universitat Rovira i Virgili (URV)

Subjects: Machine learning, Neural networks (Computer science)

Degree: Master's Degree in Artificial Intelligence (2012 curriculum)

URI: http://hdl.handle.net/2117/101911

Collections

Official master's degrees - Master in Artificial Intelligence - MAI

Files: 122778.pdf (PDF, 1,131Mb)

UPCommons. Open knowledge portal of the UPC
