WordFences: Text localization and recognition
CovenanteeInstitute for Infocomm Research
Document typeMaster thesis
Rights accessOpen Access
In recent years, text recognition has achieved remarkable success in recognizing scanned document text. However, word recognition in natural images is still an open problem, which generally requires time consuming post-processing steps. We present a novel architecture for individual word detection in scene images based on semantic segmentation. Our contributions are twofold: the concept of WordFence, which detects border areas surrounding each individual word and a unique pixelwise weighted softmax loss function which penalizes background and emphasizes small text regions. WordFence ensures that each word is detected individually, and the new loss function provides a strong training signal to both text and word border localization. The proposed technique avoids intensive post-processing by combining semantic word segmentation with a voting scheme for merging segmentations of multiple scales, producing an end-to-end word detection system. We achieve superior localization recall on common benchmark datasets - 92% recall on ICDAR11 and ICDAR13 and 63% recall on SVT. Furthermore, end-to-end word recognition achieves state-of-the-art 86% F-Score on ICDAR13.
En col·laboració amb la Universitat de Barcelona (UB) i la Universitat Rovira i Virgili (URV)
All rights reserved. This work is protected by the corresponding intellectual and industrial property rights. Without prejudice to any existing legal exemptions, reproduction, distribution, public communication or transformation of this work are prohibited without permission of the copyright holder