Scene text localization and recognition (also known as text localization and recognition in real-world images, natural scene OCR, or the text-in-the-wild problem) is an open problem that is attracting increasing interest from researchers. In this paper, we address only the localization task; recognition is outside the scope of this work. For scene text localization, Scale-Invariant Feature Transform (SIFT) keypoints are extracted from the images and classified as text or non-text.
Subsequently, the text keypoints are used to compute bounding boxes around text regions. The proposed technique is tested on the dataset of the ICDAR 2013 Robust Reading Competition – Challenge 2, and the experimental results are reported in detail. Although the idea introduced here is still in its infancy, it already achieves remarkable results, and the large room for improvement makes it promising.
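
As an illustration of the last step of this pipeline, the following minimal sketch turns keypoints that have already been labeled as text into bounding boxes. It is not the procedure evaluated in this paper: the function name `text_bounding_boxes`, the proximity-based grouping rule, and the linking distance `max_gap` are hypothetical choices made only for this example, and the text/non-text labels are assumed to come from a separately trained keypoint classifier.

```python
from collections import deque


def text_bounding_boxes(keypoints, labels, max_gap=20.0):
    """Group text keypoints by proximity and return one box per group.

    keypoints: list of (x, y) keypoint locations in pixels.
    labels:    list of bool, True where the keypoint was classified as text.
    max_gap:   hypothetical linking distance in pixels (an assumption).
    Returns a sorted list of (x_min, y_min, x_max, y_max) tuples.
    """
    # Keep only the keypoints classified as text.
    pts = [p for p, is_text in zip(keypoints, labels) if is_text]
    limit = max_gap * max_gap  # compare squared distances, no sqrt needed
    unvisited = set(range(len(pts)))
    boxes = []

    while unvisited:
        # Start a new group and grow it with a breadth-first search.
        seed = unvisited.pop()
        queue = deque([seed])
        group = [seed]
        while queue:
            i = queue.popleft()
            near = [
                j for j in unvisited
                if (pts[i][0] - pts[j][0]) ** 2
                + (pts[i][1] - pts[j][1]) ** 2 <= limit
            ]
            for j in near:
                unvisited.remove(j)
                queue.append(j)
                group.append(j)

        # The box is the axis-aligned extent of the group.
        xs = [pts[i][0] for i in group]
        ys = [pts[i][1] for i in group]
        boxes.append((min(xs), min(ys), max(xs), max(ys)))

    return sorted(boxes)


# Toy input: two clusters of text keypoints and one non-text keypoint.
keypoints = [(10, 10), (20, 12), (30, 11), (200, 50), (210, 55), (100, 100)]
labels = [True, True, True, True, True, False]
print(text_bounding_boxes(keypoints, labels))
# [(10, 10, 30, 12), (200, 50, 210, 55)]
```

In this sketch the non-text keypoint at (100, 100) is discarded before grouping, so it neither produces a box nor links the two text regions. A real system would also need to account for keypoint scale and for misclassified keypoints, which this example ignores.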