ISSN ONLINE (2320-9801) PRINT (2320-9798)
Detection of Text with Connected Component Clustering
Text detection and recognition is an active research topic in image processing. It has drawn the attention of the content-based image retrieval (CBIR) community as a way to bridge the semantic gap between low-level and high-level features. Several methods have been developed for text detection and extraction that achieve reasonable accuracy for natural scene text (camera images) as well as multi-oriented text. However, most of these methods rely on a classifier and a large number of training samples to improve text detection accuracy. The multi-orientation problem can be addressed with connected component analysis. We extract connected components (CCs) from images using the maximally stable extremal region (MSER) algorithm. The extracted CCs are partitioned into clusters to generate candidate regions: we train an AdaBoost classifier that determines the adjacency relationship between CCs, and cluster the CCs using their pairwise relations. The scale, skew, and color of each candidate can be estimated from its CCs, so we normalize the candidates and develop a text/non-text classifier for the normalized images. This classifier is based on multilayer perceptrons, and its recall and precision rates can be controlled with a single free parameter. Finally, we extend our approach to exploit multichannel information. Experimental results on the ICDAR 2005 and 2011 robust reading competition datasets show that our method yields state-of-the-art performance in both speed and accuracy.
B. Nishanthi, S. Shahul Hammed
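The clustering step named in the abstract (grouping CCs into candidate regions from their pairwise relations) can be illustrated with a minimal Python sketch. It is not the authors' implementation: the `Component` structure, the `is_adjacent` heuristic, and its thresholds are assumptions made for illustration, and the hand-written rule merely stands in for the trained AdaBoost adjacency classifier. Only the grouping logic (union-find over pairwise adjacency decisions) is the point here.

```python
from dataclasses import dataclass
from itertools import combinations


@dataclass(frozen=True)
class Component:
    """Bounding box of one connected component (hypothetical structure)."""
    x: int  # left edge
    y: int  # top edge
    w: int  # width, assumed > 0
    h: int  # height, assumed > 0


def is_adjacent(a: Component, b: Component) -> bool:
    """Stand-in for the learned pairwise classifier (NOT AdaBoost).

    Assumed heuristic: similar heights, vertical overlap, and a
    horizontal gap smaller than the taller component's height.
    """
    height_ratio = min(a.h, b.h) / max(a.h, b.h)
    overlap = min(a.y + a.h, b.y + b.h) - max(a.y, b.y)
    gap = max(a.x, b.x) - min(a.x + a.w, b.x + b.w)
    return height_ratio > 0.5 and overlap > 0 and gap < max(a.h, b.h)


def cluster_components(components):
    """Group component indices into candidate regions via union-find."""
    parent = list(range(len(components)))

    def find(i):
        # Follow parent links to the root, halving the path on the way.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    # Merge the sets of every pair the adjacency rule accepts.
    for i, j in combinations(range(len(components)), 2):
        if is_adjacent(components[i], components[j]):
            parent[find(i)] = find(j)

    # Collect indices by root; each list is one candidate region.
    clusters = {}
    for i in range(len(components)):
        clusters.setdefault(find(i), []).append(i)
    return sorted(clusters.values())
```

Given three character-sized boxes on one line and one distant thin box, the sketch groups the first three together and leaves the fourth on its own. Replacing `is_adjacent` with a trained classifier's decision would leave the grouping logic unchanged.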