Text embedded in an image contains useful information for applications in the medical, industrial, commercial, and research fields. While many systems have been designed to correctly identify text in images, no prior work addressing the recognition of degraded text on clear plastic has been found. This paper proposes novel methods and an apparatus for extracting text from an image under the practical assumptions of (a) poor background contrast; (b) white, curved, and/or differing fonts or character widths between sets of images; (c) dotted text printed on curved, reflective material; and/or (d) touching characters. The methods were evaluated using a total of 100 unique test images containing a variety of texts captured from water bottles. These tests averaged a processing time of roughly 10 seconds (using MATLAB R2008a on an HP 8510w with 4 GB of RAM and a 2.3 GHz processor), and experimental results yielded an average recognition rate of 90 to 93% using customized systems generated by the proposed development.

1. Introduction

Recognition of degraded characters is a challenging problem in the fields of image processing and optical character recognition (OCR). The accuracy and efficiency of OCR applications depend on the quality of the input image [1–3]. Security applications and data processing have dramatically increased interest in this area; consequently, the ability to replicate and distribute extracted data has become more important [4, 5]. In [6], Jung et al. presented a survey of text information extraction (TIE) from images, assuming no prior knowledge such as location, orientation, number of characters, font, or color.
They also noted that (a) variations in text size, style, orientation, and alignment, together with low image contrast and complex backgrounds, make the problem of automatic text extraction extremely challenging, and (b) a variety of approaches to TIE from images and video have been proposed for specific applications, such as page segmentation, address block location, license plate location, and content-based image/video indexing. In spite of such extensive studies, designing a general-purpose OCR system [5] remains laborious because there is an abundance of possible sources of variation when extracting text from a shaded or textured background, from low-contrast or complex images, or from images having variations in font size, style, color, and orientation [6]. These variations make the problem of automatic TIE extremely arduous. Recently, many
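The low-contrast difficulty noted above can be made concrete with a generic global-thresholding baseline. The sketch below is illustrative only, not part of the proposed method, and is written in Python rather than the MATLAB used in the experiments; it implements Otsu's classical threshold selection, which cleanly splits text from background only when the intensity histogram is well-separated and bimodal, an assumption that faint text on reflective plastic violates.

```python
def otsu_threshold(pixels):
    """Pick the 8-bit threshold that maximizes between-class variance (Otsu)."""
    # Build the grayscale histogram.
    hist = [0] * 256
    for p in pixels:
        hist[p] += 1

    total = len(pixels)
    sum_all = sum(i * h for i, h in enumerate(hist))

    sum_b = 0.0   # running intensity sum of the background class
    w_b = 0       # running pixel count of the background class
    best_t, best_var = 0, -1.0
    for t in range(256):
        w_b += hist[t]
        if w_b == 0:
            continue
        w_f = total - w_b
        if w_f == 0:
            break
        sum_b += t * hist[t]
        m_b = sum_b / w_b                  # background mean
        m_f = (sum_all - sum_b) / w_f      # foreground mean
        var_between = w_b * w_f * (m_b - m_f) ** 2
        if var_between > best_var:
            best_var, best_t = var_between, t
    return best_t


# A well-separated bimodal strip (dark text on bright background)
# thresholds cleanly into two classes.
strip = [20] * 50 + [220] * 50
t = otsu_threshold(strip)
binary = [1 if p > t else 0 for p in strip]
print(t, sum(binary))
```

On the synthetic strip the threshold falls between the two modes and exactly half the pixels are labeled foreground; on the degraded images targeted by this paper, a single global threshold of this kind fails, which motivates the specialized enhancement and segmentation steps developed later.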
References
[1] F. Idris and S. Panchanathan, “Review of image and video indexing techniques,” Journal of Visual Communication and Image Representation, vol. 8, no. 2, pp. 146–166, 1997.
[2] H. Li, D. Doermann, and O. Kia, “Automatic text detection and tracking in digital video,” IEEE Transactions on Image Processing, vol. 9, no. 1, pp. 147–156, 2000.
[3] R. Cattoni, T. Coianiz, S. Messelodi, and C. M. Modena, “Geometric layout analysis techniques for document image understanding: a review,” ITC-IRST Technical Report #9703-09, 1998.
[4] B. A. Yanikoglu, “Pitch-based segmentation and recognition of dot matrix text,” International Journal on Document Analysis and Recognition, vol. 3, no. 1, pp. 34–39, 2000.
[5] H. Liu, M. Wu, G. F. Jin, and Y. Yan, “A post-processing algorithm for the optical recognition of degraded characters,” in Document Recognition and Retrieval VI, vol. 3651 of Proceedings of SPIE, pp. 41–48, The International Society for Optical Engineering, San Jose, Calif, USA, January 1999.
[6] K. Jung, K. I. Kim, and A. K. Jain, “Text information extraction in images and video: a survey,” Pattern Recognition, vol. 37, no. 5, pp. 977–997, 2004.
[7] S. Choi, J. P. Yun, and S. W. Kim, “Text localization and character segmentation algorithms for automatic recognition of slab identification numbers,” Optical Engineering, vol. 48, no. 3, Article ID 037206, 2009.
[8] K. Wang and J. A. Kangas, “Character location in scene images from digital camera,” Pattern Recognition, vol. 36, no. 10, pp. 2287–2299, 2003.
[9] X. Chen, J. Yang, J. Zhang, and A. Waibel, “Automatic detection and recognition of signs from natural scenes,” IEEE Transactions on Image Processing, vol. 13, no. 1, pp. 87–99, 2004.
[10] Y. Liu, S. Goto, and T. Ikenaga, “A contour-based robust algorithm for text detection in color images,” IEICE Transactions on Information and Systems, vol. E89-D, no. 3, pp. 1221–1230, 2006.
[11] B. Zhu and M. Nakagawa, “Segmentation of on-line handwritten Japanese text of arbitrary line direction by a neural network for improving text recognition,” in Proceedings of the 8th International Conference on Document Analysis and Recognition, vol. 1, pp. 157–161, September 2005.
[12] X. Liu, H. Fu, and Y. Jia, “Gaussian mixture modeling and learning of neighboring characters for multilingual text extraction in images,” Pattern Recognition, vol. 41, no. 2, pp. 484–493, 2008.
[13] M. A. El-Shayeb, S. R. El-Beltagy, and A. Rafea, “Comparative analysis of different text segmentation algorithms on Arabic news stories,” in Proceedings of the IEEE International Conference on Information Reuse and Integration (IRI '07), pp. 441–446, August 2007.
[14] D. R. R. Babu, M. Ravishankar, M. Kumar, K. Wadera, and A. Raj, “Degraded character recognition based on gradient pattern,” in Proceedings of the 2nd International Conference on Digital Image Processing, Proceedings of SPIE, February 2010.
[15] R. C. Gonzalez and R. E. Woods, Digital Image Processing, Prentice Hall, Upper Saddle River, NJ, USA, 2nd edition, 2002.
[16] E. A. Silva, K. Panetta, and S. S. Agaian, “Quantifying image similarity using measure of enhancement by entropy,” in Mobile Multimedia/Image Processing for Military and Security Applications, vol. 6579 of Proceedings of SPIE, April 2007, Paper #6579-32.
[17] E. Wharton, K. Panetta, and S. Agaian, “Human visual system based similarity metrics,” in Proceedings of the IEEE International Conference on Systems, Man and Cybernetics (SMC '08), pp. 685–690, October 2008.
[18] C. Fang and J. J. Hull, “A modified character-level deciphering algorithm for OCR in degraded documents,” in IS&T Conference on Document Recognition II, vol. 2422 of Proceedings of SPIE, pp. 76–83, March 1999.
[19] E. Y. Kim, K. Jung, K. Y. Jeong, and H. J. Kim, “Automatic text region extraction using cluster-based templates,” in Proceedings of the International Conference on Advances in Pattern Recognition and Digital Techniques, pp. 418–421, 2000.
[20] L. Likforman-Sulem and M. Sigelle, “Recognition of broken characters from historical printed books using dynamic Bayesian networks,” in Proceedings of the 9th International Conference on Document Analysis and Recognition (ICDAR '07), 2007.
[21] M. Yokobayashi and T. Wakahara, “Binarization and recognition of degraded characters using a maximum separability axis in color space and GAT correlation,” in Proceedings of the 18th International Conference on Pattern Recognition (ICPR '06), pp. 885–888, August 2006.
[22] D. J. Granrath, “The role of human visual models in image processing,” Proceedings of the IEEE, vol. 69, no. 5, pp. 552–561, 1981.
[23] A. Cusmariu, “Method of extracting text present in a color image,” United States Patent no. 6519362 B1, 2003.
[24] S. Liang, M. Ahmadi, and M. Shridhar, “Segmentation of handwritten interference marks using multiple directional stroke planes and reformalized morphological approach,” IEEE Transactions on Image Processing, vol. 6, no. 8, pp. 1195–1202, 1997.
[25] Y. K. Chen and J. F. Wang, “Segmentation of single- or multiple-touching handwritten numeral string using background and foreground analysis,” IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 22, no. 11, pp. 1304–1317, 2000.
[26] E. Wharton, S. Agaian, and K. Panetta, “Comparative study of logarithmic enhancement algorithms with performance measure,” in Image Processing: Algorithms and Systems, Neural Networks, and Machine Learning, vol. 6064 of Proceedings of SPIE, January 2006, Paper #6064-12.
[27] S. S. Agaian, K. Panetta, and A. M. Grigoryan, “Transform-based image enhancement algorithms with performance measure,” IEEE Transactions on Image Processing, vol. 10, no. 3, pp. 367–382, 2001.
[28] P. Xiang, Y. Xiuzi, and Z. Sanyuan, “A hybrid method for robust car plate character recognition,” Engineering Applications of Artificial Intelligence, vol. 18, no. 8, pp. 963–972, 2005.
[29] L. Xu, A. Krzyzak, and C. Y. Suen, “Methods of combining multiple classifiers and their applications to handwriting recognition,” IEEE Transactions on Systems, Man and Cybernetics, vol. 22, no. 3, pp. 418–435, 1992.
[30] D. Chen, J. Luettin, and K. Shearer, “A survey of text detection and recognition in images and videos,” Institut Dalle Molle d'Intelligence Artificielle Perceptive (IDIAP) Research Report IDIAP-RR, 2008.
[31] S. N. Srihari, Y. C. Shin, V. Ramanaprasad, and D. S. Lee, “A system to read names and addresses on tax forms,” Proceedings of the IEEE, vol. 84, no. 7, pp. 1038–1049, 1996.
[32] S. Gopisetty, R. Lorie, J. Mao, M. Mohiuddin, A. Sorin, and E. Yair, “Automated forms-processing software and services,” IBM Journal of Research and Development, vol. 40, no. 2, pp. 211–229, 1996.
[33] N. Gorski, V. Anisimov, E. Augustin, O. Baret, D. Price, and J. C. Simon, “A2iA Check Reader: a family of bank check recognition systems,” in Proceedings of the 5th International Conference on Document Analysis and Recognition, 1999.
[34] K. Mohammad, S. Agaian, and F. Hudson, “Implementation of Digital Electronic Arithmetics and its application in image processing,” Computers and Electrical Engineering, vol. 36, no. 3, pp. 424–434, 2010.
[35] G. Deng, L. W. Cahill, and G. R. Tobin, “Study of logarithmic image processing model and its application to image enhancement,” IEEE Transactions on Image Processing, vol. 4, no. 4, pp. 506–512, 1995.
[36] S. S. Agaian, “Visual morphology,” in Nonlinear Image Processing X, vol. 3646 of Proceedings of SPIE, pp. 139–150, January 1999.
[37] B. Wang, X. F. Li, F. Liu, and F. Q. Hu, “Color text image binarization based on binary texture analysis,” Pattern Recognition Letters, vol. 26, no. 10, pp. 1568–1576, 2005.
[38] C. Thillou and B. Gosselin, “Color binarization for complex camera-based images,” in Proceedings of the Electronic Imaging Conference of the International Society for Optical Imaging, pp. 301–308, January 2005.
[39] M. Unser, “Splines: a perfect fit for medical imaging,” in International Symposium on Medical Imaging: Image Processing (MI '02), Proceedings of SPIE, pp. 225–236, San Diego, Calif, USA, February 2002.
M. Pechwitz and V. Maergner, “Baseline estimation for Arabic handwritten words,” in Proceedings of the 8th International Workshop on Frontiers in Handwriting Recognition (IWFHR '02), August 2002.
[43] N. Kilic, P. Gorgel, O. N. Ucan, and A. Kala, “Multifont Ottoman character recognition using support vector machine,” in Proceedings of the 3rd International Symposium on Communications, Control, and Signal Processing (ISCCSP '08), pp. 328–333, March 2008.
[44] Y. J. Song, K. C. Kim, Y. W. Choi et al., “Text region extraction and text segmentation on camera-captured document style images,” in Proceedings of the Eighth International Conference on Document Analysis and Recognition, Seoul, Korea, August 2005.
[45] S. Sharma, Extraction of Text Regions in Natural Images, Rochester Institute of Technology, Rochester, NY, USA, 2007.
[46] D. Chen, H. Bourlard, and J. P. Thiran, “Text identification in complex background using SVM,” in Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition, vol. 2, pp. II-621–II-626, December 2001.
[47] Z. Saidani, Image and Video Text Recognition Using Convolutional Neural Networks [Ph.D. thesis], LAP Lambert Academic, Saarbrücken, Germany, 2008.
[48] V. Ganapathy and L. W. L. Dennis, “Malaysian vehicle license plate localization and recognition system,” Journal of Systemics, Cybernetics and Informatics, vol. 6, no. 1, 2008.
[49] X. Li, W. Wang, Q. Huang, W. Gao, and L. Qing, “A hybrid text segmentation approach,” in Proceedings of the IEEE International Conference on Multimedia and Expo (ICME '09), pp. 510–513, July 2009.
[50] Q. Ye, W. Gao, and Q. Huang, “Automatic text segmentation from complex background,” in Proceedings of the International Conference on Image Processing (ICIP '04), pp. 2905–2908, October 2004.
[51] J. Gllavata, E. Qeli, and B. Freisleben, “Detecting text in videos using fuzzy clustering ensembles,” in Proceedings of the 8th IEEE International Symposium on Multimedia (ISM '06), pp. 283–290, December 2006.