Abstract
This paper addresses the recognition of both simple and conjunct handwritten characters in Malayalam, a South Indian language. The proposed algorithm recognizes these characters primarily from the strokes and lines they contain. The input is an image of handwritten Malayalam characters, which passes through several processing phases to produce an editable document of Malayalam characters in a predefined format. The paper describes the character identification methods in detail. The OCR process is organized into three modules: Pre-processing, Skeletonization and Recognition. In Pre-processing, the input image is scanned and separated into lines and characters. In Skeletonization, the digital image is transformed into a set of original components. In Recognition, the characters are classified based on their features. Features are extracted by analyzing the position and count of the horizontal and vertical lines in each character. A classification of simple and conjunct characters is also devised from the count and position of the horizontal and vertical lines that make up those characters.
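To make the feature idea concrete, the following is a minimal sketch of counting and locating horizontal and vertical lines in a binarized character image. It is not the authors' LCPA algorithm, whose details the abstract does not give. It assumes that a "line" is a run of at least `min_len` consecutive ink pixels along a row or column, and the names `find_runs`, `line_features`, `min_len` and `GLYPH_L` are hypothetical, introduced only for illustration.

```python
from typing import Dict, List, Sequence, Tuple

Image = List[List[int]]  # assumed encoding: 1 = ink pixel, 0 = background


def find_runs(seq: Sequence[int], min_len: int) -> List[Tuple[int, int]]:
    """Return (start, length) for every run of ink pixels of at least min_len."""
    runs = []
    start = None
    for i, value in enumerate(seq):
        if value and start is None:
            start = i  # a new run begins here
        elif not value and start is not None:
            if i - start >= min_len:
                runs.append((start, i - start))
            start = None  # the run has ended
    # Close a run that reaches the end of the sequence.
    if start is not None and len(seq) - start >= min_len:
        runs.append((start, len(seq) - start))
    return runs


def line_features(img: Image, min_len: int = 3) -> Dict[str, object]:
    """Count and locate horizontal and vertical lines in a binary glyph."""
    # Horizontal lines: scan each row; store (row, start_column, length).
    horizontal = [
        (r, c, n) for r, row in enumerate(img) for c, n in find_runs(row, min_len)
    ]
    # Vertical lines: scan each column; store (column, start_row, length).
    columns = list(zip(*img))
    vertical = [
        (c, r, n) for c, col in enumerate(columns) for r, n in find_runs(col, min_len)
    ]
    return {
        "h_count": len(horizontal),
        "v_count": len(vertical),
        "horizontal": horizontal,
        "vertical": vertical,
    }


# Illustrative 5x5 glyph: one vertical stroke and one horizontal stroke.
GLYPH_L: Image = [
    [1, 0, 0, 0, 0],
    [1, 0, 0, 0, 0],
    [1, 0, 0, 0, 0],
    [1, 0, 0, 0, 0],
    [1, 1, 1, 1, 0],
]

features = line_features(GLYPH_L)
print(features["h_count"], features["v_count"])
```

A classifier along the lines the abstract describes could then compare these counts and positions against per-character templates. The run-length threshold would need tuning for real handwriting, where strokes are rarely perfectly straight.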
Copyright information
© 2011 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Rahiman, M.A., Rajasree, M.S. (2011). Recognition of Simple and Conjunct Handwritten Malayalam Characters Using LCPA Algorithm. In: Abraham, A., Mauri, J.L., Buford, J.F., Suzuki, J., Thampi, S.M. (eds) Advances in Computing and Communications. ACC 2011. Communications in Computer and Information Science, vol 192. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-22720-2_31
DOI: https://doi.org/10.1007/978-3-642-22720-2_31
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-22719-6
Online ISBN: 978-3-642-22720-2
eBook Packages: Computer Science (R0)