Segmentation-Driven Offline Handwritten Chinese and Arabic Script Recognition

  • Conference paper
Arabic and Chinese Handwriting Recognition (SACH 2006)

Part of the book series: Lecture Notes in Computer Science (LNIP, volume 4768)

Abstract

The market for handwriting recognition applications is growing rapidly, driven by continuous advances in OCR technology. This paper summarizes our recent efforts on offline handwritten Chinese script recognition using a segmentation-driven approach. We address two essential problems: isolated character recognition and the construction of a probabilistic segmentation model. To improve isolated character recognition accuracy, we propose a heteroscedastic linear discriminant analysis algorithm to extract more discriminative information from the original character features, and implement a minimum classification error learning scheme to optimize the classifier parameters. In the segmentation stage, information from three sources, namely geometric layout, character recognition confidence, and a semantic model, is integrated into a probabilistic framework to give the best script interpretation. Experimental results on postal address and bank check recognition demonstrate the effectiveness of the proposed algorithms: a correct recognition rate of more than 80% is achieved on 1,000 handwritten Chinese address items, and the recognition reliability of bank checks is greatly improved by combining the courtesy amount recognition result with the legal amount recognition result. Some preliminary work on Arabic script recognition is also presented.
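
As a rough illustration of how three information sources might be combined in a probabilistic framework, the sketch below runs a Viterbi-style search over merges of consecutive primitive segments and sums geometric, recognition, and language-model log-probabilities along each path. This is not the authors' model: the function `best_interpretation`, its parameters, the bigram stand-in for the semantic model, and all toy probabilities are assumptions made only for illustration.

```python
import math
from typing import Callable, Dict, List, Optional, Tuple

Candidate = Tuple[str, float]   # (label, recognition log-probability)
State = Tuple[int, str]         # (segment boundary reached, last label)


def best_interpretation(
    n_segments: int,
    max_merge: int,
    geometry_logp: Callable[[int, int], float],
    candidates: Callable[[int, int], List[Candidate]],
    bigram_logp: Callable[[str, str], float],
) -> Tuple[float, List[Tuple[int, int, str]]]:
    """Return the best-scoring list of (start, end, label) covering all segments.

    Hypothetical sketch: a path score is the sum of geometry, recognition and
    bigram log-probabilities of the characters on the path.
    """
    start_label = "<s>"
    # best[state] = (best score reaching state, predecessor state)
    best: Dict[State, Tuple[float, Optional[State]]] = {(0, start_label): (0.0, None)}
    for end in range(1, n_segments + 1):
        for start in range(max(0, end - max_merge), end):
            geo = geometry_logp(start, end)
            # Snapshot of states ending at `start`, taken before `best` is updated.
            prev_states = [(s, v[0]) for s, v in best.items() if s[0] == start]
            for label, rec in candidates(start, end):
                for prev_state, prev_score in prev_states:
                    score = prev_score + geo + rec + bigram_logp(prev_state[1], label)
                    state = (end, label)
                    if state not in best or score > best[state][0]:
                        best[state] = (score, prev_state)
    finals = [(v[0], s) for s, v in best.items() if s[0] == n_segments]
    if not finals:
        raise ValueError("no segmentation covers all primitive segments")
    score, state = max(finals)
    # Follow back-pointers to recover the character hypotheses in order.
    path: List[Tuple[int, int, str]] = []
    while best[state][1] is not None:
        prev = best[state][1]
        path.append((prev[0], state[0], state[1]))
        state = prev
    path.reverse()
    return score, path


# Toy data (all values invented): three primitive segments, where merging the
# first two into "C" is more plausible than reading them separately as "A", "B".
def toy_geometry(start: int, end: int) -> float:
    return math.log({1: 0.5, 2: 0.8}.get(end - start, 1e-6))


def toy_candidates(start: int, end: int) -> List[Candidate]:
    table = {
        (0, 1): [("A", math.log(0.5))],
        (1, 2): [("B", math.log(0.5))],
        (0, 2): [("C", math.log(0.9))],
        (2, 3): [("D", math.log(0.9))],
    }
    return table.get((start, end), [])


def toy_bigram(prev: str, cur: str) -> float:
    return math.log(0.9 if (prev, cur) == ("C", "D") else 0.5)


score, path = best_interpretation(3, 2, toy_geometry, toy_candidates, toy_bigram)
print(path)
```

Working in log-probabilities keeps the combination additive and avoids numerical underflow on long strings; a real system would also need to weight or calibrate the three sources, which this sketch does not attempt.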

Editor information

David Doermann, Stefan Jaeger

Copyright information

© 2008 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Ding, X., Liu, H. (2008). Segmentation-Driven Offline Handwritten Chinese and Arabic Script Recognition. In: Doermann, D., Jaeger, S. (eds) Arabic and Chinese Handwriting Recognition. SACH 2006. Lecture Notes in Computer Science, vol 4768. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-78199-8_12

  • DOI: https://doi.org/10.1007/978-3-540-78199-8_12

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-78198-1

  • Online ISBN: 978-3-540-78199-8

  • eBook Packages: Computer Science, Computer Science (R0)
