Script pattern identification of word images using multi-directional and multi-scalable textures

  • Original Research
Journal of Ambient Intelligence and Humanized Computing

Abstract

As a precursor to optical character recognition (OCR), script identification has many applications, such as sorting and indexing document images. Classifying scripts, especially at different scales and orientations, is an important problem in document image analysis. This paper proposes an algorithm that identifies scripts using scale- and rotation-robust log-polar wavelet and semi-decimated wavelet features. First, words are segmented from document images as text blobs using a Gaussian filter. Texture features are then computed in the log-polar domain using a combination of the discrete wavelet transform and the semi-decimated discrete wavelet transform. The log-polar mapping removes most rotational and scale variations, while the wavelet transform extracts information at different resolution levels. Together they yield discriminative texture features for characterizing scripts. Finally, a k-nearest neighbor classifier identifies the script. Comprehensive experiments on different databases illustrate the effectiveness of the proposed algorithm. Benchmarking analysis shows a maximum recall rate of 98.96% and better performance than other contemporary approaches.
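The feature and classification stages described in the abstract can be sketched roughly as follows. This is a minimal illustration and not the authors' implementation: the Haar wavelet, the 16×16 log-polar grid, the mean-absolute-value sub-band feature, and every function name (`log_polar`, `haar_dwt2`, `script_features`, `knn_predict`) are assumptions made for the example. Word segmentation and the semi-decimated transform are left out.

```python
import math
from collections import Counter


def log_polar(img, n_rho=16, n_theta=16):
    """Resample a grayscale image (list of rows) onto a log-polar grid.

    Rows index log-radius, columns index angle; nearest-neighbour lookup.
    """
    h, w = len(img), len(img[0])
    cy, cx = (h - 1) / 2.0, (w - 1) / 2.0
    r_max = min(cy, cx)
    out = []
    for i in range(n_rho):
        r = r_max ** ((i + 1) / n_rho)  # geometrically spaced radii
        row = []
        for j in range(n_theta):
            t = 2.0 * math.pi * j / n_theta
            y = min(h - 1, max(0, round(cy + r * math.sin(t))))
            x = min(w - 1, max(0, round(cx + r * math.cos(t))))
            row.append(img[y][x])
        out.append(row)
    return out


def haar_dwt2(a):
    """One level of a 2-D Haar transform (averaging form): (LL, LH, HL, HH)."""
    def step(m):
        lo = [[(r[k] + r[k + 1]) / 2.0 for k in range(0, len(r) - 1, 2)] for r in m]
        hi = [[(r[k] - r[k + 1]) / 2.0 for k in range(0, len(r) - 1, 2)] for r in m]
        return lo, hi

    def transpose(m):
        return [list(c) for c in zip(*m)]

    lo, hi = step(a)                                   # filter along rows
    ll, lh = (transpose(x) for x in step(transpose(lo)))  # then along columns
    hl, hh = (transpose(x) for x in step(transpose(hi)))
    return ll, lh, hl, hh


def script_features(img, levels=2):
    """Mean absolute detail coefficient per sub-band, computed in log-polar space."""
    band, feats = log_polar(img), []
    for _ in range(levels):
        band, lh, hl, hh = haar_dwt2(band)  # recurse on the LL band
        for sub in (lh, hl, hh):
            vals = [abs(v) for row in sub for v in row]
            feats.append(sum(vals) / len(vals))
    return feats


def knn_predict(train, query, k=3):
    """Majority vote among the k training vectors nearest to query (Euclidean)."""
    nearest = sorted(train, key=lambda fl: math.dist(fl[0], query))[:k]
    return Counter(label for _, label in nearest).most_common(1)[0][0]


# Toy 16x16 "word images" with made-up labels: vertical stripes vs. a flat patch.
stripes = [[(x // 2) % 2 for x in range(16)] for _ in range(16)]
flat = [[1] * 16 for _ in range(16)]
train = [(script_features(stripes), "script_A"), (script_features(flat), "script_B")]
print(knn_predict(train, script_features(stripes), k=1))
```

In a log-polar image, rotating the input about its centre becomes a cyclic shift along the angle axis, and scaling becomes a shift along the log-radius axis. A decimated wavelet transform is not shift-invariant, so a feature like the one above is only approximately stable under those shifts.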

Author information

Correspondence to Parul Sahare.

About this article

Cite this article

Sahare, P., Dhok, S.B. Script pattern identification of word images using multi-directional and multi-scalable textures. J Ambient Intell Human Comput 12, 9739–9755 (2021). https://doi.org/10.1007/s12652-020-02718-0

  • DOI: https://doi.org/10.1007/s12652-020-02718-0
