A post-processor for Gurmukhi OCR

Sadhana

Abstract

A post-processing system for Gurmukhi-script OCR has been developed. The post-processor combines statistical information on Punjabi syllable combinations, corpus look-up, and heuristics based on Punjabi grammar rules. On clean images, these techniques raise the recognition rate by about 3 percentage points, from 94.35% to 97.34%.
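
The abstract gives no implementation details, so the following is only a minimal sketch of how corpus look-up and combination statistics might be combined in a post-processor. It is not the authors' method. The function name `correct_word`, the per-position candidate lists, and the pair-count table are all hypothetical, and Latin letters stand in for Gurmukhi units. Grammar-based heuristics are not modelled.

```python
from itertools import product


def correct_word(candidates, lexicon, pair_counts):
    """Pick a word from per-position OCR alternatives (illustrative only).

    candidates:  list of lists; candidates[i] holds the recognizer's
                 alternatives for position i, best guess first (assumed input).
    lexicon:     set of valid words, standing in for a corpus look-up.
    pair_counts: dict mapping (unit, next_unit) to a corpus count, standing
                 in for syllable-combination statistics.
    """
    top_choice = "".join(alts[0] for alts in candidates)
    if top_choice in lexicon:
        return top_choice  # the recognizer's output is already a known word

    def score(units):
        # Sum the corpus counts of adjacent pairs; higher is more plausible.
        return sum(pair_counts.get(pair, 0) for pair in zip(units, units[1:]))

    # Enumerate every combination of alternatives and keep known words only.
    known = [u for u in product(*candidates) if "".join(u) in lexicon]
    if known:
        return "".join(max(known, key=score))
    return top_choice  # nothing matched; leave the OCR output unchanged


if __name__ == "__main__":
    candidates = [["c"], ["o", "a"], ["l", "t"]]
    lexicon = {"cat", "cot"}
    pair_counts = {("c", "a"): 5, ("a", "t"): 7, ("c", "o"): 3, ("o", "t"): 2}
    print(correct_word(candidates, lexicon, pair_counts))
```

Enumerating all combinations grows exponentially with word length, so a practical system would prune the search. The sketch keeps the brute-force form for readability.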

Cite this article

Lehal, G.S., Singh, C. A post-processor for Gurmukhi OCR. Sadhana 27, 99–111 (2002). https://doi.org/10.1007/BF02703315

  • DOI: https://doi.org/10.1007/BF02703315
