Combination of N-Grams and Stochastic Context-Free Grammars in an Offline Handwritten Recognition System

  • Conference paper
Pattern Recognition and Image Analysis (IbPRIA 2007)

Part of the book series: Lecture Notes in Computer Science (LNIP, volume 4477)

Abstract

Handwritten text recognition is an area of pattern recognition that has recently received considerable attention. Traditionally, handwritten text recognition systems have been built on hidden Markov models (HMMs) and n-gram language models. The limitation of n-grams is that they cannot capture long-term constraints within sentences. Stochastic context-free grammars (SCFGs) can be used to overcome this limitation by rescoring an n-best list generated by the HMM-based recognizer. However, SCFGs are known to be difficult to estimate for complex real tasks. In this work we propose combining n-grams with a category-based SCFG together with a distribution of words into categories. The category-based approach is intended to simplify the SCFG inference process while preserving the descriptive power of the model. Results on the IAM-Database show that this combined scheme outperforms the classical scheme.
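
The n-best rescoring idea mentioned in the abstract can be illustrated with a minimal sketch. The abstract does not give the actual combination formula, so everything below is an assumption made for illustration: the log-linear sum, the weights `lm_weight` and `scfg_weight`, the `Hypothesis` type, and all score values are hypothetical and are not taken from the paper.

```python
from typing import List, NamedTuple, Tuple


class Hypothesis(NamedTuple):
    """One entry of an n-best list (all scores are made-up log-probabilities)."""
    words: Tuple[str, ...]
    hmm_logp: float    # score from the HMM-based recognizer
    ngram_logp: float  # score from the n-gram language model
    scfg_logp: float   # score from the SCFG


def combined_score(h: Hypothesis, lm_weight: float = 1.0,
                   scfg_weight: float = 1.0) -> float:
    # Assumed log-linear combination; the weights are hypothetical tuning knobs.
    return h.hmm_logp + lm_weight * h.ngram_logp + scfg_weight * h.scfg_logp


def rescore(nbest: List[Hypothesis], lm_weight: float = 1.0,
            scfg_weight: float = 1.0) -> List[Hypothesis]:
    # Reorder the n-best list so the highest combined score comes first.
    return sorted(nbest,
                  key=lambda h: combined_score(h, lm_weight, scfg_weight),
                  reverse=True)


# Toy n-best list: the HMM and n-gram scores slightly prefer the first
# hypothesis, while the (hypothetical) SCFG score prefers the second.
NBEST = [
    Hypothesis(("the", "dog", "bark"), hmm_logp=-10.0, ngram_logp=-6.0, scfg_logp=-9.0),
    Hypothesis(("the", "dog", "barks"), hmm_logp=-10.5, ngram_logp=-6.2, scfg_logp=-5.0),
]

best_without_scfg = rescore(NBEST, scfg_weight=0.0)[0]
best_with_scfg = rescore(NBEST)[0]
```

With `scfg_weight=0.0` the ranking depends only on the HMM and n-gram scores; with the grammar score included, a hypothesis that is better formed over the whole sentence can move to the top of the list. In a category-based setup, the grammar score would be computed over word categories and combined with a word-given-category distribution, which this sketch does not model.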

Editor information

Joan Martí, José Miguel Benedí, Ana Maria Mendonça, Joan Serrat

Copyright information

© 2007 Springer Berlin Heidelberg

About this paper

Cite this paper

Romero, V., Alabau, V., Benedí, J.M. (2007). Combination of N-Grams and Stochastic Context-Free Grammars in an Offline Handwritten Recognition System. In: Martí, J., Benedí, J.M., Mendonça, A.M., Serrat, J. (eds) Pattern Recognition and Image Analysis. IbPRIA 2007. Lecture Notes in Computer Science, vol 4477. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-72847-4_60

  • DOI: https://doi.org/10.1007/978-3-540-72847-4_60

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-72846-7

  • Online ISBN: 978-3-540-72847-4

  • eBook Packages: Computer Science, Computer Science (R0)
