A Sliding Window Framework for Word Spotting Based on Word Attributes

  • Conference paper
Pattern Recognition and Image Analysis (IbPRIA 2015)

Part of the book series: Lecture Notes in Computer Science (LNIP, volume 9117)

Abstract

In this paper, we propose a segmentation-free approach to word spotting. Word images are first encoded into feature vectors using the Fisher Vector representation. These feature vectors are then used together with pyramidal histogram of characters (PHOC) labels to learn SVM-based attribute models. Documents are represented by these PHOC-based word attributes. To compute the word attributes efficiently over a sliding window, we propose an integral image representation of the document that uses a simplified version of the attribute model. Finally, we re-rank the top word candidates using the more discriminative full version of the word attributes. We show state-of-the-art results for segmentation-free query-by-example word spotting on standard single-writer and multi-writer datasets.
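The integral image idea mentioned in the abstract can be illustrated with a minimal sketch. It assumes that the simplified attribute model is additive over small document cells, so that the attribute vector of any window is the sum of its per-cell scores; this additivity, the cosine similarity used for ranking, and all names below (`integral_image`, `window_attributes`, `sliding_window_scores`) are illustrative assumptions, not the authors' implementation. The re-ranking step with the full attribute model is not shown.

```python
import numpy as np

def integral_image(cell_scores):
    """Build a summed-area table from per-cell attribute scores.

    cell_scores has shape (H, W, A): one A-dimensional attribute
    score vector per document cell (hypothetical layout).
    The result has shape (H + 1, W + 1, A) with a zero border.
    """
    h, w, a = cell_scores.shape
    ii = np.zeros((h + 1, w + 1, a), dtype=np.float64)
    ii[1:, 1:] = cell_scores.cumsum(axis=0).cumsum(axis=1)
    return ii

def window_attributes(ii, top, left, height, width):
    """Sum of cell scores inside a window, using four table lookups."""
    bottom, right = top + height, left + width
    return ii[bottom, right] - ii[top, right] - ii[bottom, left] + ii[top, left]

def sliding_window_scores(ii, query, height, width, step=1):
    """Rank all windows of a fixed size by cosine similarity to a query.

    Returns a list of (score, top, left) tuples, best first.
    """
    n_rows, n_cols = ii.shape[0] - 1, ii.shape[1] - 1
    query_norm = np.linalg.norm(query)
    results = []
    for top in range(0, n_rows - height + 1, step):
        for left in range(0, n_cols - width + 1, step):
            attrs = window_attributes(ii, top, left, height, width)
            norm = np.linalg.norm(attrs) * query_norm
            score = float(attrs @ query / norm) if norm > 0 else 0.0
            results.append((score, top, left))
    results.sort(key=lambda r: r[0], reverse=True)
    return results
```

Under these assumptions, each window costs a constant number of lookups per attribute regardless of its size, which is what makes an exhaustive sliding-window search affordable before a more expensive re-ranking of the top candidates.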

Author information

Corresponding author

Correspondence to Suman K. Ghosh.

Copyright information

© 2015 Springer International Publishing Switzerland

About this paper

Cite this paper

Ghosh, S.K., Valveny, E. (2015). A Sliding Window Framework for Word Spotting Based on Word Attributes. In: Paredes, R., Cardoso, J., Pardo, X. (eds) Pattern Recognition and Image Analysis. IbPRIA 2015. Lecture Notes in Computer Science, vol 9117. Springer, Cham. https://doi.org/10.1007/978-3-319-19390-8_73

  • DOI: https://doi.org/10.1007/978-3-319-19390-8_73

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-19389-2

  • Online ISBN: 978-3-319-19390-8

  • eBook Packages: Computer Science, Computer Science (R0)
