Tandem hidden Markov models using deep belief networks for offline handwriting recognition

Roy, Partha Pratim; Zhong, Guoqiang; Cheriet, Mohamed

doi:10.1631/FITEE.1600996

Published: 08 August 2017

Volume 18, pages 978–988 (2017)

Frontiers of Information Technology & Electronic Engineering

Abstract

Unconstrained offline handwriting recognition is a challenging task in document analysis and pattern recognition. In recent years, to better exploit the supervisory information hidden in document images, much effort has gone into integrating multi-layer perceptrons (MLPs) into hidden Markov models (HMMs) in either a hybrid or a tandem fashion. However, due to the weak learnability of MLPs, the learnt features are not necessarily optimal for the subsequent recognition task. In this paper, we propose a deep architecture-based tandem approach for unconstrained offline handwriting recognition. In the proposed model, deep belief networks are adopted to learn compact representations of sequential data, while HMMs are applied for (sub-)word recognition. We evaluate the proposed model on two publicly available datasets, RIMES and IFN/ENIT, which are based on the Latin and Arabic scripts respectively, and on one dataset that we collected ourselves for Devanagari (an Indian script). Extensive experiments show the advantage of the proposed model, especially over MLP-HMM tandem approaches.
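
To make the tandem idea concrete, the following is a minimal sketch of the feature-learning stage only: a stack of Bernoulli restricted Boltzmann machines trained greedily with one-step contrastive divergence, whose top-layer activations serve as the compact frame representations that a conventional HMM recognizer would then model. The abstract does not specify the input features, layer sizes, training schedule, or HMM configuration, so the names `RBM`, `train_dbn`, and `tandem_features`, the random binary toy frames, and all hyperparameters here are illustrative assumptions rather than the authors' implementation. Supervised fine-tuning and the HMM stage are omitted.

```python
import numpy as np


def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))


class RBM:
    """Bernoulli RBM trained with one-step contrastive divergence (CD-1)."""

    def __init__(self, n_visible, n_hidden, rng):
        self.W = 0.01 * rng.standard_normal((n_visible, n_hidden))
        self.b_v = np.zeros(n_visible)
        self.b_h = np.zeros(n_hidden)
        self.rng = rng

    def hidden_probs(self, v):
        return sigmoid(v @ self.W + self.b_h)

    def visible_probs(self, h):
        return sigmoid(h @ self.W.T + self.b_v)

    def cd1_step(self, v0, lr=0.1):
        # Positive phase: hidden activations driven by the data.
        h0 = self.hidden_probs(v0)
        h0_sample = (self.rng.random(h0.shape) < h0).astype(float)
        # Negative phase: one Gibbs step back to the visible layer and up again.
        v1 = self.visible_probs(h0_sample)
        h1 = self.hidden_probs(v1)
        n = v0.shape[0]
        self.W += lr * (v0.T @ h0 - v1.T @ h1) / n
        self.b_v += lr * (v0 - v1).mean(axis=0)
        self.b_h += lr * (h0 - h1).mean(axis=0)


def train_dbn(frames, layer_sizes, epochs=5, seed=0):
    """Greedy layer-wise pretraining: each RBM is trained on the layer below."""
    rng = np.random.default_rng(seed)
    rbms, x = [], frames
    for n_hidden in layer_sizes:
        rbm = RBM(x.shape[1], n_hidden, rng)
        for _ in range(epochs):
            rbm.cd1_step(x)
        rbms.append(rbm)
        x = rbm.hidden_probs(x)
    return rbms


def tandem_features(rbms, frames):
    """Propagate frames through the stack; the top layer is the HMM input."""
    x = frames
    for rbm in rbms:
        x = rbm.hidden_probs(x)
    return x


# Toy stand-in for sliding-window frames of a word image (assumed: 200 frames, 32 binary values each).
toy_rng = np.random.default_rng(1)
frames = (toy_rng.random((200, 32)) > 0.5).astype(float)

dbn = train_dbn(frames, layer_sizes=[16, 8])
features = tandem_features(dbn, frames)  # shape (200, 8), one vector per frame
```

In a tandem system the `features` array would replace or augment the raw frame features as the observation sequence of the HMMs, which are then trained in the usual way.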

Author information

Authors and Affiliations

Department of Computer Science & Engineering, Indian Institute of Technology Roorkee, Roorkee, 247667, India
Partha Pratim Roy
Department of Computer Science and Technology, Ocean University of China, Qingdao, 266100, China
Guoqiang Zhong
Synchromedia Laboratory, École de Technologie Supérieure, Montreal, H3C 1K3, Canada
Mohamed Cheriet

Corresponding author

Correspondence to Partha Pratim Roy.

Additional information

Project supported by the National Natural Science Foundation of China (No. 61403353)

About this article

Cite this article

Roy, P.P., Zhong, G. & Cheriet, M. Tandem hidden Markov models using deep belief networks for offline handwriting recognition. Frontiers Inf Technol Electronic Eng 18, 978–988 (2017). https://doi.org/10.1631/FITEE.1600996

Received: 15 February 2016
Accepted: 24 June 2016
Published: 08 August 2017
Issue Date: July 2017
DOI: https://doi.org/10.1631/FITEE.1600996

CLC number

TP391
