Abstract
Arabic handwriting recognition remains a serious challenge because of the script's cursive nature, the variety of writers, and the diversity of writing styles. Motivated by a series of successes in computer vision, we explore Maxout units in multidimensional recurrent neural networks for the offline task. In this work, we model an Arabic handwritten word with a deep MDLSTM-based system. This architecture can work directly on raw input images, because its recurrence over both axes of the image lets it model script variations along each of them. However, several problems, such as the vanishing gradient, can affect the training of this recognition system. To overcome this problem, we integrate Maxout units into the MDLSTM system to improve its performance. We test several integration modes to identify the best topology. The proposed systems are evaluated on the large IFN/ENIT database. According to the experimental results, the best tested architecture reduced the label error rate by 6.86% compared to the baseline system.
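As background for the Maxout units mentioned above, the following is a minimal sketch of a generic maxout layer as defined by Goodfellow et al. (2013): each unit computes several affine "pieces" of its input and outputs the largest one. The function name `maxout`, the nested-list weight layout, and the toy values are assumptions made for illustration only; the abstract does not say where or how the units are wired into the MDLSTM, so this is not the authors' implementation.

```python
from typing import List, Sequence


def maxout(
    x: Sequence[float],
    weights: Sequence[Sequence[Sequence[float]]],
    biases: Sequence[Sequence[float]],
) -> List[float]:
    """Generic maxout layer (illustrative layout, not the paper's code).

    weights[i][j] is the weight vector of piece j of unit i,
    biases[i][j] is the matching bias.
    Unit i outputs max_j (weights[i][j] . x + biases[i][j]).
    """
    outputs = []
    for unit_weights, unit_biases in zip(weights, biases):
        # Evaluate every affine piece of this unit on the input.
        pieces = [
            sum(w_k * x_k for w_k, x_k in zip(w, x)) + b
            for w, b in zip(unit_weights, unit_biases)
        ]
        # The unit's activation is the largest piece.
        outputs.append(max(pieces))
    return outputs


# Toy example: 2 inputs, 2 maxout units, 2 pieces per unit (hypothetical values).
x = [1.0, -2.0]
weights = [
    [[1.0, 0.0], [0.0, 1.0]],   # unit 0: pieces are x[0] and x[1]
    [[-1.0, 0.0], [1.0, 1.0]],  # unit 1: pieces are -x[0] + 0.5 and x[0] + x[1]
]
biases = [[0.0, 0.0], [0.5, 0.0]]
result = maxout(x, weights, biases)
print(result)
```

Because the output is always one of the linear pieces, the gradient with respect to the selected piece is passed back without being squashed by a saturating nonlinearity, which is the usual argument for using maxout against vanishing gradients.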
Copyright information
© 2019 Springer Nature Switzerland AG
About this paper
Cite this paper
Maalej, R., Kherallah, M. (2019). Maxout into MDLSTM for Offline Arabic Handwriting Recognition. In: Gedeon, T., Wong, K., Lee, M. (eds) Neural Information Processing. ICONIP 2019. Lecture Notes in Computer Science, vol 11955. Springer, Cham. https://doi.org/10.1007/978-3-030-36718-3_45
DOI: https://doi.org/10.1007/978-3-030-36718-3_45
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-36717-6
Online ISBN: 978-3-030-36718-3
eBook Packages: Computer Science, Computer Science (R0)