Spoken Arabic Digits Recognition System Using Convolutional Neural Network

  • Conference paper
Advanced Machine Learning Technologies and Applications (AMLTA 2021)

Part of the book series: Advances in Intelligent Systems and Computing (AISC, volume 1339)

Abstract

Digit recognition is vital to many human-machine interaction applications. It is used in telephone-based services such as dialing systems, airline reservation systems, bank transactions, and price extraction. This research develops a new Convolutional Neural Network (CNN) based spoken digit recognition system for Arabic digits. The system treats recognition as a classification task. First, the Mel-frequency cepstral coefficients (MFCCs) of the spoken digits are extracted and reduced in the convolution phase. Then, in the classification phase, the most appropriate digit label for each test utterance is produced. The proposed approach performs well compared to similar systems: it achieved 99% correct digit recognition, compared to 98% for a Recurrent Neural Network based digit recognition system.
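The two phases described in the abstract can be illustrated with a minimal sketch, which is not the authors' architecture. It assumes each utterance has already been converted to a fixed-size MFCC matrix (here 40 frames by 13 coefficients, both hypothetical values), and it uses one convolution layer, max pooling over time, and a softmax layer over the ten digit classes. The weights are random and untrained, the input is synthetic, and every name (`conv1d_relu`, `max_pool_time`, `predict_digit`, `init_params`) is illustrative only.

```python
import numpy as np

N_MFCC = 13    # assumption: MFCC coefficients per frame
N_FRAMES = 40  # assumption: frames per utterance after padding/truncation
N_DIGITS = 10  # digit classes 0-9


def conv1d_relu(x, kernels, bias):
    """Valid 1-D convolution along time, followed by ReLU.

    x: (frames, n_mfcc); kernels: (n_filters, width, n_mfcc); bias: (n_filters,)
    Returns an array of shape (frames - width + 1, n_filters).
    """
    n_filters, width, _ = kernels.shape
    steps = x.shape[0] - width + 1
    out = np.empty((steps, n_filters))
    for t in range(steps):
        window = x[t:t + width]  # (width, n_mfcc) slice of the utterance
        out[t] = np.tensordot(kernels, window, axes=([1, 2], [0, 1])) + bias
    return np.maximum(out, 0.0)


def max_pool_time(x, size):
    """Non-overlapping max pooling along the time axis (reduces the features)."""
    steps = x.shape[0] // size
    return x[:steps * size].reshape(steps, size, x.shape[1]).max(axis=1)


def softmax(z):
    e = np.exp(z - z.max())  # subtract the max for numerical stability
    return e / e.sum()


def predict_digit(mfcc, params):
    """Convolution phase, then classification phase."""
    h = conv1d_relu(mfcc, params["kernels"], params["conv_bias"])
    h = max_pool_time(h, size=2)
    logits = h.reshape(-1) @ params["dense_w"] + params["dense_b"]
    probs = softmax(logits)
    return int(np.argmax(probs)), probs


def init_params(rng, n_filters=8, width=3):
    """Random, untrained weights sized for the assumed input shape."""
    pooled_steps = (N_FRAMES - width + 1) // 2
    return {
        "kernels": rng.normal(0.0, 0.1, (n_filters, width, N_MFCC)),
        "conv_bias": np.zeros(n_filters),
        "dense_w": rng.normal(0.0, 0.1, (pooled_steps * n_filters, N_DIGITS)),
        "dense_b": np.zeros(N_DIGITS),
    }


rng = np.random.default_rng(0)
params = init_params(rng)
fake_mfcc = rng.normal(size=(N_FRAMES, N_MFCC))  # synthetic stand-in for real MFCCs
digit, probs = predict_digit(fake_mfcc, params)
```

With untrained weights the predicted label is meaningless. The sketch only shows how an MFCC matrix flows through a convolution stage into a ten-way classifier; a real system would learn the parameters from labelled utterances.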

Author information

Corresponding author

Correspondence to Mona A. Azim.

Copyright information

© 2021 The Author(s), under exclusive license to Springer Nature Switzerland AG

About this paper

Cite this paper

Azim, M.A., Hussein, W., Badr, N.L. (2021). Spoken Arabic Digits Recognition System Using Convolutional Neural Network. In: Hassanien, AE., Chang, KC., Mincong, T. (eds) Advanced Machine Learning Technologies and Applications. AMLTA 2021. Advances in Intelligent Systems and Computing, vol 1339. Springer, Cham. https://doi.org/10.1007/978-3-030-69717-4_17
