SCEHMA: Speech Corpus of English, Hindi, Marathi and Arabic Language for Advance Speech Recognition Development

  • Conference paper
Applied Computing to Support Industry: Innovation and Technology (ACRIT 2019)

Abstract

A database is an essential element of speech recognition research. This paper describes the development of SCEHMA, a speech database intended to advance speech recognition applications in Hindi, English, Marathi and Arabic. The SCEHMA corpus is a collection of isolated words and continuous sentences. For Marathi, covering the agriculture, polyclinic and general-purpose speech recognition domains, 28420 isolated words and 17470 sentences were collected from 300 male and 200 female speakers aged 22–30. The Hindi corpus consists of 900 sentences for the accent recognition domain, collected from 18 male and 12 female speakers aged 18–30. The English corpus consists of 750 isolated words and 750 sentences for the general domain, collected from 12 male and 3 female speakers aged 22–30. The Arabic corpus contains 4520 words and 40 sentences for the recognition domain, collected from 12 male and 9 female speakers aged 18–30. To obtain high-quality speech, recording took place in a 10 by 10 office room free of background noise. The utterances were recorded at 16 kHz using three recording media: a headset, a desktop-mounted microphone and a mobile phone. The data were recorded in morning and evening sessions at room temperature and normal humidity. Each speaker was asked to sit in front of the microphone at a distance of about 12–15 cm. The database is collected according to the LDCIL protocol, and the corpus is transcribed using the Google Unicode editor. Praat is used for corpus labeling and annotation. The total size of the SCEHMA corpus is 33690 isolated words and 19160 continuous sentences. After transcription and annotation, the corpus will be made available to the scientific community for agricultural, polyclinic, medical, accent recognition, age group identification, gender recognition and general-purpose recognition systems.
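
The corpus totals quoted in the abstract can be checked against the per-language figures with a short tally. The sketch below is only illustrative: the dictionary layout and the function name `corpus_totals` are hypothetical, and the Hindi isolated-word count is assumed to be zero because the abstract reports only sentences for that language.

```python
# Per-language counts as stated in the abstract: (isolated words, continuous sentences).
# Assumption: Hindi contributes no isolated words, since only sentences are reported for it.
corpus_counts = {
    "Marathi": (28420, 17470),
    "Hindi": (0, 900),
    "English": (750, 750),
    "Arabic": (4520, 40),
}


def corpus_totals(counts):
    """Return (total isolated words, total continuous sentences) across all languages."""
    total_words = sum(words for words, _ in counts.values())
    total_sentences = sum(sentences for _, sentences in counts.values())
    return total_words, total_sentences


total_words, total_sentences = corpus_totals(corpus_counts)
print(f"isolated words: {total_words}, continuous sentences: {total_sentences}")
```

Under these assumptions the tally matches the totals reported in the abstract, 33690 isolated words and 19160 continuous sentences.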

Author information

Corresponding author

Correspondence to Santosh Gaikwad.

Copyright information

© 2020 Springer Nature Switzerland AG

About this paper

Cite this paper

Gaikwad, S., Gawali, B., Basil, M. (2020). SCEHMA: Speech Corpus of English, Hindi, Marathi and Arabic Language for Advance Speech Recognition Development. In: Khalaf, M., Al-Jumeily, D., Lisitsa, A. (eds) Applied Computing to Support Industry: Innovation and Technology. ACRIT 2019. Communications in Computer and Information Science, vol 1174. Springer, Cham. https://doi.org/10.1007/978-3-030-38752-5_10

  • DOI: https://doi.org/10.1007/978-3-030-38752-5_10

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-030-38751-8

  • Online ISBN: 978-3-030-38752-5

  • eBook Packages: Computer Science, Computer Science (R0)
