Part of the book series: Lecture Notes in Electrical Engineering (LNEE, volume 222)

Abstract

This paper describes the real-time challenges of designing a telephonic Automatic Speech Recognition (ASR) system. Telephonic speech data are collected automatically from all geographical regions of West Bengal to cover the major dialectal variations of spoken Bangla. All incoming calls are handled by an Asterisk server, which acts as the computer telephony interface (CTI). The system asks a set of queries, and the users' spoken responses are stored and manually transcribed for ASR system training. When the telephonic ASR is in use, each user's voice query first passes through the Signal Analysis and Decision (SAD) module; depending on its decision, the speech signal may then enter the back-end ASR engine, and the relevant information is automatically delivered to the user. In a real-time scenario, telephonic speech contains channel drops, silence or no-speech events, truncated speech signals, noisy signals, and so on, along with the desired speech event. This paper presents some techniques that handle such unwanted signals in telephonic speech to a certain extent and provide a speech signal close to the desired one for the ASR system. Real-time telephonic ASR system performance increased by 8.91 % after the SAD module was implemented.
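The abstract does not say how the SAD module reaches its decision, so the following is only a minimal sketch of the kind of gate it describes, assuming a simple frame-energy heuristic. The function names, labels, thresholds, and the 8 kHz sampling rate are all hypothetical choices made for illustration and are not taken from the paper.

```python
from typing import List, Sequence


def frame_energies(samples: Sequence[float], frame_len: int) -> List[float]:
    """Mean squared amplitude of each full, non-overlapping frame."""
    energies = []
    for start in range(0, len(samples) - frame_len + 1, frame_len):
        frame = samples[start:start + frame_len]
        energies.append(sum(x * x for x in frame) / frame_len)
    return energies


def sad_decision(samples: Sequence[float],
                 sample_rate: int = 8000,        # assumed telephone-band rate
                 frame_ms: int = 20,             # assumed frame length
                 energy_threshold: float = 1e-4,  # assumed speech/non-speech cut
                 min_speech_frames: int = 5,
                 edge_frames: int = 2) -> str:
    """Return a hypothetical decision label for one recorded utterance.

    Labels (all illustrative): "no_speech", "channel_drop", "truncated",
    "accept". Only "accept" would be forwarded to the ASR engine.
    """
    frame_len = sample_rate * frame_ms // 1000
    energies = frame_energies(samples, frame_len)
    active = [e > energy_threshold for e in energies]

    # Too few energetic frames: treat as silence / no speech event.
    if sum(active) < min_speech_frames:
        return "no_speech"

    # Exactly-zero frames between the first and last active frame are taken
    # as a sign that the channel dropped in the middle of the utterance.
    first = active.index(True)
    last = len(active) - 1 - active[::-1].index(True)
    if any(e == 0.0 for e in energies[first:last + 1]):
        return "channel_drop"

    # Speech touching either edge of the recording suggests it was cut off.
    if any(active[:edge_frames]) or any(active[-edge_frames:]):
        return "truncated"

    return "accept"
```

In a setup like the one described, only utterances labelled "accept" would be passed to the back-end ASR engine; for the other labels the system could, for example, ask the caller to repeat the query. A real module would also need a noise check, which this sketch leaves out.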

Author information

Correspondence to Joyanta Basu.

Copyright information

© 2013 Springer India

About this paper

Cite this paper

Basu, J., Bepari, M.S., Roy, R., Khan, S. (2013). Real Time Challenges to Handle the Telephonic Speech Recognition System. In: S, M., Kumar, S. (eds) Proceedings of the Fourth International Conference on Signal and Image Processing 2012 (ICSIP 2012). Lecture Notes in Electrical Engineering, vol 222. Springer, India. https://doi.org/10.1007/978-81-322-1000-9_38

  • DOI: https://doi.org/10.1007/978-81-322-1000-9_38

  • Publisher Name: Springer, India

  • Print ISBN: 978-81-322-0999-7

  • Online ISBN: 978-81-322-1000-9

  • eBook Packages: Engineering, Engineering (R0)
