Real Time Challenges to Handle the Telephonic Speech Recognition System

Basu, Joyanta; Bepari, Milton Samirakshma; Roy, Rajib; Khan, Soma

doi:10.1007/978-81-322-1000-9_38


Part of the book series: Lecture Notes in Electrical Engineering (LNEE, volume 222)


Abstract

This paper describes the real-time challenges in designing a telephonic Automatic Speech Recognition (ASR) system. Telephonic speech data are collected automatically from all geographical regions of West Bengal to cover the major dialectal variations of spoken Bangla. All incoming calls are handled by an Asterisk server, i.e. the computer telephony interface (CTI). The system asks a set of queries, and the users' spoken responses are stored and transcribed manually for ASR system training. When the telephonic ASR is in use, users' voice queries first pass through the Signal Analysis and Decision (SAD) module; depending on its decision, the speech signal may enter the back-end ASR engine, and the relevant information is automatically delivered to the user. In a real-time scenario, telephonic speech contains channel drops, silence or no-speech events, truncated speech, noisy signals, etc., along with the desired speech event. This paper presents techniques that handle such unwanted signals in telephonic speech to a certain extent and provide a signal close to the desired speech for the ASR system. Real-time telephonic ASR system performance increased by 8.91% after implementing the SAD module.
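The abstract does not say how the SAD module reaches its decision, so the following is only an illustrative sketch of one common approach, short-time energy thresholding, and not the authors' method. The function names, the 8 kHz sampling rate implied by the 20 ms frame of 160 samples, and all thresholds are assumptions chosen for illustration. The sketch flags no-speech, channel-drop, and truncated recordings; noisy-signal detection is left out.

```python
import math
from typing import List, Sequence


def frame_energies_db(samples: Sequence[float], frame_len: int = 160) -> List[float]:
    """Short-time energy per non-overlapping frame, in dB (samples in [-1, 1])."""
    energies = []
    for start in range(0, len(samples) - frame_len + 1, frame_len):
        frame = samples[start:start + frame_len]
        mean_sq = sum(x * x for x in frame) / frame_len
        # Small floor avoids log10(0) on digitally silent frames (gives -120 dB).
        energies.append(10.0 * math.log10(mean_sq + 1e-12))
    return energies


def sad_decision(
    samples: Sequence[float],
    frame_len: int = 160,        # assumed: 20 ms at 8 kHz telephone bandwidth
    speech_db: float = -35.0,    # assumed: frames above this count as speech
    drop_db: float = -90.0,      # assumed: frames below this are a dead channel
    min_speech_frames: int = 10, # assumed: minimum speech needed for ASR
    edge_frames: int = 3,        # assumed: guard region at both ends
) -> str:
    """Return 'no_speech', 'channel_drop', 'truncated', or 'accept'."""
    energies = frame_energies_db(samples, frame_len)
    active = [e > speech_db for e in energies]

    # Silence or no-speech event: too few frames carry speech-level energy.
    if sum(active) < min_speech_frames:
        return "no_speech"

    # Channel drop: a dead frame between the first and last speech frames.
    first = active.index(True)
    last = len(active) - 1 - active[::-1].index(True)
    if any(e < drop_db for e in energies[first:last + 1]):
        return "channel_drop"

    # Truncated speech: speech touches the start or end of the recording.
    if any(active[:edge_frames]) or any(active[-edge_frames:]):
        return "truncated"

    return "accept"
```

In such a design only recordings labelled `accept` would be forwarded to the back-end ASR engine; for the other labels the caller could, for example, be prompted to repeat the query.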



Author information

Authors and Affiliations

Centre for Development of Advanced Computing, Kolkata, India
Joyanta Basu, Milton Samirakshma Bepari, Rajib Roy & Soma Khan


Corresponding author

Correspondence to Joyanta Basu.

Editor information

Editors and Affiliations

Computer Science & Engineering, Dr. N.G.P. Institute of Technology, Kalapatti Road, Coimbatore, 641048, Tamil Nadu, India
Mohan S
Electronics & Communication Engineering, Dr. N.G.P. Institute of Technology, Kalapatti Road, Coimbatore, 641048, Tamil Nadu, India
S Suresh Kumar


About this paper

Cite this paper

Basu, J., Bepari, M.S., Roy, R., Khan, S. (2013). Real Time Challenges to Handle the Telephonic Speech Recognition System. In: S, M., Kumar, S. (eds) Proceedings of the Fourth International Conference on Signal and Image Processing 2012 (ICSIP 2012). Lecture Notes in Electrical Engineering, vol 222. Springer, India. https://doi.org/10.1007/978-81-322-1000-9_38


DOI: https://doi.org/10.1007/978-81-322-1000-9_38
Published: 11 January 2013
Publisher Name: Springer, India
Print ISBN: 978-81-322-0999-7
Online ISBN: 978-81-322-1000-9
eBook Packages: Engineering, Engineering (R0)
