A Study of Automatic Speech Recognition in Noisy Classroom Environments for Automated Dialog Analysis

  • Conference paper
Artificial Intelligence in Education (AIED 2015)

Abstract

Developing large-scale systems for automatic classroom dialog analysis requires accurate speech-to-text transcription. Several automatic speech recognition (ASR) engines were evaluated for this purpose, using recordings of teachers in noisy classrooms for testing. Google Speech and Bing Speech were the most accurate, with word accuracy scores of 0.56 and 0.52, respectively, compared to 0.41 for AT&T Watson, 0.14 for Sphinx with the HUB4 model, 0.08 for Microsoft, and 0.00 for Sphinx with the WSJ model. Further analysis revealed that both the Google and Bing engines were largely unaffected by speakers, class sessions, and speech characteristics. Bing results were validated across speakers in a laboratory study, and a method of improving Bing results is presented. The results characterize the capabilities of contemporary ASR engines in noisy classroom environments and highlight issues to be aware of when selecting an ASR engine for difficult speech recognition tasks.
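The abstract reports word accuracy scores but does not define the metric. As an illustration only, the sketch below assumes the common definition of word accuracy as 1 minus the word error rate (word-level edit distance divided by the number of reference words), clipped at 0. The function names, the lowercasing, and the whitespace tokenization are assumptions made for this example, not the authors' procedure.

```python
def word_errors(reference, hypothesis):
    """Return (edit_distance, reference_length) at the word level.

    Assumption: transcripts are lowercased and split on whitespace.
    """
    ref = reference.lower().split()
    hyp = hypothesis.lower().split()
    # prev[j] holds the edit distance between the reference words seen
    # so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, 1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, 1):
            substitution = prev[j - 1] + (0 if ref_word == hyp_word else 1)
            deletion = prev[j] + 1       # reference word missing from hypothesis
            insertion = curr[j - 1] + 1  # extra word in hypothesis
            curr.append(min(substitution, deletion, insertion))
        prev = curr
    return prev[-1], len(ref)


def word_accuracy(reference, hypothesis):
    """Word accuracy as 1 - WER, clipped at 0 (assumed definition)."""
    errors, n_ref = word_errors(reference, hypothesis)
    if n_ref == 0:
        raise ValueError("reference transcript is empty")
    return max(0.0, 1.0 - errors / n_ref)


if __name__ == "__main__":
    # Hypothetical transcripts, not taken from the study's data.
    print(word_accuracy("open your books to page ten", "open your book to page"))
```

Under this assumed definition, a hypothesis with many inserted words can push the raw value below zero, which is why the sketch clips at 0.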

Author information

Corresponding author

Correspondence to Nathaniel Blanchard.

Copyright information

© 2015 Springer International Publishing Switzerland

About this paper

Cite this paper

Blanchard, N. et al. (2015). A Study of Automatic Speech Recognition in Noisy Classroom Environments for Automated Dialog Analysis. In: Conati, C., Heffernan, N., Mitrovic, A., Verdejo, M. (eds) Artificial Intelligence in Education. AIED 2015. Lecture Notes in Computer Science, vol 9112. Springer, Cham. https://doi.org/10.1007/978-3-319-19773-9_3

  • DOI: https://doi.org/10.1007/978-3-319-19773-9_3

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-19772-2

  • Online ISBN: 978-3-319-19773-9

  • eBook Packages: Computer Science, Computer Science (R0)
