Abstract
Acoustic models based on Long Short-Term Memory (LSTM) networks have further improved speech recognition. The connectionist temporal classification (CTC) training method, however, performs better at mapping speech directly to its phoneme sequence or bound sequence. This paper combines CTC and LSTM to build a speech recognition model for power dispatching and compares the LSTM-CTC method experimentally with traditional GMM-HMM methods, RNN-based speech recognition methods, and unidirectional LSTM networks. The results show that the LSTM-CTC framework achieves higher precision than the other methods and generalizes well, which makes it better suited to speech recognition in power dispatching.
Acknowledgement
This paper presents part of the research results of the project ‘Natural language processing and machine learning technology in the application research of dispatching operation (SGHZ0000DKJS1700141)’, which is supported by the Foundation of Central Branch of National Power Net.
Copyright information
© 2019 Springer Nature Switzerland AG
About this paper
Cite this paper
Dou, J. et al. (2019). Deep Learning of Intelligent Speech Recognition in Power Dispatching. In: Xhafa, F., Patnaik, S., Tavana, M. (eds) Advances in Intelligent, Interactive Systems and Applications. IISA 2018. Advances in Intelligent Systems and Computing, vol 885. Springer, Cham. https://doi.org/10.1007/978-3-030-02804-6_13
DOI: https://doi.org/10.1007/978-3-030-02804-6_13
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-02803-9
Online ISBN: 978-3-030-02804-6
eBook Packages: Intelligent Technologies and Robotics (R0)