Attention Window Aware Encoder-Decoder Model for Spoken Language Understanding

  • Conference paper
Advances in Multimedia Information Processing – PCM 2017 (PCM 2017)

Part of the book series: Lecture Notes in Computer Science (LNISA, volume 10736)

Abstract

Slot filling, which aims to predict the semantic slot label for each word in a word sequence, is one of the main tasks in Spoken Language Understanding (SLU). In this paper, we propose a variant of the encoder-decoder model for sequence labelling. To better exploit label dependencies and prevent overfitting, we use a Long Short-Term Memory (LSTM) network as the encoder and a Gated Recurrent Unit (GRU) as the decoder. We further enhance the model with an attention mechanism that introduces the attention window as a novel feature, reflecting a particularity of the slot filling task: each target label corresponds to specific words, and hence to specific hidden units, in the encoder. We evaluate the proposed model on the standard ATIS corpus with attention windows of different sizes; the trends in the results across window sizes indicate the potential of the attention window feature.
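
The paper's code is not shown on this page, but the following minimal sketch illustrates the core idea of the attention window: at each decoding step, attention is computed only over the encoder hidden states inside a fixed-size window around the current position, rather than over the whole sequence. The sketch assumes PyTorch; the module name, the dot-product scoring, and the way the previous label embedding and context vector feed the GRU cell are illustrative assumptions, not the authors' implementation.

# Minimal sketch, not the authors' code: one GRU decoder step with
# attention restricted to a window of LSTM encoder states around position t.
import torch
import torch.nn as nn
import torch.nn.functional as F

class WindowedAttentionDecoderStep(nn.Module):
    def __init__(self, hidden_size, num_labels, window_size=3):
        super().__init__()
        self.window_size = window_size  # encoder positions kept on each side
        # Input: previous label embedding concatenated with the context vector.
        self.gru = nn.GRUCell(hidden_size * 2, hidden_size)
        self.out = nn.Linear(hidden_size, num_labels)

    def forward(self, prev_label_emb, dec_hidden, enc_states, t):
        # enc_states: (seq_len, hidden_size) hidden states from the LSTM encoder.
        lo = max(0, t - self.window_size)
        hi = min(enc_states.size(0), t + self.window_size + 1)
        window = enc_states[lo:hi]          # attend only to nearby encoder states
        scores = window @ dec_hidden        # dot-product alignment scores
        weights = F.softmax(scores, dim=0)  # attention weights over the window
        context = (weights.unsqueeze(1) * window).sum(dim=0)
        gru_in = torch.cat([prev_label_emb, context]).unsqueeze(0)
        dec_hidden = self.gru(gru_in, dec_hidden.unsqueeze(0)).squeeze(0)
        return self.out(dec_hidden), dec_hidden  # slot-label logits, new state

# Example: predict the slot label for the word at position t = 4 of a
# 10-word utterance (127 is the slot-label count of the standard ATIS set).
step = WindowedAttentionDecoderStep(hidden_size=64, num_labels=127)
enc_states = torch.randn(10, 64)
logits, h = step(torch.randn(64), torch.zeros(64), enc_states, t=4)

With window_size = 3, each label is predicted from at most seven encoder states centred on the current word; varying this size corresponds to the different attention window sizes the abstract reports testing on ATIS.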

Acknowledgement

This work was partially supported by the National Natural Science Foundation of China (No. 61332018).

Author information

Correspondence to Wenge Rong.

Copyright information

© 2018 Springer International Publishing AG, part of Springer Nature

About this paper

Cite this paper

Wang, Y., Rong, W., Liu, J., Han, J., Xiong, Z. (2018). Attention Window Aware Encoder-Decoder Model for Spoken Language Understanding. In: Zeng, B., Huang, Q., El Saddik, A., Li, H., Jiang, S., Fan, X. (eds) Advances in Multimedia Information Processing – PCM 2017. PCM 2017. Lecture Notes in Computer Science, vol 10736. Springer, Cham. https://doi.org/10.1007/978-3-319-77383-4_37

  • DOI: https://doi.org/10.1007/978-3-319-77383-4_37

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-77382-7

  • Online ISBN: 978-3-319-77383-4

  • eBook Packages: Computer Science, Computer Science (R0)
