Abstract
Recurrent neural networks (RNNs) have been broadly applied to natural language processing (NLP) problems. This kind of neural network is designed for modeling sequential data and has proven effective in sequence tagging tasks. In this paper, we propose to use a bi-directional RNN with long short-term memory (LSTM) units for Chinese word segmentation, a crucial task for modeling Chinese sentences and articles. Classical methods focus on designing and combining hand-crafted features from context, whereas the bi-directional LSTM network (BLSTM) requires no prior knowledge or feature engineering and excels at building hierarchical representations of contextual information from both directions. Experimental results show that our approach achieves state-of-the-art word segmentation performance on both traditional Chinese and simplified Chinese datasets.
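To make the setup concrete, the sketch below frames Chinese word segmentation as per-character B/M/E/S tagging (begin, middle, end of a word, or single-character word) with a bi-directional LSTM over character embeddings. This is a common formulation of the task, not necessarily the paper's exact configuration: the hyperparameters, tag scheme, dropout rate, and toy vocabulary here are illustrative assumptions.

import torch
import torch.nn as nn

class BLSTMSegmenter(nn.Module):
    """Minimal BLSTM character tagger for word segmentation (illustrative)."""

    def __init__(self, vocab_size, embed_dim=128, hidden_dim=256, num_tags=4):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, embed_dim)
        # bidirectional=True runs a forward and a backward LSTM over the
        # sequence and concatenates their hidden states, so each character's
        # representation sees context from both directions.
        self.blstm = nn.LSTM(embed_dim, hidden_dim, batch_first=True,
                             bidirectional=True)
        self.dropout = nn.Dropout(0.5)           # assumed rate, for regularization
        self.proj = nn.Linear(2 * hidden_dim, num_tags)  # scores for B, M, E, S

    def forward(self, char_ids):
        x = self.embed(char_ids)                 # (batch, seq_len, embed_dim)
        h, _ = self.blstm(x)                     # (batch, seq_len, 2*hidden_dim)
        return self.proj(self.dropout(h))        # per-character tag scores

# Toy usage: predict a tag per character, then cut words at E and S tags.
model = BLSTMSegmenter(vocab_size=5000)          # hypothetical vocabulary size
chars = torch.randint(0, 5000, (1, 6))           # a fake 6-character sentence
tags = model(chars).argmax(dim=-1)               # (1, 6) B/M/E/S indices

In practice the tag scores would be trained with a cross-entropy loss against gold B/M/E/S labels, and decoding can be as simple as the argmax above or constrained so that tag transitions remain legal (e.g. B must be followed by M or E).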