Neural Name Translation Improves Neural Machine Translation

Li, Xiaoqing; Yan, Jinghui; Zhang, Jiajun; Zong, Chengqing

doi:10.1007/978-981-13-3083-4_9


Part of the book series: Communications in Computer and Information Science (CCIS, volume 954)

Included in the following conference series:

China Workshop on Machine Translation


Abstract

To control computational complexity, neural machine translation (NMT) systems convert all rare words outside the vocabulary into a single unk symbol. A previous solution (Luong et al. [1]) uses multiple numbered unk symbols to learn the correspondence between source and target rare words. However, this method cannot handle test words that never appear in the training corpus, and it also suffers from noisy word alignment. In this paper, we focus on a major type of rare word, the named entity (NE), and propose to translate NEs with a character-level sequence-to-sequence model. The NE translation model is further used to derive high-quality NE alignments in the bilingual training corpus. With the NE translation and alignment modules integrated, our NMT system surpasses the baseline system by 2.9 BLEU points on the Chinese-to-English task.
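The rare-word problem described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names (`build_vocab`, `replace_rare`, `to_characters`), the toy corpus, and the `_` word-boundary marker are assumptions made for illustration only. The sketch shows how a frequency-limited vocabulary collapses different names into the same unk symbol, and how a name might be split into characters before being passed to a character-level model.

```python
from collections import Counter

UNK = "unk"  # single placeholder symbol for out-of-vocabulary words


def build_vocab(sentences, max_size):
    """Keep the max_size most frequent tokens (hypothetical helper)."""
    counts = Counter(tok for sent in sentences for tok in sent)
    return {tok for tok, _ in counts.most_common(max_size)}


def replace_rare(sentence, vocab):
    """Map every token outside the vocabulary to the unk symbol."""
    return [tok if tok in vocab else UNK for tok in sentence]


def to_characters(name):
    """Split a name into characters; '_' marks spaces (an assumed convention)."""
    return ["_" if ch == " " else ch for ch in name]


# Toy corpus: the two names are rare, the remaining words are frequent.
corpus = [
    ["the", "president", "met", "obama"],
    ["the", "president", "met", "merkel"],
    ["the", "president", "spoke"],
]
vocab = build_vocab(corpus, max_size=3)

# Both names become the same symbol, so their identity is lost.
first = replace_rare(corpus[0], vocab)
second = replace_rare(corpus[1], vocab)

# A character sequence keeps the information a name translator would need.
chars = to_characters("Angela Merkel")
```

After the replacement, `first` and `second` are identical sequences ending in `unk`, which is why a word-level system with a single unk symbol cannot tell the two names apart, whereas the character sequence preserves each name in full.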


References

1. Luong, M.T., Sutskever, I., Le, Q.V., Vinyals, O., Zaremba, W.: Addressing the rare word problem in neural machine translation. arXiv preprint arXiv:1410.8206 (2014)
2. Kalchbrenner, N., Blunsom, P.: Recurrent continuous translation models. In: Proceedings of the 2013 Conference on Empirical Methods in Natural Language Processing, pp. 1700–1709 (2013)
3. Cho, K., et al.: Learning phrase representations using RNN encoder-decoder for statistical machine translation. arXiv preprint arXiv:1406.1078 (2014)
4. Sutskever, I., Vinyals, O., Le, Q.V.: Sequence to sequence learning with neural networks. In: Advances in Neural Information Processing Systems, pp. 3104–3112 (2014)
5. Bahdanau, D., Cho, K., Bengio, Y.: Neural machine translation by jointly learning to align and translate. arXiv preprint arXiv:1409.0473 (2014)
6. Huang, F., Vogel, S., Waibel, A.: Automatic extraction of named entity translingual equivalence based on multi-feature cost minimization. In: Proceedings of the ACL 2003 Workshop on Multilingual and Mixed-language Named Entity Recognition, pp. 9–16 (2003)
7. Feng, D., Lv, Y., Zhou, M.: A new approach for English-Chinese named entity alignment. In: Proceedings of the 2004 Conference on Empirical Methods in Natural Language Processing, pp. 372–379 (2004)
8. Papineni, K., Roukos, S., Ward, T., Zhu, W.J.: BLEU: a method for automatic evaluation of machine translation. In: Proceedings of the 40th Annual Meeting on Association for Computational Linguistics, pp. 311–318. Association for Computational Linguistics (2002)
9. Finkel, J.R., Grenager, T., Manning, C.: Incorporating non-local information into information extraction systems by Gibbs sampling. In: Proceedings of the 43rd Annual Meeting on Association for Computational Linguistics, pp. 363–370. Association for Computational Linguistics (2005)
10. Jean, S., Cho, K., Memisevic, R., Bengio, Y.: On using very large target vocabulary for neural machine translation. arXiv preprint arXiv:1412.2007 (2014)
11. Sennrich, R., Haddow, B., Birch, A.: Neural machine translation of rare words with subword units. arXiv preprint arXiv:1508.07909 (2015)
12. Ling, W., Trancoso, I., Dyer, C., Black, A.W.: Character-based neural machine translation. arXiv preprint arXiv:1511.04586 (2015)
13. Knight, K., Graehl, J.: Machine transliteration. Comput. Linguist. 24(4), 599–612 (1998)
14. Li, H., Zhang, M., Su, J.: A joint source-channel model for machine transliteration. In: Proceedings of the 42nd Annual Meeting on Association for Computational Linguistics, pp. 159–166 (2004)
15. Freitag, D., Khadivi, S.: A sequence alignment model based on the averaged perceptron. In: Proceedings of the 2007 Joint Conference on Empirical Methods in Natural Language Processing and Computational Natural Language Learning, pp. 238–247 (2007)
16. Deselaers, T., Hasan, S., Bender, O., Ney, H.: A deep learning approach to machine transliteration. In: Proceedings of the Fourth Workshop on Statistical Machine Translation, Association for Computational Linguistics, pp. 233–241 (2009)
17. Hermjakob, U., Knight, K., Daumé III, H.: Name translation in statistical machine translation - learning when to transliterate. In: Proceedings of ACL 2008: HLT, pp. 389–397 (2008)
18. Li, H., Zheng, J., Ji, H., Li, Q., Wang, W.: Name-aware machine translation. In: Proceedings of the 51st Annual Meeting of the Association for Computational Linguistics, pp. 604–614 (2013)
19. Zhang, J., Zong, C., Li, S.: Sentence type based reordering model for statistical machine translation. In: Proceedings of the 22nd International Conference on Computational Linguistics, Vol. 1, pp. 1089–1096. Association for Computational Linguistics (2008)
20. Zhang, J., Zong, C.: Bridging neural machine translation and bilingual dictionaries. arXiv preprint arXiv:1610.07272 (2016)
21. Cambria, E., Hussain, A., Durrani, T., Zhang, J.: Towards a Chinese common and common sense knowledge base for sentiment analysis. In: Jiang, H., Ding, W., Ali, M., Wu, X. (eds.) IEA/AIE 2012. LNCS (LNAI), vol. 7345, pp. 437–446. Springer, Heidelberg (2012). https://doi.org/10.1007/978-3-642-31087-4_46


Acknowledgement

The research work described in this paper has been supported by the National Key Research and Development Program of China under Grant No. 2016QY02D0303 and the Natural Science Foundation of China under Grant No. 61673380.

Author information

Authors and Affiliations

National Laboratory of Pattern Recognition, Institute of Automation, CAS, Beijing, China
Xiaoqing Li, Jiajun Zhang & Chengqing Zong
University of Chinese Academy of Sciences, Beijing, China
Xiaoqing Li, Jiajun Zhang & Chengqing Zong
CAS Center for Excellence in Brain Science and Intelligence Technology, Beijing, China
Chengqing Zong
Beijing Jiaotong University, Beijing, China
Jinghui Yan

Corresponding author

Correspondence to Jinghui Yan.

Editor information

Editors and Affiliations

Department of Computer Science and Technology, Nanjing University, Nanjing, China
Jiajun Chen
National Laboratory of Pattern Recognition, Chinese Academy of Sciences, Beijing, China
Jiajun Zhang

Cite this paper

Li, X., Yan, J., Zhang, J., Zong, C. (2019). Neural Name Translation Improves Neural Machine Translation. In: Chen, J., Zhang, J. (eds) Machine Translation. CWMT 2018. Communications in Computer and Information Science, vol 954. Springer, Singapore. https://doi.org/10.1007/978-981-13-3083-4_9


DOI: https://doi.org/10.1007/978-981-13-3083-4_9
Published: 09 January 2019
Publisher Name: Springer, Singapore
Print ISBN: 978-981-13-3082-7
Online ISBN: 978-981-13-3083-4
eBook Packages: Computer Science, Computer Science (R0)
