Routing Based Context Selection for Document-Level Neural Machine Translation

  • Conference paper
Machine Translation (CCMT 2021)

Part of the book series: Communications in Computer and Information Science (CCIS, volume 1464)


Abstract

Most existing methods for document-level neural machine translation (NMT) integrate additional textual information by extending the scope of sentence encoding. These methods typically incorporate sentence-level representations via attention or gating mechanisms, which is straightforward but coarse-grained: it is difficult to distinguish useful contextual information from noise. Furthermore, the longer the encoded sequence, the harder it is for the model to capture inter-sentence dependencies. In this paper, we present a document-level NMT method based on a routing algorithm that selects context information automatically. The routing mechanism gives the current source sentence the ability to decide which words become its context. As a result, the method merges inter-sentence dependencies in a more flexible and elegant way, and models local structural information more effectively. This structured information selection also alleviates the problems that can arise from long-distance encoding. Experimental results show that our method outperforms the Transformer model by 2.91 BLEU on the public ZH-EN dataset and is superior to most state-of-the-art document-level NMT models.
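The core idea of the abstract, letting the current source sentence decide which context words to keep rather than attending to all of them, can be illustrated with a minimal scoring-and-selection sketch. All names here are illustrative, and this single softmax pass stands in for the paper's actual routing algorithm, which is iterative in the spirit of capsule-network routing:

```python
import math

def route_context(sent_repr, ctx_word_reprs):
    """Sketch of routing-based context selection.

    Each candidate context word vector is scored against the current
    sentence vector (dot product), the scores are normalized with a
    softmax, and only words whose routing weight exceeds the uniform
    prior 1/N are selected as context. This is a one-pass stand-in
    for the paper's iterative routing; names are illustrative.
    """
    # Dot-product score between the sentence and each context word.
    scores = [sum(s * c for s, c in zip(sent_repr, ctx))
              for ctx in ctx_word_reprs]
    # Numerically stable softmax over the candidate words.
    m = max(scores)
    exps = [math.exp(x - m) for x in scores]
    total = sum(exps)
    weights = [e / total for e in exps]
    # A word is routed into the context only if it scores above
    # the uniform prior, i.e. it is more relevant than average.
    uniform = 1.0 / len(weights)
    selected = [w > uniform for w in weights]
    return selected, weights

# A context word aligned with the sentence vector is kept; an
# orthogonal or opposing word is filtered out as noise.
sel, w = route_context([1.0, 0.0],
                       [[1.0, 0.0], [0.0, 1.0], [-1.0, 0.0]])
```

Because selection is driven by the current sentence rather than by a fixed context window, the same mechanism can admit a distant but relevant word while discarding an adjacent but irrelevant one, which is what distinguishes routing from plain window-based context encoding.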



Acknowledgments

The authors would like to thank the organizers of CCMT 2021 and the reviewers for their helpful suggestions. This research work is supported by the National Key Research and Development Program of China under Grant No. 2017YFB1002103.

Author information

Corresponding author: Ping Jian.


Copyright information

© 2021 Springer Nature Singapore Pte Ltd.

About this paper

Cite this paper

Fei, W., Jian, P., Zhu, X., Lin, Y. (2021). Routing Based Context Selection for Document-Level Neural Machine Translation. In: Su, J., Sennrich, R. (eds) Machine Translation. CCMT 2021. Communications in Computer and Information Science, vol 1464. Springer, Singapore. https://doi.org/10.1007/978-981-16-7512-6_7

  • DOI: https://doi.org/10.1007/978-981-16-7512-6_7

  • Publisher Name: Springer, Singapore

  • Print ISBN: 978-981-16-7511-9

  • Online ISBN: 978-981-16-7512-6

  • eBook Packages: Computer Science (R0)
