
A Novel Architecture with Separate Comparison and Interaction Modules for Chinese Semantic Sentence Matching

Published in Neural Processing Letters

Abstract

In Chinese semantic sentence matching, existing models use a single shared architecture both to distinguish semantic differences and to extract interaction information. However, this not only introduces substantial redundant information but also makes the model heavier and more complicated. To alleviate this, this paper presents SNMA, a deep architecture in which the comparison and interaction modules are separated. SNMA uses a Siamese network to extract context information and employs a multi-head attention mechanism to extract interaction information from sentence pairs separately. Experimental results on four recent Chinese sentence matching datasets demonstrate the effectiveness of our approach.
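The abstract does not spell out SNMA's exact layers, but the separation it describes — a shared-weight (Siamese) encoder producing comparable context representations, a comparison module contrasting them, and a distinct multi-head attention module capturing cross-sentence interaction — can be sketched as follows. All layer shapes, the pooling choice, and the absence of learned attention projections are simplifying assumptions for illustration, not the paper's actual configuration.

```python
import numpy as np

rng = np.random.default_rng(0)

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

class SiameseEncoder:
    """Shared-weight context encoder: both sentences pass through the
    same projection, so their representations live in one space."""
    def __init__(self, d_in, d_out):
        self.W = rng.normal(scale=0.1, size=(d_in, d_out))

    def __call__(self, x):          # x: (seq_len, d_in)
        return np.tanh(x @ self.W)  # (seq_len, d_out)

def multi_head_attention(q, k, v, n_heads):
    """Scaled dot-product attention split across n_heads.
    (No learned per-head projections here, for brevity.)"""
    d = q.shape[-1] // n_heads
    outs = []
    for h in range(n_heads):
        qs, ks, vs = (t[:, h * d:(h + 1) * d] for t in (q, k, v))
        scores = softmax(qs @ ks.T / np.sqrt(d))  # (len_q, len_k)
        outs.append(scores @ vs)                  # (len_q, d)
    return np.concatenate(outs, axis=-1)          # (len_q, d_model)

# Two "sentences" as sequences of word embeddings.
d_model = 8
s1 = rng.normal(size=(5, d_model))  # 5 tokens
s2 = rng.normal(size=(7, d_model))  # 7 tokens

enc = SiameseEncoder(d_model, d_model)  # one encoder, weights shared
h1, h2 = enc(s1), enc(s2)

# Comparison module: contrast pooled context vectors of the two sentences.
comparison = np.abs(h1.mean(axis=0) - h2.mean(axis=0))  # (d_model,)

# Interaction module: multi-head cross-attention from sentence 1 over sentence 2,
# computed separately from the comparison path.
interaction = multi_head_attention(h1, h2, h2, n_heads=2)  # (5, d_model)

print(comparison.shape)   # (8,)
print(interaction.shape)  # (5, 8)
```

The point of the separation is visible in the last two lines: the comparison path and the interaction path consume the same Siamese encodings but never share computation, so neither has to carry information the other needs.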



Acknowledgements

Thanks to Yinxiang Xu for valuable discussion and to Xiaoning Song for helping us to polish the paper during the revision process.

Author information

Corresponding author

Correspondence to Jun Sun.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.


About this article


Cite this article

Chen, Q., Sun, J. & Zhao, Y. A Novel Architecture with Separate Comparison and Interaction Modules for Chinese Semantic Sentence Matching. Neural Process Lett 53, 3677–3692 (2021). https://doi.org/10.1007/s11063-021-10561-3

