
Tree Similarity Measurement for Classifying Questions by Syntactic Structures

  • Conference paper
Intelligent Computing Methodologies (ICIC 2016)

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 9773)

Abstract

Question classification plays a key role in question answering systems, because the predicted question class helps to locate correct answers effectively. This paper addresses the problem of classifying questions by their syntactic structure. To this end, each question is converted into a parse tree, and each parse tree is represented as a multi-dimensional sequence (MDS). Based on this transformation from questions to MDSs, a new similarity measurement for comparing questions through their MDS representations is presented. The new measurement, based on all common subsequences, is proved to be a kernel and can be computed in quadratic time. Experiments with kNN and SVM classifiers show that the proposed method is competitive in terms of classification accuracy and efficiency.
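
To make the idea of a subsequence-based kernel with quadratic cost concrete, the sketch below implements the classical all-subsequences kernel for ordinary one-dimensional sequences, as described in the kernel methods literature. It is an illustration only and is not the MDS-based measurement proposed in the paper, whose definition is not given in the abstract. The function names and the flat tag sequences in the usage note are assumptions made for this example.

```python
from math import sqrt
from typing import Hashable, Sequence


def all_subsequences_kernel(s: Sequence[Hashable], t: Sequence[Hashable]) -> int:
    """Count pairs of matching subsequence occurrences in s and t.

    A pair consists of one choice of positions in s and one choice of
    positions in t that spell out the same symbols; the empty subsequence
    counts once. This is the standard all-subsequences kernel, NOT the
    paper's MDS measurement (assumption: one-dimensional simplification).
    """
    m, n = len(s), len(t)
    # k[i][j] is the kernel value for the prefixes s[:i] and t[:j].
    # Any prefix shares exactly the empty subsequence with an empty prefix.
    k = [[1] * (n + 1) for _ in range(m + 1)]
    for i in range(1, m + 1):
        for j in range(1, n + 1):
            # Pairs that leave out the last symbol of s or of t
            # (inclusion-exclusion over the two cases).
            k[i][j] = k[i - 1][j] + k[i][j - 1] - k[i - 1][j - 1]
            # Pairs that end on both last symbols exist only if they match.
            if s[i - 1] == t[j - 1]:
                k[i][j] += k[i - 1][j - 1]
    return k[m][n]


def normalized_similarity(s: Sequence[Hashable], t: Sequence[Hashable]) -> float:
    """Cosine-style normalization so that identical sequences score 1.0."""
    denom = sqrt(all_subsequences_kernel(s, s) * all_subsequences_kernel(t, t))
    return all_subsequences_kernel(s, t) / denom


if __name__ == "__main__":
    # Hypothetical flat tag sequences, loosely in Penn Treebank style.
    q1 = ["WHNP", "SQ", "."]
    q2 = ["WHADVP", "SQ", "."]
    print(all_subsequences_kernel(q1, q2), normalized_similarity(q1, q2))
```

The double loop fills an (m+1) by (n+1) table once, so the cost is proportional to the product of the two sequence lengths. A matrix of such values over a training set could, for instance, be passed to an SVM implementation that accepts precomputed kernels, or the normalized score could serve as the similarity in a kNN classifier.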

Notes

  1. In the trees, SBARQ, WHNP, etc. are tags defined in Penn Treebank II.

  2. Available at http://l2r.cs.uiuc.edu/~cogcomp/Data/QA/QC/.

Acknowledgments

The authors thank the anonymous reviewers for their helpful comments, which pointed out relevant literature and a number of flaws in the submission. This work was partially sponsored by the EU DESIREE project (http://www.desiree-project.eu/).

Author information

Corresponding author

Correspondence to Zhiwei Lin.

Copyright information

© 2016 Springer International Publishing Switzerland

About this paper

Cite this paper

Lin, Z., Wang, H., McClean, S. (2016). Tree Similarity Measurement for Classifying Questions by Syntactic Structures. In: Huang, DS., Han, K., Hussain, A. (eds) Intelligent Computing Methodologies. ICIC 2016. Lecture Notes in Computer Science, vol 9773. Springer, Cham. https://doi.org/10.1007/978-3-319-42297-8_36

  • DOI: https://doi.org/10.1007/978-3-319-42297-8_36

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-42296-1

  • Online ISBN: 978-3-319-42297-8

  • eBook Packages: Computer Science, Computer Science (R0)
