Tree Similarity Measurement for Classifying Questions by Syntactic Structures

Lin, Zhiwei; Wang, Hui; McClean, Sally

doi:10.1007/978-3-319-42297-8_36

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 9773)

Included in the following conference series:

International Conference on Intelligent Computing

Abstract

Question classification plays a key role in question answering systems, because the classification result helps locate correct answers effectively. This paper addresses the problem of question classification by syntactic structure. To this end, each question is converted into a parse tree, and each parse tree is represented as a multi-dimensional sequence (MDS). Under this transformation from questions to MDSs, a new similarity measurement for comparing questions through their MDS representations is presented. The new measurement, based on all common subsequences, is proved to be a kernel and can be computed in quadratic time. Experiments with kNN and SVM classifiers show that the proposed method is competitive in terms of classification accuracy and efficiency.
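
As a rough illustration of how a similarity built on common subsequences can be computed in quadratic time, the sketch below counts matching subsequence pairs between two ordinary one-dimensional sequences with a dynamic program. It is not the MDS measurement of the paper, whose definition is not given in the abstract. It uses the standard all-subsequences kernel recurrence instead, and the function names `count_common_subsequences` and `normalized_similarity`, as well as the tag sequences, are made up for this example.

```python
import math
from typing import Sequence


def count_common_subsequences(s: Sequence[str], t: Sequence[str]) -> int:
    """Count pairs of index selections (one in s, one in t) that spell the
    same subsequence, the empty subsequence included.

    Runs in O(len(s) * len(t)) time. This is the textbook all-subsequences
    kernel on flat sequences, used here as an assumption-laden stand-in.
    """
    m, n = len(s), len(t)
    # d[i][j] holds the count for the prefixes s[:i] and t[:j].
    # Row 0 and column 0 stay at 1: only the empty subsequence matches.
    d = [[1] * (n + 1) for _ in range(m + 1)]
    for i in range(1, m + 1):
        for j in range(1, n + 1):
            if s[i - 1] == t[j - 1]:
                # Inclusion-exclusion for pairs not ending at both (i, j),
                # plus d[i-1][j-1] new pairs that end with this match.
                d[i][j] = d[i - 1][j] + d[i][j - 1]
            else:
                d[i][j] = d[i - 1][j] + d[i][j - 1] - d[i - 1][j - 1]
    return d[m][n]


def normalized_similarity(s: Sequence[str], t: Sequence[str]) -> float:
    """Cosine-style normalisation so that identical sequences score 1.0."""
    k_st = count_common_subsequences(s, t)
    k_ss = count_common_subsequences(s, s)
    k_tt = count_common_subsequences(t, t)
    return k_st / math.sqrt(k_ss * k_tt)


if __name__ == "__main__":
    # Hypothetical tag sequences, not taken from the paper's data.
    q1 = ["WHNP", "SQ", "VP"]
    q2 = ["WHNP", "SQ", "NP"]
    print(count_common_subsequences(q1, q2))
    print(normalized_similarity(q1, q2))
```

For the two hypothetical sequences above, the matching pairs are the empty subsequence, WHNP, SQ, and WHNP SQ, so the count is 4 and the normalised score is 0.5. A matrix of such scores could in principle be passed to a kernel classifier or used as the similarity in kNN, which is the general setting the abstract describes.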

Notes

1. In the trees, SBARQ, WHNP, etc. are tags defined in Penn Treebank II.
2. Available at http://l2r.cs.uiuc.edu/~cogcomp/Data/QA/QC/.

Acknowledgments

The authors thank the anonymous reviewers for their helpful comments on this paper, which pointed out relevant literature and a number of flaws in the submission. This paper is partially sponsored by the EU DESIREE project (http://www.desiree-project.eu/).

Author information

Authors and Affiliations

Faculty of Computing and Engineering, Ulster University, Coleraine, UK
Zhiwei Lin, Hui Wang & Sally McClean

Corresponding author

Correspondence to Zhiwei Lin.

Editor information

Editors and Affiliations

Tongji University, Shanghai, China
De-Shuang Huang
Inha University, Incheon, Korea (Republic of)
Kyungsook Han
Liverpool John Moores University, Liverpool, United Kingdom
Abir Hussain

About this paper

Cite this paper

Lin, Z., Wang, H., McClean, S. (2016). Tree Similarity Measurement for Classifying Questions by Syntactic Structures. In: Huang, DS., Han, K., Hussain, A. (eds) Intelligent Computing Methodologies. ICIC 2016. Lecture Notes in Computer Science, vol 9773. Springer, Cham. https://doi.org/10.1007/978-3-319-42297-8_36

DOI: https://doi.org/10.1007/978-3-319-42297-8_36
Published: 12 July 2016
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-42296-1
Online ISBN: 978-3-319-42297-8
eBook Packages: Computer Science, Computer Science (R0)
