Abstract
Case-Based Reasoning (CBR) solves problems by reusing past problem-solving experiences maintained in a casebase. The casebase is therefore the key CBR knowledge container. However, further containers, such as similarity, reuse and revision knowledge, are also crucial. Automated acquisition approaches are particularly attractive for discovering knowledge for such containers. The majority of research in this area focuses on introspective algorithms that extract knowledge from within the casebase. However, the rapid increase in Web applications has resulted in large volumes of user-generated experiential content, which forms a valuable source of background knowledge for CBR system development. In this paper we present a novel approach to acquiring knowledge from Web pages. The primary knowledge structure is a dynamically generated taxonomy which, once created, can be used during the retrieve and reuse stages of the CBR cycle. Importantly, this taxonomy is pruned according to a clustering-based sense disambiguation heuristic that uses similarity over the solution vocabulary of cases. The algorithms presented in the paper are applied to several online FAQ systems consisting of textual problem-solving cases. The quality of the generated taxonomies is evidenced by improved semantic comparison of text: successful sense disambiguation results in higher retrieval accuracy. Our results show significant improvements over standard text comparison alternatives.
Supported by the Spanish Ministry of Science and Education (TIN2009-13692-C03-03) and the British Council grant (UKIERI RGU-IITM-0607-168E).
Copyright information
© 2010 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Recio-Garcia, J.A., Wiratunga, N. (2010). Taxonomic Semantic Indexing for Textual Case-Based Reasoning. In: Bichindaritz, I., Montani, S. (eds) Case-Based Reasoning. Research and Development. ICCBR 2010. Lecture Notes in Computer Science, vol 6176. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-14274-1_23
DOI: https://doi.org/10.1007/978-3-642-14274-1_23
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-14273-4
Online ISBN: 978-3-642-14274-1
eBook Packages: Computer Science, Computer Science (R0)