Abstract
This paper addresses two relatively understudied problems in Textual CBR (TCBR): visualizing textual case bases and evaluating their complexity. Visualization is useful in case base maintenance; complexity evaluation helps in making informed choices about case base representation and parameter tuning for the TCBR system, and in explaining the behaviour of different retrieval and classification techniques over diverse case bases. We present an approach to visualize textual case bases by “stacking” similar cases and features close to each other in an image derived from the case-feature matrix. We propose a complexity measure called GAME that exploits regularities in stacked images to evaluate the alignment between problem and solution components of cases. GAME_class, a counterpart of GAME in classification domains, shows a strong correspondence with accuracies reported by standard classifiers over classification tasks of varying complexity.
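As a rough illustration of what “stacking” might look like, the sketch below reorders the rows (cases) and columns (features) of a tiny binary case-feature matrix so that each one sits next to its most similar unplaced neighbour. This is not the paper's algorithm, which the abstract does not specify: the cosine similarity, the greedy nearest-neighbour ordering, the toy matrix, and the names `cosine`, `greedy_order` and `stack` are all assumptions made for illustration.

```python
from math import sqrt


def cosine(u, v):
    """Cosine similarity between two equal-length numeric vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = sqrt(sum(a * a for a in u)) * sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def greedy_order(vectors):
    """Hypothetical ordering heuristic: start at index 0, then repeatedly
    append the unplaced vector most similar to the last placed one.
    Ties are broken in favour of the lowest index."""
    if not vectors:
        return []
    order = [0]
    remaining = set(range(1, len(vectors)))
    while remaining:
        last = vectors[order[-1]]
        nxt = max(sorted(remaining), key=lambda i: cosine(last, vectors[i]))
        order.append(nxt)
        remaining.remove(nxt)
    return order


def stack(matrix):
    """Reorder rows (cases) and columns (features) of a case-feature
    matrix so that similar rows and similar columns become adjacent."""
    row_order = greedy_order(matrix)
    columns = [list(col) for col in zip(*matrix)]
    col_order = greedy_order(columns)
    stacked = [[matrix[r][c] for c in col_order] for r in row_order]
    return stacked, row_order, col_order


# Toy case base (assumed data): cases 0 and 2 share features 0 and 2,
# cases 1 and 3 share features 1 and 3.
toy = [
    [1, 0, 1, 0],
    [0, 1, 0, 1],
    [1, 0, 1, 0],
    [0, 1, 0, 1],
]

stacked, row_order, col_order = stack(toy)
for row in stacked:
    # Render the stacked matrix as a crude text "image".
    print("".join("#" if x else "." for x in row))
```

On this toy input the interleaved pattern is rearranged into two solid blocks, which is the kind of regularity a measure over stacked images could exploit. A greedy ordering is only one of many possible heuristics and gives no optimality guarantee.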
© 2008 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Chakraborti, S., Cerviño Beresi, U., Wiratunga, N., Massie, S., Lothian, R., Khemani, D. (2008). Visualizing and Evaluating Complexity of Textual Case Bases. In: Althoff, KD., Bergmann, R., Minor, M., Hanft, A. (eds) Advances in Case-Based Reasoning. ECCBR 2008. Lecture Notes in Computer Science, vol 5239. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-85502-6_7
DOI: https://doi.org/10.1007/978-3-540-85502-6_7
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-85501-9
Online ISBN: 978-3-540-85502-6
eBook Packages: Computer Science (R0)