Abstract
In this paper we address two symmetrical issues: the discovery of similarities among classification algorithms, and among datasets, both on the basis of error measures. We use these measures to define the error correlation between two algorithms and to determine the relative performance of a list of algorithms. The first is used to discover similarities between learners; both are used to discover similarities between datasets, and the latter sketch maps of the dataset space. Regions within each map exhibit specific patterns of error correlation or relative performance. To understand the factors that determine these regions, we describe them using simple characteristics of the datasets; each region is described in terms of the distributions of dataset characteristics within it.
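The notion of error correlation between two algorithms can be illustrated with a minimal sketch. It assumes one plausible reading of the abstract, namely that error correlation is the Pearson correlation of the two algorithms' per-instance 0/1 error vectors on a common test set; the paper's exact definition may differ, and the predictions below are purely hypothetical.

```python
import numpy as np

def error_correlation(y_true, pred_a, pred_b):
    """Pearson correlation of the per-instance 0/1 error vectors of two
    classifiers evaluated on the same instances (one hedged reading of
    'error correlation'; not necessarily the paper's exact measure)."""
    err_a = (np.asarray(pred_a) != np.asarray(y_true)).astype(float)
    err_b = (np.asarray(pred_b) != np.asarray(y_true)).astype(float)
    # A constant error vector (e.g. a perfect classifier) has zero
    # variance, so the correlation is undefined; report 0 in that case.
    if err_a.std() == 0 or err_b.std() == 0:
        return 0.0
    return float(np.corrcoef(err_a, err_b)[0, 1])

# Hypothetical predictions of two classifiers on ten test instances:
y      = [0, 1, 0, 1, 0, 1, 0, 1, 0, 1]
pred_a = [0, 1, 1, 1, 0, 0, 0, 1, 0, 1]  # errs on instances 2 and 5
pred_b = [0, 1, 1, 1, 0, 0, 1, 1, 0, 1]  # errs on instances 2, 5 and 6
print(error_correlation(y, pred_a, pred_b))
```

Because the two classifiers err on largely the same instances, the correlation is high (about 0.76); two learners with independent errors would score near zero, which is the property that makes such a measure usable for grouping similar algorithms.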
Cite this article
Kalousis, A., Gama, J. & Hilario, M. On Data and Algorithms: Understanding Inductive Performance. Machine Learning 54, 275–312 (2004). https://doi.org/10.1023/B:MACH.0000015882.38031.85