Abstract
One approach to induction is to develop a decision tree from a set of examples. When used with noisy rather than deterministic data, the method involves three main stages: creating a complete tree able to classify all the examples, pruning this tree to give statistical reliability, and processing the pruned tree to improve understandability. This paper is concerned with the first stage, tree creation, which relies on a measure of the "goodness of split," that is, how well the attributes discriminate between classes. Problems encountered at this stage include missing data and multi-valued attributes. The paper considers a number of different measures and experimentally examines their behavior in four domains. The results show that the choice of measure affects the size of a tree but not its accuracy, which remains the same even when attributes are selected randomly.
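As one concrete illustration of a "goodness of split" measure (a sketch for orientation, not code from the paper), the information-gain criterion used by ID3-style tree builders can be computed as follows; the function names and toy data are assumptions of this sketch:

```python
import math
from collections import Counter

def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    n = len(labels)
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())

def information_gain(examples, labels, attribute):
    """Reduction in class entropy achieved by splitting on one attribute.

    examples: list of dicts mapping attribute name -> value
    labels:   parallel list of class labels
    """
    n = len(labels)
    before = entropy(labels)
    # Partition the class labels by the attribute's value.
    partitions = {}
    for ex, lab in zip(examples, labels):
        partitions.setdefault(ex[attribute], []).append(lab)
    # Weighted average entropy of the partitions after the split.
    after = sum(len(part) / n * entropy(part) for part in partitions.values())
    return before - after
```

A tree builder would evaluate this quantity for every candidate attribute at a node and split on the one scoring highest; other selection measures studied in this line of work (e.g., the gain ratio or chi-square statistic) plug into the same loop in place of `information_gain`.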
Mingers, J. An empirical comparison of selection measures for decision-tree induction. Mach Learn 3, 319–342 (1989). https://doi.org/10.1007/BF00116837