Abstract
The family of instance-based learning algorithms has been shown to be effective for learning classification schemes in many domains. However, these algorithms demand a high data retention rate and are sensitive to noise. We investigate an integration of instance-filtering and instance-averaging techniques to address these problems. We compare different variants of the integration as well as existing learning algorithms such as C4.5 and KNN. Our new framework achieves good data reduction while maintaining, or even improving, classification accuracy on 19 real data sets.
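As a rough illustration of the kind of integration the abstract describes, the sketch below combines an instance-filtering step (here a Wilson-style editing rule: drop instances misclassified by their k nearest neighbors) with an instance-averaging step (here collapsing each class to its mean prototype) before nearest-prototype classification. This is a minimal sketch under stated assumptions — the function names, the choice of editing rule, and the per-class mean prototypes are illustrative stand-ins, not the authors' actual algorithm.

```python
import numpy as np

def filter_instances(X, y, k=3):
    """Instance filtering (Wilson-style editing): keep an instance only if
    the majority label among its k nearest neighbors matches its own label."""
    keep = []
    for i in range(len(X)):
        d = np.linalg.norm(X - X[i], axis=1)
        d[i] = np.inf                          # exclude the instance itself
        nn = np.argsort(d)[:k]                 # indices of k nearest neighbors
        if np.bincount(y[nn]).argmax() == y[i]:
            keep.append(i)
    return X[keep], y[keep]

def average_prototypes(X, y):
    """Instance averaging, reduced to its simplest form: one mean
    prototype per class computed from the filtered instances."""
    labels = np.unique(y)
    prototypes = np.array([X[y == c].mean(axis=0) for c in labels])
    return prototypes, labels

def predict(prototypes, labels, query):
    """1-NN classification against the generated prototypes."""
    return labels[np.argmin(np.linalg.norm(prototypes - query, axis=1))]
```

Filtering first means noisy instances are removed before they can distort the averaged prototypes; this ordering is one of the design variants such a framework can compare.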
References
Aha, D. W. (1992). Tolerating Noisy, Irrelevant, and Novel Attributes in Instance-Based Learning Algorithms. International Journal of Man-Machine Studies, 36:267–287.
Aha, D. W. and Kibler, D. (1989). Noise-Tolerant Instance-Based Learning Algorithms. Proceedings of the Eleventh International Joint Conference on Artificial Intelligence, pages 794–799.
Aha, D. W., Kibler, D. and Albert, M. K. (1991). Instance-Based Learning Algorithms. Machine Learning, 6:37–66.
Bezdek, J. C., Reichherzer, T. R., Lim, G. S. and Attikiouzel, Y. (1998). Multiple-Prototype Classifier Design. IEEE Transactions on Systems, Man, and Cybernetics, 28(1):67–79.
Blake, C. L. and Merz, C. J. (1998). UCI Repository of Machine Learning Databases. Irvine, CA: University of California, Irvine, Department of Information and Computer Science. http://www.ics.uci.edu/~mlearn/MLRepository.html.
Bradshaw, G. (1987). Learning about Speech Sounds: The NEXUS project. Proceedings of the Fourth International Workshop on Machine Learning, pages 1–11.
Cameron-Jones, R. M. (1992). Minimum Description Length Instance-Based Learning. Proceedings of the Fifth Australian Joint Conference on Artificial Intelligence, pages 368–373.
Cameron-Jones, R. M. (1995). Instance Selection by Encoding Length Heuristic with Random Mutation Hill Climbing. Proceedings of the Eighth Australian Joint Conference on Artificial Intelligence, pages 293–301.
Chang, C. L. (1974). Finding Prototypes for Nearest Neighbor Classifiers. IEEE Transactions on Computers, 23(11):1179–1184.
Cost, S. and Salzberg, S. (1993). A Weighted Nearest Neighbor Algorithm for Learning with Symbolic Features. Machine Learning, 10:57–78.
Dasarathy, B. V. (1990). Nearest Neighbor (NN) Norms: NN Pattern Classification Techniques. IEEE Computer Society Press.
Dasarathy, B. V. (1994). Minimal Consistent Set (MCS) Identification for Optimal Nearest Neighbor Decision Systems Design. IEEE Transactions on Systems, Man, and Cybernetics, 24(3):511–517.
Datta, P. and Kibler, D. (1997). Learning Symbolic Prototypes. Proceedings of the Fourteenth International Conference on Machine Learning, pages 75–82.
Datta, P. and Kibler, D. (1997). Symbolic Nearest Mean Classifier. Proceedings of the Fourteenth National Conference on Artificial Intelligence, pages 82–87.
Datta, P. and Kibler, D. (1995). Learning Prototypical Concept Descriptions. Proceedings of the Twelfth International Conference on Machine Learning, pages 158–166.
Gates, G. W. (1972). The Reduced Nearest Neighbor Rule. IEEE Transactions on Information Theory, 18(3):431–433.
Gowda, K. C. and Krishna, G. (1979). The Condensed Nearest Neighbor Rule Using the Concept of Mutual Nearest Neighborhood. IEEE Transactions on Information Theory, 25(4):488–490.
Hart, P. E. (1968). The Condensed Nearest Neighbor Rule. IEEE Transactions on Information Theory, 14(3):515–516.
Kibler, D. and Aha, D. W. (1988). Comparing Instance-Averaging with Instance-Filtering Learning Algorithms. Proceedings of the Third European Working Session on Learning, pages 63–80.
Kuncheva, L. I. and Bezdek, J. C. (1998). Nearest Prototype Classification: Clustering, Genetic Algorithms, or Random Search? IEEE Transactions on Systems, Man, and Cybernetics, 28(1):160–164.
Ricci, F. and Avesani, P. (1999). Data Compression and Local Metrics for Nearest Neighbor Classification. IEEE Transactions on Pattern Analysis and Machine Intelligence, 21(4):380–384.
Ritter, G. L., Woodruff, H. B. and Lowry, S. R. (1975). An Algorithm for a Selective Nearest Neighbor Decision Rule. IEEE Transactions on Information Theory, 21(6):665–669.
Salzberg, S. (1991). A Nearest Hyperrectangle Learning Method. Machine Learning, 6:251–276.
Sebestyen, G. S. (1962). Decision-Making Processes in Pattern Recognition. New York: The Macmillan Company.
Skalak, D. B. (1994). Prototype and Feature Selection by Sampling and Random Mutation Hill Climbing Algorithms. Proceedings of the Eleventh International Conference on Machine Learning, pages 293–301.
Swonger, C. W. (1972). Sample Set Condensation for a Condensed Nearest Neighbor Decision Rule for Pattern Recognition. In Watanabe, S., editor, Frontiers of Pattern Recognition. Academic Press, New York, NY, pages 511–519.
Tomek, I. (1976). An Experiment with the Edited Nearest-Neighbor Rule. IEEE Transactions on Systems, Man, and Cybernetics, 6(6):448–452.
Tirri, H., Kontkanen, P. and Myllymäki, P. (1996). Probabilistic Instance-Based Learning. Proceedings of the Thirteenth International Conference on Machine Learning, pages 158–166.
Ullmann, J. R. (1974). Automatic Selection of Reference Data for Use in a Nearest Neighbor Method of Pattern Classification. IEEE Transactions on Information Theory, 20(4):431–433.
Wettschereck, D. (1994). A Hybrid Nearest-Neighbor and Nearest-Hyperrectangle Algorithm. Proceedings of the Seventh European Conference on Machine Learning, pages 323–335.
Wettschereck, D., Aha, D. W. and Mohri, T. (1997). A Review and Empirical Evaluation of Feature Weighting Methods for a Class of Lazy Learning Algorithms. Artificial Intelligence Review, 11:273–314.
Wettschereck, D. and Dietterich, T. G. (1995). An Experimental Comparison of the Nearest-Neighbor and Nearest-Hyperrectangle Algorithms. Machine Learning, 19:5–27.
Wilson, D. L. (1972). Asymptotic Properties of Nearest Neighbor Rules Using Edited Data. IEEE Transactions on Systems, Man, and Cybernetics, 2(3):408–421.
Wilson, D. R. and Martinez, T. R. (1997). Instance Pruning Techniques. Proceedings of the Fourteenth International Conference on Machine Learning, pages 403–411.
Zhang, J. (1992). Selecting Typical Instances in Instance-Based Learning. Proceedings of the Ninth International Conference on Machine Learning, pages 470–479.
Copyright information
© 2001 Springer Science+Business Media Dordrecht
Cite this chapter
Lam, W., Keung, CK., Ling, C.X. (2001). Learning via Prototype Generation and Filtering. In: Liu, H., Motoda, H. (eds) Instance Selection and Construction for Data Mining. The Springer International Series in Engineering and Computer Science, vol 608. Springer, Boston, MA. https://doi.org/10.1007/978-1-4757-3359-4_13
DOI: https://doi.org/10.1007/978-1-4757-3359-4_13
Publisher Name: Springer, Boston, MA
Print ISBN: 978-1-4419-4861-8
Online ISBN: 978-1-4757-3359-4
eBook Packages: Springer Book Archive