RWS (Random Walk Splitting): A Random Walk Based Discretization of Continuous Attributes

Hanaoka, Masaaki; Kobayashi, Masaki; Yamazaki, Haruaki

doi:10.1007/3-540-44533-1_18

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 1886)

Included in the following conference series:

Pacific Rim International Conference on Artificial Intelligence

Abstract

The discretization of continuous attributes in a training set is an important issue that significantly affects the performance of decision trees. This paper proposes a method that discretizes continuous attributes using a statistical test modeled on a random walk. The algorithm searches for the point that divides the training set T into two groups T₁ and T₂, with T = T₁ ∪ T₂, such that T₁ contains as many instances of the majority class as possible. In other words, it detects the splitting point that gives the maximum discrepancy between two empirical distributions: that of the majority class and that of the remaining classes. The algorithm applies this procedure recursively until a statistical criterion is satisfied. We also report the effectiveness of the algorithm relative to ChiMerge and MDLPC in an experiment on data from the UCI repository.
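
The splitting step described in the abstract can be illustrated with a minimal sketch. The paper's actual random walk test is not reproduced on this page, so the sketch substitutes a two-sample Kolmogorov-Smirnov style statistic as a stand-in: it sweeps the sorted attribute values, tracks the running gap between the empirical distribution of the majority class and that of the rest, and cuts where the gap is largest. The function names `best_split` and `discretize`, the `min_size` parameter, and the stopping constant 1.36 (the asymptotic 5% Kolmogorov-Smirnov critical value) are all assumptions of this illustration, not the authors' definitions.

```python
from collections import Counter
from math import sqrt


def best_split(values, labels):
    """Return (cut, d, n_major, n_rest) for the largest gap between the two
    empirical CDFs (majority class vs. rest), or None if no cut exists."""
    majority = Counter(labels).most_common(1)[0][0]
    pairs = sorted(zip(values, labels))
    n_major = sum(1 for _, y in pairs if y == majority)
    n_rest = len(pairs) - n_major
    if n_major == 0 or n_rest == 0:
        return None  # only one class present, nothing to separate
    seen_major = seen_rest = 0
    best = None
    for i in range(len(pairs) - 1):
        x, y = pairs[i]
        if y == majority:
            seen_major += 1
        else:
            seen_rest += 1
        nxt = pairs[i + 1][0]
        if nxt == x:
            continue  # a cut cannot separate equal values
        d = abs(seen_major / n_major - seen_rest / n_rest)
        if best is None or d > best[1]:
            best = ((x + nxt) / 2, d, n_major, n_rest)
    return best


def discretize(values, labels, critical=1.36, min_size=4):
    """Recursively collect cut points while the best split looks significant.

    Assumption: the stopping rule is a Kolmogorov-Smirnov style threshold,
    used here in place of the paper's own statistical criterion.
    """
    if len(values) < min_size:
        return []
    found = best_split(values, labels)
    if found is None:
        return []
    cut, d, n_major, n_rest = found
    if d * sqrt(n_major * n_rest / (n_major + n_rest)) < critical:
        return []  # discrepancy too small to justify a split
    left = [(v, y) for v, y in zip(values, labels) if v <= cut]
    right = [(v, y) for v, y in zip(values, labels) if v > cut]
    cuts = discretize([v for v, _ in left], [y for _, y in left],
                      critical, min_size)
    cuts.append(cut)
    cuts += discretize([v for v, _ in right], [y for _, y in right],
                       critical, min_size)
    return cuts
```

For example, `discretize(list(range(1, 13)), ["a"] * 6 + ["b"] * 6)` yields a single cut between 6 and 7, while an attribute whose class labels alternate along the sorted values yields no cut at all, because the gap between the two empirical distributions never grows large enough.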

References

Blake, C., Keogh, E., and Merz, C.J.: UCI Repository of Machine Learning Databases, http://www.ics.uci.edu/~mlearn/MLRepository.html, Irvine, CA: University of California, Department of Information and Computer Science (1998)
Catlett, J.: On Changing Continuous Attributes into Ordered Discrete Attributes, Proceedings of the European Working Session on Learning (1991) 164–178
Dougherty, J., Kohavi, R., and Sahami, M.: Supervised and Unsupervised Discretization of Continuous Features, Proceedings of the 12th International Conference on Machine Learning (1995) 194–202
Fayyad, U.M. and Irani, K.B.: Multi-Interval Discretization of Continuous-Valued Attributes for Classification Learning, Proceedings of the 13th International Joint Conference on Artificial Intelligence (1993) 1022–1027
Feller, W.: An Introduction to Probability Theory and Its Applications, Vol. 2, First Edition, John Wiley & Sons, New York (1966)
Kerber, R.: ChiMerge: Discretization of Numeric Attributes, Proceedings of the 10th National Conference on Artificial Intelligence (1992) 123–128
Quinlan, J.R.: Induction of Decision Trees, Machine Learning, Vol. 1 (1986) 81–106
Quinlan, J.R.: C4.5: Programs for Machine Learning, Morgan Kaufmann, San Mateo, CA (1993)
Rissanen, J.: Modeling by Shortest Data Description, Automatica, Vol. 14 (1978) 465–471
Russell, S.J. and Norvig, P.: Artificial Intelligence: A Modern Approach, Prentice-Hall (1995)
Schaffer, C.: Selecting a Classification Method by Cross-Validation, Machine Learning, Vol. 13, No. 1 (1993) 135–143

Author information

Authors and Affiliations

Faculty of Engineering, Yamanashi University, 4-3-11 Takeda, Kofu, Yamanashi, Japan
Masaaki Hanaoka, Masaki Kobayashi & Haruaki Yamazaki

Editor information

Editors and Affiliations

Institute of Scientific and Industrial Research, Osaka University, 8-1 Mihogaoka, Ibaraki, Osaka, 567-0047, Japan
Riichiro Mizoguchi
Computer Sciences Laboratory, Research School of Information Sciences and Engineering, Australian National University, Canberra, ACT, 0200, Australia
John Slaney

About this paper

Cite this paper

Hanaoka, M., Kobayashi, M., Yamazaki, H. (2000). RWS (Random Walk Splitting): A Random Walk Based Discretization of Continuous Attributes. In: Mizoguchi, R., Slaney, J. (eds) PRICAI 2000 Topics in Artificial Intelligence. PRICAI 2000. Lecture Notes in Computer Science, vol 1886. Springer, Berlin, Heidelberg. https://doi.org/10.1007/3-540-44533-1_18

DOI: https://doi.org/10.1007/3-540-44533-1_18
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-67925-7
Online ISBN: 978-3-540-44533-3
eBook Packages: Springer Book Archive
