Learning Higher Accuracy Decision Trees from Concept Drifting Data Streams

  • Conference paper
New Frontiers in Applied Artificial Intelligence (IEA/AIE 2008)

Part of the book series: Lecture Notes in Computer Science ((LNAI,volume 5027))

Abstract

In this paper, we propose combining the naive-Bayes approach with CVFDT, one of the major algorithms for inducing a high-accuracy decision tree from time-changing data streams. The proposed improvement, called CVFDTNBC, induces a decision tree as CVFDT does but places naive-Bayes classifiers in the leaf nodes of the induced tree. An experiment on artificially generated time-changing data streams shows that CVFDTNBC can induce a more accurate decision tree than CVFDT.
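The abstract does not give implementation details, so the following is only a minimal sketch of the leaf-level idea it describes: a leaf that accumulates counts from the examples routed to it and classifies with naive Bayes instead of returning the majority class. It assumes categorical attributes and Laplace smoothing, and the names `NaiveBayesLeaf`, `update`, `predict`, and `majority_class` are hypothetical. Tree growth and the concept-drift handling of CVFDT are not shown.

```python
import math
from collections import defaultdict


class NaiveBayesLeaf:
    """Hypothetical leaf node that predicts with naive Bayes over its own counts."""

    def __init__(self, n_values_per_attr):
        # n_values_per_attr[i] = number of distinct values attribute i can take
        # (assumed known in advance; used for Laplace smoothing).
        self.n_values = n_values_per_attr
        self.class_counts = defaultdict(int)
        # counts[(attr_index, value, label)] -> number of examples seen
        self.counts = defaultdict(int)
        self.n_seen = 0

    def update(self, x, y):
        """Incrementally absorb one labeled example that reached this leaf."""
        self.n_seen += 1
        self.class_counts[y] += 1
        for i, v in enumerate(x):
            self.counts[(i, v, y)] += 1

    def majority_class(self):
        """Prediction a plain decision-tree leaf would make."""
        return max(self.class_counts, key=self.class_counts.get)

    def predict(self, x):
        """Naive-Bayes prediction using log-probabilities with Laplace smoothing."""
        n_classes = len(self.class_counts)
        best_label, best_score = None, -math.inf
        for label, c in self.class_counts.items():
            score = math.log((c + 1) / (self.n_seen + n_classes))
            for i, v in enumerate(x):
                seen = self.counts.get((i, v, label), 0)
                score += math.log((seen + 1) / (c + self.n_values[i]))
            if score > best_score:
                best_label, best_score = label, score
        return best_label


# Usage: one binary attribute; the majority class is "neg",
# but the attribute value still separates the two classes.
leaf = NaiveBayesLeaf(n_values_per_attr=[2])
for x, y in [((0,), "neg")] * 3 + [((1,), "pos")] * 2:
    leaf.update(x, y)

print(leaf.majority_class())  # majority-class answer, independent of x
print(leaf.predict((1,)))     # naive-Bayes answer for attribute value 1
```

In this toy stream a majority-class leaf always answers "neg", while the naive-Bayes leaf can still use the attribute value to tell the classes apart, which is the kind of gain the abstract attributes to placing naive-Bayes classifiers in the leaves.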

Editor information

Ngoc Thanh Nguyen, Leszek Borzemski, Adam Grzech, Moonis Ali

Copyright information

© 2008 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Nishimura, S., Terabe, M., Hashimoto, K., Mihara, K. (2008). Learning Higher Accuracy Decision Trees from Concept Drifting Data Streams. In: Nguyen, N.T., Borzemski, L., Grzech, A., Ali, M. (eds) New Frontiers in Applied Artificial Intelligence. IEA/AIE 2008. Lecture Notes in Computer Science, vol 5027. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-69052-8_19

  • DOI: https://doi.org/10.1007/978-3-540-69052-8_19

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-69045-0

  • Online ISBN: 978-3-540-69052-8

  • eBook Packages: Computer Science, Computer Science (R0)
