A New Incremental Algorithm for Induction of Multivariate Decision Trees for Large Datasets

  • Conference paper
Intelligent Data Engineering and Automated Learning – IDEAL 2008 (IDEAL 2008)

Abstract

Several algorithms for inducing decision trees have been developed to handle large datasets. However, some of them run into memory and/or runtime problems because they use the whole training sample to build the tree, while others do not take the whole training set into account. In this paper, we introduce a new algorithm for inducing decision trees from large numerical datasets, called IIMDT, which builds the tree incrementally, so the whole training set need not be kept in main memory. We present a comparison between IIMDT and ICE, another algorithm for inducing decision trees for large datasets.

Copyright information

© 2008 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Franco-Arcega, A., Carrasco-Ochoa, J.A., Sánchez-Díaz, G., Martínez-Trinidad, J.F. (2008). A New Incremental Algorithm for Induction of Multivariate Decision Trees for Large Datasets. In: Fyfe, C., Kim, D., Lee, SY., Yin, H. (eds) Intelligent Data Engineering and Automated Learning – IDEAL 2008. IDEAL 2008. Lecture Notes in Computer Science, vol 5326. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-88906-9_36

  • DOI: https://doi.org/10.1007/978-3-540-88906-9_36

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-88905-2

  • Online ISBN: 978-3-540-88906-9

  • eBook Packages: Computer Science, Computer Science (R0)
