The General Procedure of Ensembles Construction in Data Stream Scenarios

Stream Data Mining: Algorithms and Their Probabilistic Properties

Part of the book series: Studies in Big Data (SBD, volume 56)

Abstract

When constructing data stream algorithms, three aspects have to be taken into consideration: accuracy, running time, and required memory. In many cases, however, the fastest algorithms are less accurate than methods that require high computational power and more time for data analysis. Therefore, to enhance the performance of algorithms that, in a data stream scenario, must have low memory requirements and short learning times, one can use an ensemble approach. Roughly speaking, the decision made by an ensemble of algorithms can be seen as a decision based on the opinions of several specialists. In real life nobody is infallible, so to improve the decision-making process people often make a final decision only after consulting several different persons. A vivid example is the diagnosis of an illness. When someone gets bad news, they often go to other doctors for a second, third, or fourth opinion, and so on, until they are sure about the diagnosis.
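
The "opinion of a few specialists" idea can be illustrated with a minimal sketch of majority voting. This is only one simple way to combine ensemble members and is not taken from the chapter itself; the function name `majority_vote`, the three threshold rules, and the feature values are all hypothetical and chosen purely for illustration.

```python
from collections import Counter
from typing import Callable, Hashable, Sequence

# A classifier is modelled here as any function mapping a feature vector to a label.
Classifier = Callable[[Sequence[float]], Hashable]


def majority_vote(members: Sequence[Classifier], x: Sequence[float]) -> Hashable:
    """Return the label predicted by the largest number of ensemble members."""
    votes = Counter(member(x) for member in members)
    # most_common(1) yields [(label, count)]; on a tie, the label seen first wins.
    return votes.most_common(1)[0][0]


# Three hypothetical "specialists", each a cheap threshold rule on one feature.
members = [
    lambda x: "ill" if x[0] > 38.0 else "healthy",  # body temperature
    lambda x: "ill" if x[1] > 100 else "healthy",   # heart rate
    lambda x: "ill" if x[2] > 11.0 else "healthy",  # white cell count
]

# Two of the three members vote "ill" for this example.
decision = majority_vote(members, [38.5, 90, 12.3])
print(decision)
```

Each member here is deliberately cheap in both time and memory, which mirrors the data stream constraints mentioned above; the combined vote can still be correct when a single member is wrong.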

Author information

Corresponding author

Correspondence to Leszek Rutkowski.

Copyright information

© 2020 Springer Nature Switzerland AG

About this chapter

Cite this chapter

Rutkowski, L., Jaworski, M., Duda, P. (2020). The General Procedure of Ensembles Construction in Data Stream Scenarios. In: Stream Data Mining: Algorithms and Their Probabilistic Properties. Studies in Big Data, vol 56. Springer, Cham. https://doi.org/10.1007/978-3-030-13962-9_12
