Handling Concept Drift in Data Streams by Using Drift Detection Methods

  • Conference paper
Data Management, Analytics and Innovation

Part of the book series: Advances in Intelligent Systems and Computing (AISC, volume 839)

Abstract

The growth of information and communication technology has resulted in the generation of huge amounts of data, and the rate at which this data is generated is very high. Data that is generated continuously with varying distributions is referred to as a data stream. Examples include data generated by applications in mobile networks, sensor networks, and network traffic monitoring and management. The process that generates the data often changes over time, so the data distribution underlying a given concept (i.e. application) changes as well; this is referred to as concept drift. Handling concept drift is a challenging task: it is not possible to develop a single fixed model, because such a model becomes inconsistent under continuous change. The present work focuses on handling concept drift with different drift detection methods in the Massive Online Analysis (MOA) framework. An important feature of the study is that the size of the data stream is varied (50,000–250,000). In total, concept drift is handled using 11 drift detection methods and 2 stream generators, abrupt and gradual, within this framework.
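The abstract does not describe how any individual detector works. As a rough illustration of the general idea, the following minimal sketch applies a one-sided CUSUM test to a classifier's 0/1 error indicator and reports the first position at which the accumulated excess error crosses a threshold. The function names, the `delta` and `threshold` values, and the deterministic toy stream are all assumptions made for this example; this is not MOA's API and not the configuration used in the paper.

```python
from typing import Iterable, List, Optional


def cusum_drift_index(errors: Iterable[int],
                      delta: float = 0.05,
                      threshold: float = 20.0) -> Optional[int]:
    """Return the index where a one-sided CUSUM test first signals, else None.

    errors    : 0/1 indicators (1 = the model misclassified the instance).
    delta     : slack subtracted at each step (illustrative value).
    threshold : alarm level for the cumulative sum (illustrative value).
    """
    count = 0
    mean = 0.0    # running mean of the error indicator seen so far
    cusum = 0.0   # accumulated positive deviation from the running mean
    for i, x in enumerate(errors):
        count += 1
        mean += (x - mean) / count
        # Accumulate deviations above (mean + delta); never drop below zero.
        cusum = max(0.0, cusum + x - mean - delta)
        if cusum > threshold:
            return i
    return None


def abrupt_error_stream(n_before: int, n_after: int) -> List[int]:
    """Hypothetical toy stream: 10% errors, then an abrupt jump to 50%."""
    before = [1 if i % 10 == 0 else 0 for i in range(n_before)]
    after = [1 if i % 2 == 0 else 0 for i in range(n_after)]
    return before + after


if __name__ == "__main__":
    stream = abrupt_error_stream(1000, 1000)
    print("drift signalled at index:", cusum_drift_index(stream))
```

On this toy stream the alarm fires shortly after position 1000, where the error rate jumps, and a stream that stays at 10% errors throughout raises no alarm. A gradual drift would typically be signalled later, since the excess error accumulates more slowly.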

Author information

Corresponding author

Correspondence to Malini M. Patil.

Copyright information

© 2019 Springer Nature Singapore Pte Ltd.

About this paper

Cite this paper

Patil, M.M. (2019). Handling Concept Drift in Data Streams by Using Drift Detection Methods. In: Balas, V., Sharma, N., Chakrabarti, A. (eds) Data Management, Analytics and Innovation. Advances in Intelligent Systems and Computing, vol 839. Springer, Singapore. https://doi.org/10.1007/978-981-13-1274-8_12

  • DOI: https://doi.org/10.1007/978-981-13-1274-8_12

  • Publisher Name: Springer, Singapore

  • Print ISBN: 978-981-13-1273-1

  • Online ISBN: 978-981-13-1274-8

  • eBook Packages: Engineering, Engineering (R0)
