Abstract
The growth of information and communication technology in the present era has led to the generation of huge amounts of data, and this data is distributed at a very high rate. Data generated continuously with varying distributions is referred to as a data stream. Examples include data generated by applications in mobile networks, sensor networks, and network traffic monitoring and management. The data-generating process of an application often changes over time, so the data distribution underlying a concept changes as well; this is referred to as concept drift. Handling concept drift is a challenging task: a single fixed model cannot be relied on, because continuous change makes it inconsistent with the data. The present work focuses on handling concept drift with different drift detection methods in the Massive Online Analysis (MOA) framework. An important feature of the study is the varying size of the data stream (50,000–250,000). In total, concept drift is handled using 11 drift detection methods and 2 stream generators, abrupt and gradual, within this framework.
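To make the idea of a drift detector concrete, the following is a minimal sketch of one classical change-detection test, the Page-Hinkley test, applied to a stream of 0/1 prediction errors. It is an illustration only: the abstract does not name the 11 detectors used, the study itself runs in MOA rather than Python, and the class name, function names, and the values of delta and threshold are assumptions chosen for this example.

```python
import random


class PageHinkley:
    """Page-Hinkley test for an upward shift in the mean of a stream.

    delta (tolerated fluctuation) and threshold (alarm level) are
    illustrative values, not settings taken from the chapter.
    """

    def __init__(self, delta=0.005, threshold=50.0):
        self.delta = delta
        self.threshold = threshold
        self.n = 0          # number of values seen so far
        self.mean = 0.0     # running mean of the values
        self.cum = 0.0      # cumulative deviation from the running mean
        self.cum_min = 0.0  # smallest cumulative deviation seen so far

    def update(self, x):
        """Consume one value; return True if drift is signalled."""
        self.n += 1
        self.mean += (x - self.mean) / self.n
        self.cum += x - self.mean - self.delta
        self.cum_min = min(self.cum_min, self.cum)
        return self.cum - self.cum_min > self.threshold


def abrupt_error_stream(n, drift_at, p_before, p_after, seed=0):
    """Hypothetical generator: 0/1 errors whose rate jumps at drift_at."""
    rng = random.Random(seed)
    for t in range(n):
        p = p_before if t < drift_at else p_after
        yield 1.0 if rng.random() < p else 0.0


def first_detection(stream, detector):
    """Return the index of the first drift signal, or None if there is none."""
    for t, x in enumerate(stream):
        if detector.update(x):
            return t
    return None


if __name__ == "__main__":
    stream = abrupt_error_stream(2000, drift_at=1000, p_before=0.1, p_after=0.5)
    print("first drift signal at index:", first_detection(stream, PageHinkley()))
```

The detector keeps only four numbers of state, which is what makes tests of this kind suitable for streams of the sizes considered here. A gradual drift could be simulated in the same way by interpolating the error rate between p_before and p_after over a window instead of switching it at a single index.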
Copyright information
© 2019 Springer Nature Singapore Pte Ltd.
About this paper
Cite this paper
Patil, M.M. (2019). Handling Concept Drift in Data Streams by Using Drift Detection Methods. In: Balas, V., Sharma, N., Chakrabarti, A. (eds) Data Management, Analytics and Innovation. Advances in Intelligent Systems and Computing, vol 839. Springer, Singapore. https://doi.org/10.1007/978-981-13-1274-8_12
DOI: https://doi.org/10.1007/978-981-13-1274-8_12
Publisher Name: Springer, Singapore
Print ISBN: 978-981-13-1273-1
Online ISBN: 978-981-13-1274-8
eBook Packages: Engineering, Engineering (R0)