Abstract
Recent e-commerce and location-based services provide personalized recommendations using machine-learning models that take purchase and visiting histories into account. Because these models assume that training and test data follow the same distribution, they cannot keep up with concept drift, i.e., changes in behavioral patterns over time. To keep recommendations accurate, it is important to detect concept drift. Detection generally requires complete data (i.e., data without missing values). Real-world datasets, however, often contain incomplete data, and existing concept drift detection techniques do not handle it. To address this issue, we investigate how a deep learning technique (a denoising autoencoder), which imputes missing values, contributes to detecting concept drift in incomplete data. We conduct experiments on synthetic and real datasets to evaluate the robustness of this technique, and our results show its advantages.
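The impute-then-detect pipeline described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: per-feature mean imputation stands in for the denoising autoencoder, and a simple two-window z statistic stands in for the drift detector. The function names, the toy data generator `make_window`, and the threshold of 3.0 are all assumptions made for illustration.

```python
import math
import statistics

def fit_feature_means(reference):
    """Per-feature mean of the observed (non-None) reference values."""
    means = []
    for j in range(len(reference[0])):
        observed = [row[j] for row in reference if row[j] is not None]
        means.append(statistics.mean(observed))
    return means

def impute(window, means):
    """Replace each None with the fitted mean of its feature.
    Stand-in for the denoising autoencoder (assumption)."""
    return [[means[j] if v is None else v for j, v in enumerate(row)]
            for row in window]

def drift_score(reference, current):
    """Largest two-sample z statistic over features; both windows must be
    complete and every feature must have non-zero variance."""
    score = 0.0
    for j in range(len(reference[0])):
        ref = [row[j] for row in reference]
        cur = [row[j] for row in current]
        se = math.sqrt(statistics.variance(ref) / len(ref)
                       + statistics.variance(cur) / len(cur))
        score = max(score, abs(statistics.mean(cur) - statistics.mean(ref)) / se)
    return score

def detect_drift(reference, current, threshold=3.0):
    """Impute both windows with reference statistics, then compare them.
    The threshold is a hypothetical choice, not a value from the paper."""
    means = fit_feature_means(reference)
    return drift_score(impute(reference, means), impute(current, means)) > threshold

def make_window(n, shift, missing_every):
    """Deterministic toy data: feature 0 is shifted by `shift` and has a
    missing value in every `missing_every`-th row; feature 1 is complete."""
    rows = []
    for i in range(n):
        row = [float(i % 10) + shift, float(i % 7)]
        if i % missing_every == 0:
            row[0] = None
        rows.append(row)
    return rows

reference = make_window(100, 0.0, 4)  # training-time window
same = make_window(100, 0.0, 5)       # same distribution, different gaps
drifted = make_window(100, 3.0, 5)    # feature 0 shifted by 3

print(detect_drift(reference, same))
print(detect_drift(reference, drifted))
```

Mean imputation pulls the imputed window toward the reference statistics, so it tends to mask drift as the missing rate grows. That limitation is the kind of problem a learned imputer is meant to reduce.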
Acknowledgements
This research is supported by JST CREST Grant Number JPMJCR21F2.
Copyright information
© 2022 ICST Institute for Computer Sciences, Social Informatics and Telecommunications Engineering
About this paper
Cite this paper
Murao, J., Yonekawa, K., Kurokawa, M., Amagata, D., Maekawa, T., Hara, T. (2022). Concept Drift Detection with Denoising Autoencoder in Incomplete Data. In: Hara, T., Yamaguchi, H. (eds) Mobile and Ubiquitous Systems: Computing, Networking and Services. MobiQuitous 2021. Lecture Notes of the Institute for Computer Sciences, Social Informatics and Telecommunications Engineering, vol 419. Springer, Cham. https://doi.org/10.1007/978-3-030-94822-1_35
DOI: https://doi.org/10.1007/978-3-030-94822-1_35
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-94821-4
Online ISBN: 978-3-030-94822-1
eBook Packages: Computer Science, Computer Science (R0)