Abstract
Learning from streaming data has attracted considerable attention over the past decades. Existing methods show promising results when models are trained and tested on a single streaming source. However, a trained model often fails to produce reliable results when data shift occurs or when knowledge must be transferred across heterogeneous streaming domains. In this paper, we propose an architecture based on autoencoders. Specifically, we use online feature learning with a denoising autoencoder to learn more robust representations from streaming data. To tackle data shift between the source and target streams, we develop a weighted ensemble strategy that can effectively handle concept drift in streaming data. Moreover, we develop a transfer mechanism capable of transferring label information across heterogeneous domains. Finally, we combine online learning, data shift adaptation, and knowledge transfer across heterogeneous domains into a single process, which makes the proposed architecture well suited to learning and prediction in the multistream classification problem. Experiments on heterogeneous datasets validate that the proposed algorithm can quickly and accurately classify instances on a stream using only a small number of labeled examples. Compared with several related methods, our algorithm achieves state-of-the-art results in some cases.
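To make the idea of reweighting an ensemble under concept drift more concrete, the following is a minimal sketch of a generic Hedge-style (multiplicative weights) update. It is not the weighting scheme of the proposed architecture, which the abstract does not specify; the function names, the learning rate `eta`, the 0-1 loss, and the toy stream are all assumptions made purely for illustration.

```python
import math


def update_weights(weights, losses, eta=0.5):
    """Hedge-style update: scale each weight by exp(-eta * loss), then renormalise."""
    scaled = [w * math.exp(-eta * loss) for w, loss in zip(weights, losses)]
    total = sum(scaled)
    return [s / total for s in scaled]


def weighted_vote(weights, predictions):
    """Return the label that collects the largest total weight."""
    scores = {}
    for w, p in zip(weights, predictions):
        scores[p] = scores.get(p, 0.0) + w
    return max(scores, key=scores.get)


# Three hypothetical ensemble members, initially trusted equally.
weights = [1 / 3, 1 / 3, 1 / 3]

# Toy labeled stream of (member predictions, true label). Members 0 and 1 are
# correct on the first three instances; after that the concept drifts and only
# member 2 is correct.
stream = [
    ((1, 1, 0), 1),
    ((0, 0, 1), 0),
    ((1, 1, 0), 1),
    ((1, 1, 0), 0),
    ((0, 0, 1), 1),
    ((1, 1, 0), 0),
    ((0, 0, 1), 1),
    ((1, 1, 0), 0),
]

for preds, label in stream:
    guess = weighted_vote(weights, preds)  # predict before the label is revealed
    losses = [0.0 if p == label else 1.0 for p in preds]  # assumed 0-1 loss
    weights = update_weights(weights, losses)
```

In this toy run, the member that stays accurate after the drift ends up holding more than half of the total weight, so the weighted vote follows it even though the other two members agree with each other.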
Acknowledgements
The authors would like to thank the anonymous reviewers for their valuable comments and suggestions. This work was sponsored by NUPTSF (Grant No. NY219149) and supported by the National Natural Science Foundation of China (61932013, 61802185), the National Key Research and Development Program of China (2018YFB0803400), and the Natural Science Foundation of Jiangsu for Distinguished Young Scientist (BK20170039, BK20180470).
Cite this article
Li, Y., Li, H. Online transferable representation with heterogeneous sources. Appl Intell 50, 1674–1686 (2020). https://doi.org/10.1007/s10489-019-01620-3
DOI: https://doi.org/10.1007/s10489-019-01620-3