Cluster-Based Outlier Detection Using Unsupervised Extreme Learning Machines

Wang, Xite; Shen, Derong; Bai, Mei; Nie, Tiezheng; Kou, Yue; Yu, Ge

doi:10.1007/978-3-319-28397-5_11

Part of the book series: Proceedings in Adaptation, Learning and Optimization (PALO, volume 6)

Abstract

Outlier detection is an important data mining task whose goal is to find abnormal or atypical objects in a given data set. Outlier detection techniques have many applications, such as credit card fraud detection and environmental monitoring. In this paper, we propose a new definition of outlier, called the cluster-based outlier. Compared with existing definitions, the cluster-based outlier is better suited to complicated data sets that consist of many clusters with different densities. To detect cluster-based outliers, we first split the given data set into a number of clusters using unsupervised extreme learning machines. We then design a pruning technique to efficiently compute the outliers in each cluster. Finally, the effectiveness and efficiency of the proposed approaches are verified through extensive simulation experiments.
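
The two-stage idea in the abstract (cluster first, then look for outliers inside each cluster) can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the abstract gives neither the formal definition of a cluster-based outlier nor the pruning rule, so the sketch swaps in plain k-means (with farthest-first initialisation) for the unsupervised extreme learning machine step, scores each point by the distance to its k-th nearest neighbour within its own cluster, and uses brute force with no pruning. The names `kmeans`, `cluster_outliers`, `k` and `top_n` are hypothetical.

```python
import math

def kmeans(points, n_clusters, iters=100):
    """Lloyd's k-means with deterministic farthest-first initialisation.
    Stand-in for the US-ELM clustering step; not the paper's method."""
    centers = [points[0]]
    while len(centers) < n_clusters:
        # Next centre: the point farthest from all centres chosen so far.
        centers.append(max(points, key=lambda p: min(math.dist(p, c) for c in centers)))
    labels = [0] * len(points)
    for _ in range(iters):
        # Assign every point to its nearest centre.
        labels = [min(range(n_clusters), key=lambda j: math.dist(p, centers[j]))
                  for p in points]
        new_centers = []
        for j in range(n_clusters):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:
                new_centers.append(tuple(sum(c) / len(members) for c in zip(*members)))
            else:
                new_centers.append(centers[j])  # keep the centre of an empty cluster
        if new_centers == centers:
            break
        centers = new_centers
    return labels

def cluster_outliers(points, n_clusters, k, top_n):
    """Return sorted indices of the top_n points per cluster, ranked by the
    distance to their k-th nearest neighbour inside the same cluster.
    Brute force: no pruning is attempted here."""
    labels = kmeans(points, n_clusters)
    flagged = []
    for j in range(n_clusters):
        idx = [i for i, lab in enumerate(labels) if lab == j]
        scores = []
        for i in idx:
            d = sorted(math.dist(points[i], points[m]) for m in idx if m != i)
            # A cluster with fewer than k other points gets an infinite score.
            scores.append((d[k - 1] if len(d) >= k else math.inf, i))
        scores.sort(reverse=True)
        flagged.extend(i for _, i in scores[:top_n])
    return sorted(flagged)

# Toy data: a dense 5x5 grid (spacing 1) and a sparse 5x5 grid (spacing 10),
# each with one nearby point that does not fit its cluster.
dense = [(float(x), float(y)) for x in range(5) for y in range(5)]
sparse = [(1000.0 + 10 * x, 1000.0 + 10 * y) for x in range(5) for y in range(5)]
points = dense + [(12.0, 2.0)] + sparse + [(1100.0, 1020.0)]

outliers = cluster_outliers(points, n_clusters=2, k=3, top_n=1)
print([points[i] for i in outliers])
```

In this toy data the point (12, 2) sits about 8 units from the dense grid, which is closer than the ordinary spacing of 10 inside the sparse grid. A single global distance threshold would therefore rank every sparse-grid point above it, whereas ranking within each cluster flags one point per cluster. That is the kind of mixed-density situation the abstract refers to, though the paper's own definition may differ from this stand-in score.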

Acknowledgments

This work is supported by the National Basic Research 973 Program of China under Grant No. 2012CB316201 and the National Natural Science Foundation of China under Grant Nos. 61033007 and 61472070.

Author information

Authors and Affiliations

College of Information Science & Engineering, Northeastern University, Shenyang, 110819, China
Xite Wang, Derong Shen, Mei Bai, Tiezheng Nie, Yue Kou & Ge Yu

Corresponding author

Correspondence to Xite Wang.

Editor information

Editors and Affiliations

Institute of Information and Control, Hangzhou Dianzi University, Zhejiang, China
Jiuwen Cao
Nanyang Technological University, Singapore, Singapore
Kezhi Mao
ECE, University of Windsor, Windsor, Ontario, Canada
Jonathan Wu
Department of Mechanical and Industrial Engineering, University of Iowa, Iowa City, Iowa, USA
Amaury Lendasse

About this paper

Cite this paper

Wang, X., Shen, D., Bai, M., Nie, T., Kou, Y., Yu, G. (2016). Cluster-Based Outlier Detection Using Unsupervised Extreme Learning Machines. In: Cao, J., Mao, K., Wu, J., Lendasse, A. (eds) Proceedings of ELM-2015 Volume 1. Proceedings in Adaptation, Learning and Optimization, vol 6. Springer, Cham. https://doi.org/10.1007/978-3-319-28397-5_11

DOI: https://doi.org/10.1007/978-3-319-28397-5_11
Published: 01 January 2016
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-28396-8
Online ISBN: 978-3-319-28397-5
eBook Packages: Engineering, Engineering (R0)
