Abstract
An outlier in a dataset is an observation or point that is considerably dissimilar to, or inconsistent with, the remainder of the data. Outlier detection is important for many applications and has recently attracted much attention in the data mining research community. In this paper, we present a new method that detects outliers by discovering frequent patterns (or frequent itemsets) in the dataset. Outliers are defined as the transactions whose itemsets contain fewer frequent patterns. We define a measure called FPOF (Frequent Pattern Outlier Factor) to identify outlier transactions and propose the FindFPOF algorithm to discover them. The experimental results show that our approach outperforms existing methods in identifying interesting outliers.
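The idea of scoring transactions by the frequent patterns they contain can be illustrated with a minimal Python sketch. The abstract does not state the FPOF formula, so the sketch assumes that a transaction's score is the sum of the supports of the frequent patterns it contains, divided by the total number of frequent patterns, with lower scores indicating stronger outliers. The function names `frequent_patterns` and `fpof_scores`, the `max_len` cap, and the toy data are hypothetical, and brute-force enumeration stands in for a real frequent-itemset miner such as Apriori or FP-growth. This is not the FindFPOF algorithm itself.

```python
from itertools import combinations


def frequent_patterns(transactions, min_support, max_len=3):
    """Brute-force frequent itemset mining; returns {itemset: support}.

    Illustration only: a real miner (Apriori, FP-growth) would prune
    candidates instead of enumerating every combination.
    """
    n = len(transactions)
    items = sorted(set().union(*transactions))
    patterns = {}
    for k in range(1, max_len + 1):
        found = False
        for candidate in combinations(items, k):
            itemset = frozenset(candidate)
            # Support = fraction of transactions that contain the itemset.
            support = sum(itemset <= t for t in transactions) / n
            if support >= min_support:
                patterns[itemset] = support
                found = True
        if not found:
            break  # no frequent k-itemset implies no frequent (k+1)-itemset
    return patterns


def fpof_scores(transactions, min_support, max_len=3):
    """Assumed FPOF: summed support of contained frequent patterns,
    divided by the total number of frequent patterns."""
    patterns = frequent_patterns(transactions, min_support, max_len)
    if not patterns:
        return [0.0] * len(transactions)
    total = len(patterns)
    return [
        sum(s for p, s in patterns.items() if p <= t) / total
        for t in transactions
    ]


# Toy data (hypothetical): the last transaction shares no frequent pattern.
transactions = [
    {"a", "b", "c"},
    {"a", "b", "c"},
    {"a", "b"},
    {"a", "b", "c"},
    {"x", "y"},
]
scores = fpof_scores(transactions, min_support=0.5)
outlier_index = min(range(len(scores)), key=scores.__getitem__)
print(outlier_index, scores)
```

On this toy data the transaction `{"x", "y"}` contains none of the frequent patterns, so under the assumed definition it receives the lowest score and is ranked as the most likely outlier.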
The High Technology Research and Development Program of China (No. 2002AA413310, No. 2003AA4Z2170, No. 2003AA413021) and the IBM SUR Research Fund supported this research.
© 2004 Springer-Verlag Berlin Heidelberg
He, Z., Xu, X., Huang, J.Z., Deng, S. (2004). A Frequent Pattern Discovery Method for Outlier Detection. In: Li, Q., Wang, G., Feng, L. (eds) Advances in Web-Age Information Management. WAIM 2004. Lecture Notes in Computer Science, vol 3129. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-27772-9_80
DOI: https://doi.org/10.1007/978-3-540-27772-9_80
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-22418-1
Online ISBN: 978-3-540-27772-9