Abstract
In this paper, we propose a novel mining task: mining frequent supersets from a database of itemsets, which is useful in bioinformatics, e-learning systems, job-shop scheduling, and so on. An itemset is a frequent superset if the number of transactions it contains (as subsets) exceeds the minimum support threshold. Intuitively, following the Apriori algorithm, level-wise discovery starts from 1-itemsets, then 2-itemsets, and so forth. However, such steps cannot use the Apriori property to reduce the search space, because even if an itemset is not frequent, its supersets may be frequent. To solve this problem, we propose three methods. The first is an Apriori-based approach, called Apriori-C. The second is an Eclat-based approach, called Eclat-C, which is a depth-first approach. The last is the proposed data complement technique (DCT), with which we use an original frequent itemset mining approach to mine frequent supersets. The experimental study compares the performance of the three proposed methods by considering the effect of the number of transactions, the average length of transactions, the number of different items, and the minimum support.
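As a rough illustration of the task and of the idea behind a data complement approach, the following Python sketch counts "superset support" by brute force and then obtains the same result by complementing every transaction and counting ordinary subset support. It is not the authors' implementation: the function names, the toy database, and the use of `>=` for the support test are assumptions made for this example, and it relies only on the set identity that a transaction T is contained in X exactly when the complement of X is contained in the complement of T.

```python
from itertools import combinations


def superset_support(candidate, transactions):
    """Number of transactions that are subsets of `candidate`."""
    return sum(1 for t in transactions if t <= candidate)


def subset_support(candidate, transactions):
    """Classical support: number of transactions that contain `candidate`."""
    return sum(1 for t in transactions if candidate <= t)


def all_itemsets(universe):
    """Enumerate every subset of `universe` (exponential; toy data only)."""
    items = sorted(universe)
    for k in range(len(items) + 1):
        for combo in combinations(items, k):
            yield frozenset(combo)


def frequent_supersets_bruteforce(transactions, universe, min_support):
    """Directly test every itemset against the superset-support definition."""
    result = {}
    for x in all_itemsets(universe):
        s = superset_support(x, transactions)
        if s >= min_support:  # assumption: ">=" convention for the threshold
            result[x] = s
    return result


def frequent_supersets_via_complement(transactions, universe, min_support):
    """Complement the data, count ordinary subset support, map back.

    T <= X holds exactly when (universe - X) <= (universe - T), so the
    classical support of Y in the complemented database equals the superset
    support of (universe - Y) in the original database.
    """
    complemented = [universe - t for t in transactions]
    result = {}
    for y in all_itemsets(universe):
        s = subset_support(y, complemented)
        if s >= min_support:
            result[universe - y] = s
    return result


# Hypothetical toy database over the items a, b, c, d.
universe = frozenset("abcd")
transactions = [frozenset("a"), frozenset("ab"), frozenset("bc"), frozenset("abc")]

direct = frequent_supersets_bruteforce(transactions, universe, min_support=3)
via_complement = frequent_supersets_via_complement(transactions, universe, min_support=3)
```

In this toy database the itemset {a, b} contains only two transactions while {a, b, c} contains all four, which shows why an infrequent itemset cannot be used to prune its supersets in this setting. A practical version would replace the exhaustive enumeration with a real frequent itemset miner run on the complemented data.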
© 2004 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Liao, ZX., Shan, MK. (2004). Algorithms for Discovery of Frequent Superset, Rather Than Frequent Subset. In: Kambayashi, Y., Mohania, M., Wöß, W. (eds) Data Warehousing and Knowledge Discovery. DaWaK 2004. Lecture Notes in Computer Science, vol 3181. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-30076-2_36
DOI: https://doi.org/10.1007/978-3-540-30076-2_36
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-22937-7
Online ISBN: 978-3-540-30076-2
eBook Packages: Springer Book Archive