Abstract
In association rule mining, deterministic approaches usually handle numerical attributes by discretizing them into suitable intervals. However, the type and parameters of the discretization can notably affect the quality of the generated rules. This work presents a method based on a deterministic exploration of the interval search space that generates intervals dynamically instead of relying on a prior discretization. The algorithm also employs auxiliary data structures and certain optimizations to reduce the search and improve the quality of the extracted rules. Experiments were performed to compare it with the well-known deterministic Apriori algorithm. The algorithm was also used to extract association rules from a dataset with information about Sub-Saharan African countries, obtaining a variety of good-quality rules.
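To make the role of interval bounds concrete, the following minimal Python sketch evaluates a quantitative rule whose antecedent and consequent are closed intervals over numeric attributes. It is not the algorithm presented in this work: the names `covers` and `rule_metrics`, the attribute names `x` and `y`, and the toy rows are all assumptions made for illustration. It only shows how the usual support and confidence measures change when an interval boundary is moved, which is the effect that a fixed discretization locks in ahead of time.

```python
from typing import Dict, List, Tuple

Interval = Tuple[float, float]      # closed interval [low, high]
Condition = Dict[str, Interval]     # attribute name -> interval
Row = Dict[str, float]


def covers(row: Row, cond: Condition) -> bool:
    """Return True if every attribute in cond falls inside its interval."""
    return all(low <= row[attr] <= high for attr, (low, high) in cond.items())


def rule_metrics(rows: List[Row], antecedent: Condition,
                 consequent: Condition) -> Tuple[float, float]:
    """Support and confidence of the rule antecedent -> consequent."""
    n_antecedent = sum(covers(r, antecedent) for r in rows)
    n_both = sum(covers(r, antecedent) and covers(r, consequent) for r in rows)
    support = n_both / len(rows)
    confidence = n_both / n_antecedent if n_antecedent else 0.0
    return support, confidence


# Hypothetical toy data, not taken from the paper's experiments.
rows = [
    {"x": 1.0, "y": 10.0},
    {"x": 2.0, "y": 12.0},
    {"x": 3.0, "y": 30.0},
    {"x": 4.0, "y": 11.0},
]

# A wide interval on x, as a coarse fixed discretization might produce.
fixed = rule_metrics(rows, {"x": (0.0, 5.0)}, {"y": (10.0, 12.0)})

# The same rule with the x interval narrowed to bounds taken from the data.
narrowed = rule_metrics(rows, {"x": (1.0, 2.0)}, {"y": (10.0, 12.0)})

print(fixed, narrowed)
```

In this toy case, narrowing the interval on `x` lowers support but raises confidence, so the quality of the rule depends directly on where the bounds are placed.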
This work was partially funded by the Spanish Ministry of Science and Innovation, the Spanish Government Plan E and the European Union through ERDF (TIN2009-14057-C03-03).
Copyright information
© 2011 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Domínguez-Olmedo, J.L., Mata, J., Pachón, V., Maña, M.J. (2011). A Deterministic Approach to Association Rule Mining without Attribute Discretization. In: Snasel, V., Platos, J., El-Qawasmeh, E. (eds) Digital Information Processing and Communications. ICDIPC 2011. Communications in Computer and Information Science, vol 188. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-22389-1_13
DOI: https://doi.org/10.1007/978-3-642-22389-1_13
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-22388-4
Online ISBN: 978-3-642-22389-1
eBook Packages: Computer Science, Computer Science (R0)