Abstract
Finding suitable datasets for testing data mining algorithms and techniques, and association rule mining for Market Basket Analysis in particular, is a major challenge. Many dataset generators have been implemented to address this problem. ARtool is one such tool: it generates synthetic datasets and runs association rule mining for Market Basket Analysis. However, there is a notable lack of datasets that include transaction timestamps, which are needed to analyze market basket data with temporal aspects taken into account. In this paper, we present TARtool, a data mining and dataset generation tool based on ARtool. TARtool generates timestamped datasets for both retail and e-commerce environments, taking into account general customer buying habits in such environments. The generator can produce datasets in different formats, which eases mining them with other data mining tools. An advanced GUI is also provided. Experimental results show that our tool outperforms other tools in efficiency, usability, functionality, and quality of the generated data.
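To make the idea of a timestamped market basket dataset concrete, the following minimal Python sketch generates transactions whose timestamps follow an hour-of-day profile that differs between a retail and an e-commerce setting. It is an illustration only and not TARtool's implementation, which the abstract does not describe. The function name `generate_transactions`, the `HOUR_WEIGHTS` profiles, the basket size limit, and the item names are all hypothetical assumptions.

```python
import random
from datetime import datetime, timedelta

# Hypothetical hour-of-day weights (assumed, not taken from the paper):
# a retail store sells only during opening hours (08:00-19:59 here),
# while an online shop receives orders around the clock.
HOUR_WEIGHTS = {
    "retail": [0] * 8 + [1, 2, 3, 3, 4, 3, 3, 4, 5, 5, 4, 2] + [0] * 4,
    "ecommerce": [1, 1, 1, 1, 1, 1, 2, 2, 3, 3, 3, 3,
                  4, 3, 3, 3, 4, 4, 5, 6, 6, 5, 3, 2],
}


def generate_transactions(n, items, environment, start_day, days, seed=0):
    """Return n (tid, timestamp, basket) tuples ordered by timestamp."""
    rng = random.Random(seed)  # seeded so that runs are reproducible
    weights = HOUR_WEIGHTS[environment]
    drawn = []
    for _ in range(n):
        day = rng.randrange(days)
        # Pick the hour according to the environment's profile.
        hour = rng.choices(range(24), weights=weights)[0]
        second = rng.randrange(3600)
        ts = start_day + timedelta(days=day, hours=hour, seconds=second)
        # Assumed basket model: 1 to 4 distinct items chosen uniformly.
        size = rng.randint(1, min(4, len(items)))
        basket = sorted(rng.sample(items, size))
        drawn.append((ts, basket))
    drawn.sort(key=lambda pair: pair[0])
    # Assign transaction IDs in chronological order, starting at 1.
    return [(tid, ts, basket) for tid, (ts, basket) in enumerate(drawn, 1)]


items = ["bread", "milk", "eggs", "beer", "diapers"]
for tid, ts, basket in generate_transactions(
        5, items, "retail", datetime(2008, 1, 1), days=7):
    print(tid, ts.isoformat(sep=" "), ",".join(basket))
```

Each output line holds a transaction ID, a timestamp, and a comma-separated basket. A realistic generator would replace the uniform basket model with one that reflects customer buying habits.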
Copyright information
© 2008 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Omari, A., Langer, R., Conrad, S. (2008). TARtool: A Temporal Dataset Generator for Market Basket Analysis. In: Tang, C., Ling, C.X., Zhou, X., Cercone, N.J., Li, X. (eds) Advanced Data Mining and Applications. ADMA 2008. Lecture Notes in Computer Science, vol 5139. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-88192-6_37
DOI: https://doi.org/10.1007/978-3-540-88192-6_37
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-88191-9
Online ISBN: 978-3-540-88192-6
eBook Packages: Computer Science (R0)