Improving Quality of Agglomerative Scheduling in Concurrent Processing of Frequent Itemset Queries

  • Conference paper
Intelligent Information Processing and Web Mining

Part of the book series: Advances in Soft Computing (AINSC, volume 35)

Abstract

Frequent itemset mining is often regarded as advanced querying, where a user specifies the source dataset and pattern constraints using a given constraint model. Recently, a new problem of optimizing the processing of batches of frequent itemset queries has been considered. The best technique proposed so far for this problem is Common Counting, which processes frequent itemset queries concurrently and integrates their database scans. Common Counting requires that the data structures of several queries be stored in main memory at the same time. Since memory is limited in practice, the crucial problem is scheduling the queries to Common Counting phases so that the I/O cost is optimized. According to our previous studies, the best algorithm for this task that is applicable to large batches of queries is CCAgglomerative. In this paper we present a novel query scheduling method, CCAgglomerativeNoise, built around CCAgglomerative, which increases its chances of finding an optimal solution.
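The scheduling problem in the abstract can be illustrated with a small sketch. It is not the paper's CCAgglomerative or CCAgglomerativeNoise algorithm, whose details the abstract does not give. It assumes a simplified cost model in which each query reads a set of data blocks, a phase reads every block needed by any of its queries exactly once, and each query has a fixed memory footprint. The names `reads`, `memory`, `limit`, `greedy_merge_schedule`, and `best_of_noisy_runs` are hypothetical, and the way noise is applied here (a random perturbation of the merge gain over repeated runs) is an assumption made only for illustration.

```python
import itertools
import random


def phase_cost(phase, reads):
    """I/O cost of one phase: every block needed by any query is read once."""
    blocks = set()
    for query in phase:
        blocks |= reads[query]
    return len(blocks)


def schedule_cost(schedule, reads):
    """Total I/O cost of a schedule (a list of phases)."""
    return sum(phase_cost(phase, reads) for phase in schedule)


def greedy_merge_schedule(reads, memory, limit, noise=0.0, rng=None):
    """Agglomerative-style heuristic (illustrative only).

    Starts with one phase per query and repeatedly merges the pair of
    phases with the largest (optionally perturbed) I/O saving, as long as
    the merged phase fits in `limit` units of memory.
    Assumes 0 <= noise < 1 so that perturbed gains stay positive.
    """
    rng = rng or random.Random(0)
    schedule = [frozenset([q]) for q in sorted(reads)]
    while True:
        best_gain, best_pair = 0.0, None
        for a, b in itertools.combinations(schedule, 2):
            merged = a | b
            if sum(memory[q] for q in merged) > limit:
                continue  # merged phase would not fit in memory
            gain = (phase_cost(a, reads) + phase_cost(b, reads)
                    - phase_cost(merged, reads))
            if gain <= 0:
                continue  # merging saves no I/O
            perturbed = gain * (1 + noise * rng.uniform(-1, 1))
            if perturbed > best_gain:
                best_gain, best_pair = perturbed, (a, b)
        if best_pair is None:
            return schedule
        a, b = best_pair
        schedule = [p for p in schedule if p not in (a, b)] + [a | b]


def best_of_noisy_runs(reads, memory, limit, runs=20, noise=0.3):
    """Keep the cheapest schedule among one plain run and several noisy runs."""
    best = greedy_merge_schedule(reads, memory, limit)
    for seed in range(runs):
        candidate = greedy_merge_schedule(
            reads, memory, limit, noise=noise, rng=random.Random(seed))
        if schedule_cost(candidate, reads) < schedule_cost(best, reads):
            best = candidate
    return best


# Hypothetical batch: four queries, each holding one unit of memory,
# and room for two queries per phase.
reads = {"q1": {1, 2, 3}, "q2": {2, 3, 4}, "q3": {4, 5}, "q4": {5, 6}}
memory = {"q1": 1, "q2": 1, "q3": 1, "q4": 1}

plain = greedy_merge_schedule(reads, memory, limit=2)
noisy = best_of_noisy_runs(reads, memory, limit=2)
print(schedule_cost(plain, reads), schedule_cost(noisy, reads))
```

Because the plain run is always included among the candidates, the noisy variant in this sketch can never return a schedule that is more expensive than the plain greedy one; it can only match or improve on it.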

Copyright information

© 2006 Springer

About this paper

Cite this paper

Boinski, P., Jozwiak, K., Wojciechowski, M., Zakrzewicz, M. (2006). Improving Quality of Agglomerative Scheduling in Concurrent Processing of Frequent Itemset Queries. In: Kłopotek, M.A., Wierzchoń, S.T., Trojanowski, K. (eds) Intelligent Information Processing and Web Mining. Advances in Soft Computing, vol 35. Springer, Berlin, Heidelberg. https://doi.org/10.1007/3-540-33521-8_23

  • DOI: https://doi.org/10.1007/3-540-33521-8_23

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-33520-7

  • Online ISBN: 978-3-540-33521-4

  • eBook Packages: Engineering, Engineering (R0)
