Efficient Mining of High Utility Itemsets from Large Datasets

  • Conference paper
Advances in Knowledge Discovery and Data Mining (PAKDD 2008)

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 5012)

Abstract

High utility itemset mining extends frequent pattern mining to discover itemsets in a transaction database with utility values above a given threshold. Mining high utility itemsets is harder than mining frequent itemsets, however, because utility lacks the anti-monotone property that frequency has: a superset of a low utility itemset can still have high utility. The recently proposed Transaction Weighted Utility (TWU) does have the anti-monotone property, but it overestimates itemset utility and therefore leads to a larger search space. We propose an algorithm that uses TWU with pattern growth based on a compact utility pattern tree data structure. Our algorithm implements a parallel projection scheme that uses disk storage when main memory is inadequate for dealing with large datasets. Experimental evaluation shows that our algorithm is more efficient than previous algorithms and can mine larger datasets of both dense and sparse data containing long patterns.
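
The contrast between itemset utility and TWU can be illustrated with a small sketch. It is not the algorithm proposed in the paper; it only computes the two measures by brute force, using the usual definitions (itemset utility as quantity times unit profit summed over the transactions that contain the itemset, and TWU as the sum of the total utilities of those transactions). The toy transactions, the profit table, and the function names are all assumptions made for illustration.

```python
# Hypothetical toy database: each transaction maps item -> purchased quantity.
transactions = [
    {"a": 1, "b": 2},
    {"a": 2, "c": 1},
    {"a": 1, "b": 1, "c": 3},
]
# Hypothetical per-unit profit of each item.
unit_profit = {"a": 5, "b": 2, "c": 1}


def transaction_utility(t):
    """Total utility of every item in one transaction."""
    return sum(qty * unit_profit[item] for item, qty in t.items())


def utility(itemset):
    """Utility of the itemset, summed over the transactions that contain it."""
    return sum(
        sum(t[item] * unit_profit[item] for item in itemset)
        for t in transactions
        if all(item in t for item in itemset)
    )


def twu(itemset):
    """Sum of the whole-transaction utilities of transactions containing the itemset."""
    return sum(
        transaction_utility(t)
        for t in transactions
        if all(item in t for item in itemset)
    )


# Utility is not anti-monotone: the superset {a, b} has higher utility than {b}.
print(utility({"b"}), utility({"a", "b"}))  # 6 16
# TWU is anti-monotone and bounds utility from above.
print(twu({"b"}), twu({"a", "b"}))  # 19 19
```

With a threshold of 15 in this toy data, pruning on utility alone would discard {b} and miss {a, b}, whereas pruning on TWU keeps both as candidates. The price is that TWU also keeps {b}, whose real utility is only 6, which is the larger search space the abstract refers to.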

Editor information

Takashi Washio, Einoshin Suzuki, Kai Ming Ting, Akihiro Inokuchi

Copyright information

© 2008 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Erwin, A., Gopalan, R.P., Achuthan, N.R. (2008). Efficient Mining of High Utility Itemsets from Large Datasets. In: Washio, T., Suzuki, E., Ting, K.M., Inokuchi, A. (eds) Advances in Knowledge Discovery and Data Mining. PAKDD 2008. Lecture Notes in Computer Science, vol 5012. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-68125-0_50

  • DOI: https://doi.org/10.1007/978-3-540-68125-0_50

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-68124-3

  • Online ISBN: 978-3-540-68125-0

  • eBook Packages: Computer Science, Computer Science (R0)
