List Representation Applied to Sparse Datacubes for Data Warehousing and Data Mining

  • Conference paper
Intelligent Data Engineering and Automated Learning (IDEAL 2003)

Part of the book series: Lecture Notes in Computer Science (LNCS, volume 2690)

Abstract

Typically, 80% of the data in a logical OLAP datacube, the core engine of data warehouses, are zero. When the datacube is sparse, performance quickly degrades because of the heavy I/O overhead of sorting and merging intermediate results. In this work, we first introduce a list representation in main memory for storing and computing datasets. The sparse transaction dataset is compressed because the empty cells are removed. Accordingly, we propose a new algorithm for association rule mining on the platform of list representation, which needs to scan the transaction database only once to generate all possible rules. In contrast, the well-known Apriori algorithm requires repeated scans of the database, resulting in heavy I/O, particularly when the candidate sets are large. In our opinion, this new algorithm using list representation saves both storage space and database accesses.
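The abstract does not spell out the algorithm, so the following is only a minimal sketch of the general idea, not the authors' method. It assumes a vertical "tid-list" scheme (in the style of Eclat): empty cells are dropped so each transaction becomes a short list, one scan builds an item-to-transaction-id index, and all later support counts come from in-memory set intersections. All function names and the toy data are hypothetical.

```python
from itertools import combinations


def to_lists(dense_rows, item_names):
    """Drop empty (zero) cells: keep only the names of non-zero items per row."""
    return [[name for name, cell in zip(item_names, row) if cell]
            for row in dense_rows]


def build_tid_lists(transactions):
    """Single scan: map each item to the set of transaction ids containing it."""
    tid_lists = {}
    for tid, items in enumerate(transactions):
        for item in items:
            tid_lists.setdefault(item, set()).add(tid)
    return tid_lists


def frequent_itemsets(tid_lists, min_count):
    """Grow itemsets by intersecting in-memory tid sets (no further scans)."""
    frequent = {}
    level = {(item,): tids for item, tids in tid_lists.items()
             if len(tids) >= min_count}
    while level:
        frequent.update({key: len(tids) for key, tids in level.items()})
        next_level = {}
        # Join two k-itemsets that share their first k-1 items.
        for (a, ta), (b, tb) in combinations(sorted(level.items()), 2):
            if a[:-1] == b[:-1]:
                tids = ta & tb
                if len(tids) >= min_count:
                    next_level[a + (b[-1],)] = tids
        level = next_level
    return frequent


def rules(frequent, min_conf):
    """Derive rules lhs -> rhs with confidence = support(itemset) / support(lhs)."""
    out = []
    for itemset, count in frequent.items():
        for size in range(1, len(itemset)):
            for lhs in combinations(itemset, size):
                conf = count / frequent[lhs]
                if conf >= min_conf:
                    rhs = tuple(i for i in itemset if i not in lhs)
                    out.append((lhs, rhs, conf))
    return out


# Toy data (assumed for illustration): rows are transactions, columns are items.
ITEMS = ["bread", "milk", "eggs"]
DENSE = [[1, 1, 0], [1, 1, 1], [0, 1, 0], [1, 0, 0]]

transactions = to_lists(DENSE, ITEMS)
freq = frequent_itemsets(build_tid_lists(transactions), min_count=2)
found = rules(freq, min_conf=0.6)
```

The design choice worth noting is that the transaction data is read exactly once, in `build_tid_lists`; Apriori, by contrast, rescans the database for each candidate size.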

Copyright information

© 2003 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Wang, F., Marir, F., Gordon, J., Helian, N. (2003). List Representation Applied to Sparse Datacubes for Data Warehousing and Data Mining. In: Liu, J., Cheung, Ym., Yin, H. (eds) Intelligent Data Engineering and Automated Learning. IDEAL 2003. Lecture Notes in Computer Science, vol 2690. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-45080-1_121

  • DOI: https://doi.org/10.1007/978-3-540-45080-1_121

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-40550-4

  • Online ISBN: 978-3-540-45080-1

  • eBook Packages: Springer Book Archive