Skip to main content

The Star Schema Benchmark and Augmented Fact Table Indexing

  • Conference paper
Performance Evaluation and Benchmarking (TPCTC 2009)

Part of the book series: Lecture Notes in Computer Science ((LNPSE,volume 5895))

Included in the following conference series:

Abstract

We provide a benchmark measuring star schema queries retrieving data from a fact table with Where clause column restrictions on dimension tables. Clustering is crucial to performance with modern disk technology, since retrievals with filter factors down to 0.0005 are now performed most efficiently by sequential table search rather than by indexed access. DB2’s Multi-Dimensional Clustering (MDC) provides methods to "dice" the fact table along a number of orthogonal "dimensions", but only when these dimensions are columns in the fact table. The diced cells cluster fact rows on several of these "dimensions" at once so queries restricting several such columns can access crucially localized data, with much faster query response. Unfortunately, columns of dimension tables of a star schema are not usually represented in the fact table. In this paper, we show a simple way to adjoin physical copies of dimension columns to the fact table, dicing data to effectively cluster query retrieval, and explain how such dicing can be achieved on database products other than DB2. We provide benchmark measurements to show successful use of this methodology on three commercial database products.

This is a preview of subscription content, log in via an institution to check access.

Access this chapter

Chapter
USD 29.95
Price excludes VAT (USA)
  • Available as PDF
  • Read on any device
  • Instant download
  • Own it forever
eBook
USD 39.99
Price excludes VAT (USA)
  • Available as PDF
  • Read on any device
  • Instant download
  • Own it forever
Softcover Book
USD 54.99
Price excludes VAT (USA)
  • Compact, lightweight edition
  • Dispatched in 3 to 5 business days
  • Free shipping worldwide - see info

Tax calculation will be finalised at checkout

Purchases are for personal use only

Institutional subscriptions

Preview

Unable to display preview. Download preview PDF.

Unable to display preview. Download preview PDF.

References

  1. Bhattacharjee, B., et al.: Efficient Query Processing for Multi-Dimensional Clustered Tables in DB2

    Google Scholar 

  2. Chen, X., O’Neil, P., O’Neil, E.: Adjoined Dimension Column Clustering to Improve Data Warehouse Query Performance. Poster presentation at ICDE 2008 (2008), http://www.cs.umb.edu/~poneil/icde08_adc.pdf

  3. Cranston, L.: MDC Performance: Customer Examples and Experiences, http://www.research.ibm.com/mdc/db2.pdf

  4. IBM: Designing Multidimensional Clustering (MDC) Tables, http://publib.boulder.ibm.com/infocenter/db2luw/v9/index.jsp?topic=/com.ibm.db2.udb.admin.doc/doc/c0007238.htm

  5. IBM Research: DB2’s Multi-Dimensional Clustering, http://www.research.ibm.com/mdc/

  6. Kennedy, J.: Introduction to Multidimensional Clustering with DB2 UDB LUW. In: IBM DB2 Information Management Technical Conference, Orlando, FL (September 2005)

    Google Scholar 

  7. Kimball, R., Ross, M.: The Data Warehouse Toolkit, 2nd edn. Wiley, Chichester (2002)

    Google Scholar 

  8. Lightstone, S., Teorey, T., Nadeau, T.: Physical Database Design. Morgan Kaufman, San Francisco (2007)

    Google Scholar 

  9. O’Neil, P.: The Set Query Benchmark. In: Gray, J. (ed.) The Benchmark Handbook for Database and Transaction Processing Systems, pp. 209–245. Morgan Kauffmann, San Francisco (1993), http://www.cs.umb.edu/~poneil/SetQBM.pdf

    Google Scholar 

  10. O’Neil, P., O’Neil, E.: Database Principles, Programming, and Performance, 2nd edn. Morgan Kaufmann, San Francisco (2001)

    Google Scholar 

  11. O’Neil, P., O’Neil, E., Chen, X.: The Star Schema Benchmark, http://www.cs.umb.edu/~poneil/StarSchemaB.pdf

  12. Oracle: Partitioning in Oracle Database 10g Release 2 (May 2005), http://www.oracle.com/solutions/business_intelligence/partitioning.html

  13. Padmanabhan, S., et al.: Multi-Dimensional Clustering: A New Data Layout Scheme in DB2. In: Proceedings of the ACM SIGMOD Conference (2003)

    Google Scholar 

  14. Selinger, P., et al.: Access Path Selection in a Relational Database Management System. In: Proceedings of the ACM SIGMOD Conference, pp. 23–34 (1979)

    Google Scholar 

  15. Stonebraker, M., et al.: One Size Fits All? Part 2: Benchmarking Results, Keynote address. In: Proceedings of CIDR (2007), http://www-db.cs.wisc.edu/cidr/cidr2007/papers/cidr07p20.pdf

  16. TPC: TPC-DS, TPC Decision Support, under development, http://www.tpc.org/tpcds/default.asp

  17. Nambiar, R., Poess, M.: The Making of TPC-DS. In: VLDB Proceedings (2006)

    Google Scholar 

  18. TPC: TPC-H Version 2.4.0, http://www.tpc.org/tpch/default.asp

Download references

Author information

Authors and Affiliations

Authors

Editor information

Editors and Affiliations

Rights and permissions

Reprints and permissions

Copyright information

© 2009 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

O’Neil, P., O’Neil, E., Chen, X., Revilak, S. (2009). The Star Schema Benchmark and Augmented Fact Table Indexing. In: Nambiar, R., Poess, M. (eds) Performance Evaluation and Benchmarking. TPCTC 2009. Lecture Notes in Computer Science, vol 5895. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-10424-4_17

Download citation

  • DOI: https://doi.org/10.1007/978-3-642-10424-4_17

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-642-10423-7

  • Online ISBN: 978-3-642-10424-4

  • eBook Packages: Computer ScienceComputer Science (R0)

Publish with us

Policies and ethics