Smooth Interpolating Histograms with Error Guarantees

  • Conference paper
Sharing Data, Information and Knowledge (BNCOD 2008)

Part of the book series: Lecture Notes in Computer Science (LNISA, volume 5071)

Abstract

Accurate selectivity estimates are essential for query optimization decisions. They are typically derived from various kinds of histograms, which condense value distributions into compact representations. The estimation accuracy of existing approaches typically varies across the domain, with some estimates being very accurate and others quite inaccurate. This is particularly unfortunate when performing a parametric search using these estimates, as estimation artifacts can dominate the search results. We propose the use of linear splines to construct histograms with known error guarantees across the whole continuous domain. These histograms are particularly well suited for using the estimates in parameter optimization. We show through a comprehensive performance evaluation using both synthetic and real-world data that our approach clearly outperforms existing techniques.
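
To make the idea of a linear-spline histogram with a bounded error concrete, the following is a minimal sketch and not the construction algorithm from the paper. It assumes a simple greedy strategy: approximate the cumulative frequency function by line segments and extend each segment for as long as the interpolation error stays within a chosen bound `eps`. The function names (`cumulative_points`, `fit_spline`, `estimate_leq`) are hypothetical, and the bound is checked only at the distinct data values, not across the whole continuous domain.

```python
from bisect import bisect_right
from collections import Counter
from itertools import accumulate


def cumulative_points(values):
    """Return sorted distinct values and the count of items <= each value."""
    counts = Counter(values)
    xs = sorted(counts)
    cs = list(accumulate(counts[x] for x in xs))
    return xs, cs


def line_at(x0, y0, x1, y1, x):
    """Evaluate the line through (x0, y0) and (x1, y1) at x."""
    return y0 + (y1 - y0) * (x - x0) / (x1 - x0)


def fit_spline(xs, cs, eps):
    """Greedily pick knots so interpolation errs by at most eps at every xs[k].

    Assumption: a simple O(n^2) greedy scan, chosen for clarity only.
    """
    knots = [0]
    i = 0
    while i < len(xs) - 1:
        j = i + 1  # a segment to the very next point has no interior points
        # Try to extend the segment by one point while all interior points fit.
        while j + 1 < len(xs) and all(
            abs(line_at(xs[i], cs[i], xs[j + 1], cs[j + 1], xs[k]) - cs[k]) <= eps
            for k in range(i + 1, j + 1)
        ):
            j += 1
        knots.append(j)
        i = j
    return [(xs[k], cs[k]) for k in knots]


def estimate_leq(spline, x):
    """Estimate how many values are <= x by interpolating between knots."""
    if x < spline[0][0]:
        return 0.0
    if x >= spline[-1][0]:
        return float(spline[-1][1])
    pos = bisect_right([kx for kx, _ in spline], x)
    (x0, y0), (x1, y1) = spline[pos - 1], spline[pos]
    return line_at(x0, y0, x1, y1, x)
```

A range selectivity estimate for `a < v <= b` would then be `estimate_leq(spline, b) - estimate_leq(spline, a)`, so under this sketch its error at data values is at most `2 * eps`. Because the estimate is continuous and piecewise linear, it does not jump at bucket boundaries, which is the property the abstract highlights for parameter optimization.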

Editor information

Alex Gray, Keith Jeffery, Jianhua Shao

Copyright information

© 2008 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Neumann, T., Michel, S. (2008). Smooth Interpolating Histograms with Error Guarantees. In: Gray, A., Jeffery, K., Shao, J. (eds) Sharing Data, Information and Knowledge. BNCOD 2008. Lecture Notes in Computer Science, vol 5071. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-70504-8_12

  • DOI: https://doi.org/10.1007/978-3-540-70504-8_12

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-70503-1

  • Online ISBN: 978-3-540-70504-8

  • eBook Packages: Computer Science, Computer Science (R0)
