Tight Lower Bounds for Query Processing on Streaming and External Memory Data

  • Conference paper
Automata, Languages and Programming (ICALP 2005)

Part of the book series: Lecture Notes in Computer Science (LNTCS, volume 3580)

Abstract

We study a clean machine model for external memory and stream processing. We show that the number of scans of the external data induces a strict hierarchy (as long as work space is sufficiently small, e.g., polylogarithmic in the size of the input). We also show that neither joins nor sorting are feasible if the product of the number r(n) of scans of the external memory and the size s(n) of the internal memory buffers is sufficiently small, e.g., of size \(o(\sqrt[5]{n})\). We also establish tight bounds for the complexity of XPath evaluation and filtering.
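To make the two resources in the abstract concrete, the following is a minimal illustrative sketch, not the paper's machine model and not part of any lower-bound argument. It computes the keys shared by two streams while holding at most `buffer_size` keys in memory and counts how many sequential scans it performs. The function name `multipass_join_keys`, the callables `scan_left` and `scan_right`, and the convention for counting scans are all assumptions made for this example.

```python
from itertools import islice
from typing import Callable, Iterator, List, Tuple


def multipass_join_keys(
    scan_left: Callable[[], Iterator[int]],
    scan_right: Callable[[], Iterator[int]],
    buffer_size: int,
) -> Tuple[List[int], int]:
    """Return (matching keys, number of sequential scans performed).

    Hypothetical helper: each call to scan_left() or scan_right() starts
    one forward scan over that input. The returned list stands in for an
    output stream and is not counted as internal memory.
    """
    if buffer_size < 1:
        raise ValueError("buffer_size must be at least 1")

    output: List[int] = []
    scans = 1  # one forward scan over the left input, consumed in chunks
    left = scan_left()

    while True:
        # Internal memory: at most buffer_size keys from the left input.
        buffer = set(islice(left, buffer_size))
        if not buffer:
            break
        # One full scan of the right input per buffered chunk.
        scans += 1
        for key in scan_right():
            if key in buffer:
                output.append(key)

    return output, scans


if __name__ == "__main__":
    left_keys = [1, 2, 3, 4, 5, 6]
    right_keys = [6, 1, 9, 3]
    for size in (2, 6):
        keys, scans = multipass_join_keys(
            lambda: iter(left_keys), lambda: iter(right_keys), size
        )
        print(f"buffer={size} scans={scans} keys={keys}")
```

With six left keys, a buffer of two keys leads to four scans in this sketch, while a buffer of six keys needs only two. This simple strategy keeps the product of scans and buffer size roughly linear in the input length; the abstract's result says that no strategy can succeed once that product is as small as \(o(\sqrt[5]{n})\).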

Copyright information

© 2005 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Grohe, M., Koch, C., Schweikardt, N. (2005). Tight Lower Bounds for Query Processing on Streaming and External Memory Data. In: Caires, L., Italiano, G.F., Monteiro, L., Palamidessi, C., Yung, M. (eds) Automata, Languages and Programming. ICALP 2005. Lecture Notes in Computer Science, vol 3580. Springer, Berlin, Heidelberg. https://doi.org/10.1007/11523468_87

  • DOI: https://doi.org/10.1007/11523468_87

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-27580-0

  • Online ISBN: 978-3-540-31691-6

  • eBook Packages: Computer Science, Computer Science (R0)
