Annotations in Data Streams

  • Conference paper
Automata, Languages and Programming (ICALP 2009)

Part of the book series: Lecture Notes in Computer Science (LNTCS, volume 5555)

Abstract

The central goal of data stream algorithms is to process massive streams of data using sublinear storage space. Motivated by work in the database community on outsourcing database and data stream processing, we ask whether the space usage of such algorithms can be further reduced by enlisting a more powerful “helper” who can annotate the stream as it is read. We do not wish to blindly trust the helper, so we require that the algorithm be convinced of having computed a correct answer. We show upper bounds that achieve a non-trivial tradeoff between the amount of annotation used and the space required to verify it. We also prove lower bounds on such tradeoffs, often nearly matching the upper bounds, via notions related to Merlin-Arthur communication complexity. Our results cover the classic data stream problems of selection, frequency moments, and fundamental graph problems such as triangle-freeness and connectivity. Our work is also part of a growing trend (including recent studies of multi-pass streaming, read/write streams and randomly ordered streams) of asking more complexity-theoretic questions about data stream processing. It is a recognition that, in addition to practical relevance, the data stream model raises many interesting theoretical questions in its own right.

Copyright information

© 2009 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Chakrabarti, A., Cormode, G., McGregor, A. (2009). Annotations in Data Streams. In: Albers, S., Marchetti-Spaccamela, A., Matias, Y., Nikoletseas, S., Thomas, W. (eds) Automata, Languages and Programming. ICALP 2009. Lecture Notes in Computer Science, vol 5555. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-02927-1_20

  • DOI: https://doi.org/10.1007/978-3-642-02927-1_20

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-642-02926-4

  • Online ISBN: 978-3-642-02927-1

  • eBook Packages: Computer Science, Computer Science (R0)
