Abstract

We consider the problem of identifying periodic trends in data streams. We say a signal \({\mathbf a} \in \mathbb{R}^n\) is \(p\)-periodic if \(a_i = a_{i+p}\) for all \(i \in [n-p]\). Recently, Ergün et al. [4] presented a one-pass, \(O(\mathrm{polylog}\, n)\)-space algorithm for identifying the smallest period of a signal. Their algorithm required \({\mathbf a}\) to be presented in the time-series model, i.e., \(a_i\) is the \(i\)th element in the stream. We present a more general linear sketch algorithm that has the advantages of being applicable to a) the turnstile stream model, where coordinates can be incremented/decremented in an arbitrary fashion, and b) the parallel or distributed setting, where the signal is distributed over multiple locations/machines. We also present sketches for \((1+\epsilon)\)-approximating the \(\ell_2\) distance between \({\mathbf a}\) and the nearest \(p\)-periodic signal for a given \(p\). Our algorithm uses \(O(\epsilon^{-2}\, \mathrm{polylog}\, n)\) space, comparing favorably to an earlier time-series result that used \(O(\epsilon^{-5.5} \sqrt{p}\, \mathrm{polylog}\, n)\) space for estimating the Hamming distance to the nearest \(p\)-periodic signal. Our last periodicity result is an algorithm for estimating the periodicity of a sequence in the presence of noise. We conclude with a small-space algorithm for identifying when two signals are exactly (or nearly) cyclic shifts of one another. Our algorithms are based on bilinear sketches [10] and on combining Fourier transforms with stream processing techniques such as \(\ell_p\) sampling and sketching [13, 11].
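The two quantities defined in the abstract can be illustrated with a small brute-force sketch. This is not the paper's sketching algorithm: it stores the whole signal and uses linear space, and it is meant only to make the definitions concrete. The function names are hypothetical, indices are 0-based (so the condition \(a_i = a_{i+p}\) is checked for \(i = 0, \dots, n-p-1\)), and the distance computation relies on the standard fact that the nearest \(p\)-periodic signal in \(\ell_2\) replaces each entry by the mean of its residue class modulo \(p\).

```python
import math


def is_p_periodic(a, p):
    """Return True if a[i] == a[i + p] for every valid index i (0-based)."""
    return all(a[i] == a[i + p] for i in range(len(a) - p))


def smallest_period(a):
    """Smallest p in 1..n for which a is p-periodic.

    p = n always qualifies (the condition is vacuous), so for a
    non-empty signal the search always terminates.
    """
    n = len(a)
    return next(p for p in range(1, n + 1) if is_p_periodic(a, p))


def l2_distance_to_p_periodic(a, p):
    """l2 distance from a to the nearest p-periodic signal.

    Assumes 1 <= p <= len(a). Each residue class a[r], a[r + p], ...
    is replaced by its mean, which minimises the squared error
    within that class.
    """
    total = 0.0
    for r in range(p):
        residue_class = a[r::p]
        mean = sum(residue_class) / len(residue_class)
        total += sum((x - mean) ** 2 for x in residue_class)
    return math.sqrt(total)


if __name__ == "__main__":
    exact = [1, 2, 3, 1, 2, 3, 1, 2]
    noisy = [1, 2, 3, 1, 2, 5]
    print(smallest_period(exact))
    print(l2_distance_to_p_periodic(exact, 3))
    print(l2_distance_to_p_periodic(noisy, 3))
```

Because the per-class mean is a linear function of the signal, the residual whose norm is computed above is also a linear function of \({\mathbf a}\), which is what makes an \(\ell_2\) quantity of this kind a natural target for linear sketching.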

References

  1. Alon, N., Matias, Y., Szegedy, M.: The space complexity of approximating the frequency moments. J. Comput. Syst. Sci. 58(1), 137–147 (1999)

  2. Cormode, G., Muthukrishnan, S.: An improved data stream summary: The count-min sketch and its applications. J. Algorithms 55, 58–75 (2005)

  3. Czumaj, A., Gąsieniec, L.: On the complexity of determining the period of a string. In: Giancarlo, R., Sankoff, D. (eds.) CPM 2000. LNCS, vol. 1848, pp. 412–422. Springer, Heidelberg (2000)

  4. Ergün, F., Jowhari, H., Saglam, M.: Periodicity in streams. In: Serna, M., Shaltiel, R., Jansen, K., Rolim, J. (eds.) APPROX 2010. LNCS, vol. 6302, pp. 545–559. Springer, Heidelberg (2010)

  5. Ergün, F., Muthukrishnan, S., Sahinalp, S.C.: Periodicity testing with sublinear samples and space. ACM Transactions on Algorithms 6(2) (2010)

  6. Gilbert, A.C., Guha, S., Indyk, P., Muthukrishnan, S., Strauss, M.: Near-optimal sparse Fourier representations via sampling. In: STOC, pp. 152–161 (2002)

  7. Hardy, G.H., Wright, E.M.: An Introduction to The Theory of Numbers (Fourth Edition). Oxford University Press, Oxford (1960)

  8. Indyk, P.: Stable distributions, pseudorandom generators, embeddings, and data stream computation. J. ACM 53(3), 307–323 (2006)

  9. Indyk, P., Koudas, N., Muthukrishnan, S.: Identifying representative trends in massive time series data sets using sketches. In: VLDB, pp. 363–372 (2000)

  10. Indyk, P., McGregor, A.: Declaring independence via the sketching of sketches. In: SODA, pp. 737–745 (2008)

  11. Kane, D.M., Nelson, J., Woodruff, D.P.: On the exact space complexity of sketching and streaming small norms. In: SODA, pp. 1161–1178 (2010)

  12. Kane, D.M., Nelson, J., Woodruff, D.P.: An optimal algorithm for the distinct elements problem. In: PODS, pp. 41–52 (2010)

  13. Monemizadeh, M., Woodruff, D.P.: 1-pass relative-error \(\text{L}_p\)-sampling with applications. In: SODA (2010)

  14. Muthukrishnan, S.: Data streams: Algorithms and applications. Foundations and Trends in Theoretical Computer Science 1(2) (2005)

  15. Nisan, N.: Pseudorandom generators for space-bounded computation. Combinatorica 12, 449–461 (1992)

  16. Porat, B., Porat, E.: Exact and approximate pattern matching in the streaming model. In: FOCS, pp. 315–323 (2009)

  17. Shor, P.W.: Polynomial-time algorithms for prime factorization and discrete logarithms on a quantum computer. SIAM J. Comput. 26(5), 1484–1509 (1997)

Copyright information

© 2011 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Crouch, M.S., McGregor, A. (2011). Periodicity and Cyclic Shifts via Linear Sketches. In: Goldberg, L.A., Jansen, K., Ravi, R., Rolim, J.D.P. (eds) Approximation, Randomization, and Combinatorial Optimization. Algorithms and Techniques. APPROX RANDOM 2011. Lecture Notes in Computer Science, vol 6845. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-22935-0_14

  • DOI: https://doi.org/10.1007/978-3-642-22935-0_14

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-642-22934-3

  • Online ISBN: 978-3-642-22935-0

  • eBook Packages: Computer Science, Computer Science (R0)
