Multi-Hierarchies: Accurately Computing Realtime Statistical Measures on Data Streams

  • Conference paper
Wireless Algorithms, Systems, and Applications (WASA 2014)

Part of the book series: Lecture Notes in Computer Science (LNTCS, volume 8491)

Abstract

Computing statistical measures is a fundamental problem in mining data streams, and users sometimes want to query the correlation of data streams in real time. In this paper, we introduce a system for computing realtime statistical measures of data streams. The system maintains realtime summaries, which are used to compute affine relationships. We process every element of every data stream only once, and achieve accuracy similar to that of static methods. To the best of our knowledge, this is a new method of computing affine relationships. Our system employs the Multi-Hierarchies approach in the Sliding Window Model. First, we modify the AFCLST clustering algorithm. Second, the Bottom-Up Updating algorithm updates the summaries stored by every hierarchy after the Cumulative Calculation algorithms have run. Third, the Query Response algorithm uses the summaries to compute the statistical measure. Finally, we establish the accuracy of our approach through several experiments on real datasets.
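As a rough illustration of the general idea of answering correlation queries from incrementally maintained summaries in a sliding window, the sketch below keeps running sums for two streams and computes their Pearson correlation on demand. It is not the Multi-Hierarchies system described in the abstract: the class name `SlidingCorrelation`, its methods, and the flat (single-level) summary are assumptions made for illustration only, and unlike a hierarchical summary this version keeps the raw pairs of the current window so it can evict them.

```python
from collections import deque
from math import sqrt


class SlidingCorrelation:
    """Hypothetical helper: Pearson correlation over the last `window` pairs."""

    def __init__(self, window):
        if window < 2:
            raise ValueError("window must be at least 2")
        self.window = window
        self.pairs = deque()  # raw pairs currently inside the window
        # Running sums (the "summary") for the pairs inside the window.
        self.sx = self.sy = self.sxx = self.syy = self.sxy = 0.0

    def update(self, x, y):
        """Add one new pair; evict the oldest pair once the window is full."""
        self.pairs.append((x, y))
        self.sx += x
        self.sy += y
        self.sxx += x * x
        self.syy += y * y
        self.sxy += x * y
        if len(self.pairs) > self.window:
            ox, oy = self.pairs.popleft()
            self.sx -= ox
            self.sy -= oy
            self.sxx -= ox * ox
            self.syy -= oy * oy
            self.sxy -= ox * oy

    def correlation(self):
        """Return the correlation from the running sums, or None if undefined."""
        n = len(self.pairs)
        if n < 2:
            return None
        cov = self.sxy - self.sx * self.sy / n
        var_x = self.sxx - self.sx * self.sx / n
        var_y = self.syy - self.sy * self.sy / n
        if var_x <= 0 or var_y <= 0:
            return None  # a constant stream has no defined correlation
        return cov / sqrt(var_x * var_y)


if __name__ == "__main__":
    sc = SlidingCorrelation(window=4)
    for t in range(1, 11):
        sc.update(t, 2 * t + 1)  # second stream is an affine function of the first
    print(sc.correlation())
```

Each arriving pair is folded into the sums once, and a query costs constant time regardless of window length. Running sums of this kind can lose precision on long streams with large values, so a production version would likely need a more numerically careful update.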


Copyright information

© 2014 Springer International Publishing Switzerland

About this paper

Cite this paper

Qi, P., Shi, S. (2014). Multi-Hierarchies: Accurately Computing Realtime Statistical Measures on Data Streams. In: Cai, Z., Wang, C., Cheng, S., Wang, H., Gao, H. (eds) Wireless Algorithms, Systems, and Applications. WASA 2014. Lecture Notes in Computer Science, vol 8491. Springer, Cham. https://doi.org/10.1007/978-3-319-07782-6_65

  • DOI: https://doi.org/10.1007/978-3-319-07782-6_65

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-07781-9

  • Online ISBN: 978-3-319-07782-6

  • eBook Packages: Computer Science, Computer Science (R0)
