SIRCS: Slope-intercept-residual Compression by Correlation Sequencing for Multi-stream High Variation Data

  • Conference paper
Database Systems for Advanced Applications (DASFAA 2019)

Part of the book series: Lecture Notes in Computer Science (LNISA, volume 11446)

Abstract

Multi-stream data with high variation is ubiquitous in modern network systems. As telecommunication technologies develop, robust data compression techniques are urgently needed. In this paper, we introduce a novel technique designed specifically for high-variation signal data: SIRCS, which applies a linear regression model to decompose multiple data streams into slope, intercept and residual components, and combines this decomposition with advanced tree mapping techniques. SIRCS inherits the advantages of existing grouping compression algorithms such as GAMPS. With a newly devised correlation sorting technique, correlation tree mapping, SIRCS improves the compression ratio by 13% over the traditional clustering mapping scheme. The linear model decomposition further improves performance over state-of-the-art algorithms, with RMSE decreasing by 4% and compression time dropping dramatically compared to GAMPS. Across a wide range of error tolerances, from 1% to 27%, SIRCS performs consistently better than all evaluated state-of-the-art algorithms in both compression efficiency and accuracy.
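
The abstract does not spell out how the decomposition is computed, so the following is only a minimal sketch of the general idea of a slope-intercept-residual split, assuming an ordinary least-squares fit of one stream against a correlated base stream. The function names (`fit_line`, `decompose`, `reconstruct`) and the choice of base stream are hypothetical illustrations, not the authors' implementation, and the sketch does not cover correlation tree mapping or error-bounded residual coding.

```python
from typing import List, Tuple


def fit_line(base: List[float], target: List[float]) -> Tuple[float, float]:
    """Ordinary least-squares fit of target ~ slope * base + intercept."""
    n = len(base)
    if n != len(target) or n < 2:
        raise ValueError("streams must have equal length of at least 2")
    mean_x = sum(base) / n
    mean_y = sum(target) / n
    sxx = sum((x - mean_x) ** 2 for x in base)
    if sxx == 0:
        # Constant base stream: no slope can be estimated, fall back to the mean.
        return 0.0, mean_y
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(base, target))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def decompose(base: List[float], target: List[float]) -> Tuple[float, float, List[float]]:
    """Split target into (slope, intercept, residuals) relative to base."""
    slope, intercept = fit_line(base, target)
    residuals = [y - (slope * x + intercept) for x, y in zip(base, target)]
    return slope, intercept, residuals


def reconstruct(base: List[float], slope: float, intercept: float,
                residuals: List[float]) -> List[float]:
    """Invert decompose(): rebuild the target stream from its three parts."""
    return [slope * x + intercept + r for x, r in zip(base, residuals)]


if __name__ == "__main__":
    # Hypothetical toy streams; the second roughly follows 2 * base + 1.
    base_stream = [1.0, 2.0, 3.0, 4.0]
    other_stream = [3.1, 4.9, 7.2, 8.8]
    slope, intercept, residuals = decompose(base_stream, other_stream)
    print(slope, intercept, residuals)
```

In a sketch like this, a strongly correlated pair of streams leaves small residuals, which is what makes the residual component cheaper to encode than the raw stream; how SIRCS orders the streams and bounds the residual error is described in the full paper.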

Acknowledgement

This research is partially supported by the Australian Queensland Government (Grant No. AQRF12516).

Corresponding author

Correspondence to Wen Hua.

Copyright information

© 2019 Springer Nature Switzerland AG

About this paper

Cite this paper

Ye, Z., Hua, W., Wang, L., Zhou, X. (2019). SIRCS: Slope-intercept-residual Compression by Correlation Sequencing for Multi-stream High Variation Data. In: Li, G., Yang, J., Gama, J., Natwichai, J., Tong, Y. (eds) Database Systems for Advanced Applications. DASFAA 2019. Lecture Notes in Computer Science, vol 11446. Springer, Cham. https://doi.org/10.1007/978-3-030-18576-3_12

  • DOI: https://doi.org/10.1007/978-3-030-18576-3_12

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-030-18575-6

  • Online ISBN: 978-3-030-18576-3

  • eBook Packages: Computer Science, Computer Science (R0)
