On Two LZ78-style Grammars: Compression Bounds and Compressed-Space Computation

  • Conference paper
String Processing and Information Retrieval (SPIRE 2017)

Part of the book series: Lecture Notes in Computer Science (LNTCS, volume 10508)

Abstract

We investigate two closely related LZ78-based compression schemes: LZMW (an old scheme by Miller and Wegman) and LZD (a recent variant by Goto et al.). Both LZD and LZMW naturally produce a grammar for a string of length n; we show that the size of this grammar can be larger than the size of the smallest grammar by a factor \(\varOmega (n^{\frac{1}{3}})\) but is always within a factor \(O((\frac{n}{\log n})^{\frac{2}{3}})\). In addition, we show that the standard algorithms using \(\varTheta (z)\) working space to construct the LZD and LZMW parsings, where z is the size of the parsing, work in \(\varOmega (n^{\frac{5}{4}})\) time in the worst case. We then describe a new Las Vegas LZD/LZMW parsing algorithm that uses \(O (z \log n)\) space and \(O(n + z \log ^2 n)\) time w.h.p.
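For orientation, the LZD parsing of Goto et al. can be sketched naively as follows. This is a minimal illustration, not one of the algorithms analysed in the paper. It assumes the usual definition in which each phrase is the concatenation of two parts, where each part is the longest prefix of the remaining text that equals an earlier phrase, or a single letter if no earlier phrase matches. The function name `lzd_parse`, the explicit storage of phrases as strings, and the empty second part when the text runs out are assumptions made for the example.

```python
def lzd_parse(text):
    """Naive LZD factorization (illustrative sketch, not the paper's algorithm).

    Returns a list of (left, right) string pairs; each phrase is left + right.
    """
    phrases = set()   # earlier phrases, stored explicitly as strings (assumption)
    factors = []
    n = len(text)
    i = 0

    def longest_part(start):
        # Length of the longest earlier phrase that is a prefix of text[start:],
        # falling back to a single letter when no phrase matches.
        best = 1
        for p in phrases:
            if len(p) > best and text.startswith(p, start):
                best = len(p)
        return best

    while i < n:
        l1 = longest_part(i)
        left = text[i:i + l1]
        i += l1
        if i < n:
            l2 = longest_part(i)
            right = text[i:i + l2]
            i += l2
        else:
            right = ""  # assumption: the last phrase may lack a second part
        factors.append((left, right))
        phrases.add(left + right)
    return factors


if __name__ == "__main__":
    for left, right in lzd_parse("aaaaaaa"):
        print(repr(left), repr(right))
```

Each phrase corresponds to one grammar rule with two symbols on the right-hand side, which is why the parsing yields a grammar directly. The sketch scans all earlier phrases for every part, so it is far slower and more space-hungry than the trie-based approaches the abstract refers to.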

G. Badkobeh—Supported by the Leverhulme Trust’s Early Career Scheme.

T. Kociumaka—Supported by Polish budget funds for science in 2013–2017 under the ‘Diamond Grant’ program.

S.J. Puglisi—Supported by the Academy of Finland via grant 294143.

Notes

  1. We concern ourselves here with LZD parsing, but it should be easy for the reader to see that the algorithms are trivially adapted to instead compute LZMW.
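To make the relationship to LZMW concrete, here is a naive sketch of an LZMW parser. It relies on the usual definition of LZMW, which is assumed here rather than stated on this page: each phrase is the longest prefix of the remaining text that equals the concatenation of two adjacent earlier phrases, or a single letter if no such concatenation matches. The name `lzmw_parse` and the explicit set of concatenated strings are illustrative choices, not the paper's data structures.

```python
def lzmw_parse(text):
    """Naive LZMW factorization (illustrative sketch, not the paper's algorithm).

    Returns the list of phrases as strings.
    """
    factors = []
    adjacent_pairs = set()  # concatenations of two adjacent earlier phrases
    i = 0
    while i < len(text):
        # Longest adjacent-pair concatenation that is a prefix of text[i:],
        # falling back to a single letter when none matches.
        best = 1
        for p in adjacent_pairs:
            if len(p) > best and text.startswith(p, i):
                best = len(p)
        phrase = text[i:i + best]
        if factors:
            # The new phrase forms an adjacent pair with its predecessor.
            adjacent_pairs.add(factors[-1] + phrase)
        factors.append(phrase)
        i += best
    return factors


if __name__ == "__main__":
    print(lzmw_parse("aaaaaaa"))
```

On a unary string the phrase lengths grow like Fibonacci numbers, since each new phrase can reuse the concatenation of the two phrases before it.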

References

  1. Supplementary materials for the present paper: C++ code for described experiments. https://bitbucket.org/dkosolobov/lzd-lzmw

  2. Belazzougui, D., Boldi, P., Vigna, S.: Dynamic Z-Fast tries. In: Chavez, E., Lonardi, S. (eds.) SPIRE 2010. LNCS, vol. 6393, pp. 159–172. Springer, Heidelberg (2010). doi:10.1007/978-3-642-16321-0_15

  3. Belazzougui, D., Cording, P.H., Puglisi, S.J., Tabei, Y.: Access, rank, and select in grammar-compressed strings. In: Bansal, N., Finocchi, I. (eds.) ESA 2015. LNCS, vol. 9294, pp. 142–154. Springer, Heidelberg (2015). doi:10.1007/978-3-662-48350-3_13

  4. Bille, P., Landau, G.M., Raman, R., Sadakane, K., Satti, S.R., Weimann, O.: Random access to grammar-compressed strings and trees. SIAM J. Comput. 44(3), 513–539 (2015)

  5. Charikar, M., Lehman, E., Liu, D., Panigrahy, R., Prabhakaran, M., Sahai, A., Shelat, A.: The smallest grammar problem. IEEE Trans. Inf. Theor. 51(7), 2554–2576 (2005)

  6. Claude, F., Navarro, G.: Self-indexed grammar-based compression. Fundamenta Informaticae 111(3), 313–337 (2011)

  7. Gagie, T., Gawrychowski, P., Kärkkäinen, J., Nekrich, Y., Puglisi, S.J.: A faster grammar-based self-index. In: Dediu, A.-H., Martín-Vide, C. (eds.) LATA 2012. LNCS, vol. 7183, pp. 240–251. Springer, Heidelberg (2012). doi:10.1007/978-3-642-28332-1_21

  8. Goto, K., Bannai, H., Inenaga, S., Takeda, M.: LZD Factorization: simple and practical online grammar compression with variable-to-fixed encoding. In: Cicalese, F., Porat, E., Vaccaro, U. (eds.) CPM 2015. LNCS, vol. 9133, pp. 219–230. Springer, Cham (2015). doi:10.1007/978-3-319-19929-0_19

  9. Hucke, D., Lohrey, M., Reh, C.P.: The smallest grammar problem revisited. In: Inenaga, S., Sadakane, K., Sakai, T. (eds.) SPIRE 2016. LNCS, vol. 9954, pp. 35–49. Springer, Cham (2016). doi:10.1007/978-3-319-46049-9_4

  10. I, T., Nakashima, Y., Inenaga, S., Bannai, H., Takeda, M.: Efficient Lyndon factorization of grammar compressed text. In: Fischer, J., Sanders, P. (eds.) CPM 2013. LNCS, vol. 7922, pp. 153–164. Springer, Heidelberg (2013). doi:10.1007/978-3-642-38905-4_16

  11. Karp, R.M., Rabin, M.O.: Efficient randomized pattern-matching algorithms. IBM J. Res. Devel. 31(2), 249–260 (1987)

  12. Kempa, D., Kosolobov, D.: LZ-End parsing in compressed space. In: Proceedings of Data Compression Conference (DCC), pp. 350–359. IEEE (2017)

  13. Kreft, S., Navarro, G.: On compressing and indexing repetitive sequences. Theoret. Comput. Sci. 483, 115–133 (2013)

  14. Miller, V.S., Wegman, M.N.: Variations on a theme by Ziv and Lempel. In: Apostolico, A., Galil, Z. (eds.) Proceedings of NATO Advanced Research Workshop on Combinatorial Algorithms on Words, NATO ASI, vol. 12, pp. 131–140. Springer, Heidelberg (1985)

  15. Rytter, W.: Application of Lempel-Ziv factorization to the approximation of grammar-based compression. Theoret. Comput. Sci. 302(1–3), 211–222 (2003)

  16. Tanaka, T., I, T., Inenaga, S., Bannai, H., Takeda, M.: Computing convolution on grammar-compressed text. In: Proceedings of Data Compression Conference (DCC), pp. 451–460. IEEE (2013)

  17. Westbrook, J.: Fast incremental planarity testing. In: Kuich, W. (ed.) ICALP 1992. LNCS, vol. 623, pp. 342–353. Springer, Heidelberg (1992). doi:10.1007/3-540-55719-9_86

  18. Ziv, J., Lempel, A.: Compression of individual sequences via variable-rate coding. IEEE Trans. Inf. Theor. 24(5), 530–536 (1978)

  19. Ziv, J., Lempel, A.: A universal algorithm for sequential data compression. IEEE Trans. Inf. Theor. 23(3), 337–343 (1977)

Acknowledgements

We thank H. Bannai, P. Cording, K. Dabrowski, D. Hücke, D. Kempa, and L. Salmela for interesting discussions on LZD at the 2016 StringMasters and Dagstuhl meetings. Thanks also go to D. Belazzougui for advice about the z-fast trie and to the anonymous referees.

Corresponding author

Correspondence to Dmitry Kosolobov.

Copyright information

© 2017 Springer International Publishing AG

About this paper

Cite this paper

Badkobeh, G., Gagie, T., Inenaga, S., Kociumaka, T., Kosolobov, D., Puglisi, S.J. (2017). On Two LZ78-style Grammars: Compression Bounds and Compressed-Space Computation. In: Fici, G., Sciortino, M., Venturini, R. (eds) String Processing and Information Retrieval. SPIRE 2017. Lecture Notes in Computer Science, vol 10508. Springer, Cham. https://doi.org/10.1007/978-3-319-67428-5_5

  • DOI: https://doi.org/10.1007/978-3-319-67428-5_5

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-67427-8

  • Online ISBN: 978-3-319-67428-5

  • eBook Packages: Computer Science, Computer Science (R0)
