Compact Directed Acyclic Word Graphs for a Sliding Window

  • Conference paper
String Processing and Information Retrieval (SPIRE 2002)

Part of the book series: Lecture Notes in Computer Science (LNCS, volume 2476)

Abstract

The suffix tree is a well-known and widely studied data structure that is highly useful for string matching. The suffix tree of a string w can be constructed in O(n) time and space, where n denotes the length of w. Larsson gave an efficient algorithm for maintaining a suffix tree for a sliding window, which is useful for prediction by partial matching (PPM) style statistical data compression schemes. The compact directed acyclic word graph (CDAWG) is a more space-economical data structure for indexing a string. In this paper we propose a linear-time algorithm to maintain a CDAWG for a sliding window.
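To make the sliding-window setting concrete, here is a minimal sketch of the queries such an index is meant to answer. It is a brute-force baseline and not the paper's algorithm: it rescans the window on every query, whereas a suffix tree or CDAWG would answer the same queries without rescanning. The class name `NaiveSlidingWindowIndex` and its methods are hypothetical names chosen for this illustration.

```python
from collections import deque


class NaiveSlidingWindowIndex:
    """Brute-force stand-in for a sliding-window index (illustration only).

    Keeps the most recent `window_size` characters and answers queries by
    scanning them directly. This is NOT a suffix tree or a CDAWG.
    """

    def __init__(self, window_size: int) -> None:
        if window_size <= 0:
            raise ValueError("window_size must be positive")
        # A bounded deque drops the oldest character automatically.
        self._window = deque(maxlen=window_size)

    def append(self, ch: str) -> None:
        """Slide the window forward by one character."""
        self._window.append(ch)

    def text(self) -> str:
        """Return the current window contents as a string."""
        return "".join(self._window)

    def contains(self, pattern: str) -> bool:
        """Report whether `pattern` occurs in the current window."""
        return pattern in self.text()

    def longest_repeated_suffix(self, max_len: int) -> str:
        """Longest suffix of the window (at most `max_len` characters)
        that also occurs starting at an earlier position in the window.
        PPM-style models look for contexts of this kind."""
        text = self.text()
        n = len(text)
        for k in range(min(max_len, n - 1), 0, -1):
            suffix_start = n - k
            # str.find returns the first occurrence; an index smaller than
            # suffix_start means the suffix also appears earlier.
            if text.find(text[suffix_start:]) < suffix_start:
                return text[suffix_start:]
        return ""


index = NaiveSlidingWindowIndex(window_size=5)
for ch in "xyabcab":
    index.append(ch)

print(index.text())                      # abcab
print(index.contains("cab"))             # True
print(index.contains("xy"))              # False, "xy" has left the window
print(index.longest_repeated_suffix(4))  # ab
```

Each query here costs time proportional to the window size or more. The point of maintaining a suffix tree or CDAWG over the window is to avoid that rescanning while the window slides.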

References

  1. A. Apostolico. The myriad virtues of subword trees. In A. Apostolico and Z. Galil, editors, Combinatorial Algorithms on Words, volume 12 of NATO Advanced Science Institutes, Series F, pages 85–96. Springer-Verlag, 1985.
  2. A. Blumer, J. Blumer, D. Haussler, R. McConnell, and A. Ehrenfeucht. Complete inverted files for efficient text retrieval and analysis. J. ACM, 34(3):578–595, 1987.
  3. J. G. Cleary, W. J. Teahan, and I. H. Witten. Unbounded length contexts for PPM. In Proc. Data Compression Conference ’95 (DCC’95), pages 52–61. IEEE Computer Society, 1995.
  4. J. G. Cleary and I. H. Witten. Data compression using adaptive coding and partial string matching. IEEE Trans. Commun., 32(4):396–402, 1984.
  5. M. Crochemore and W. Rytter. Text Algorithms. Oxford University Press, New York, 1994.
  6. M. Crochemore and R. Vérin. On compact directed acyclic word graphs. In J. Mycielski, G. Rozenberg, and A. Salomaa, editors, Structures in Logic and Computer Science, volume 1261 of Lecture Notes in Computer Science, pages 192–211. Springer-Verlag, 1997.
  7. E. R. Fiala and D. H. Greene. Data compression with finite windows. Commun. ACM, 32(4):490–505, 1989.
  8. D. Gusfield. Algorithms on Strings, Trees, and Sequences. Cambridge University Press, New York, 1997.
  9. S. Inenaga, H. Hoshino, A. Shinohara, M. Takeda, and S. Arikawa. Construction of the CDAWG for a trie. In Proc. The Prague Stringology Conference ’01 (PSC’01). Czech Technical University, 2001.
  10. S. Inenaga, H. Hoshino, A. Shinohara, M. Takeda, and S. Arikawa. On-line construction of symmetric compact directed acyclic word graphs. In Proc. 8th International Symposium on String Processing and Information Retrieval (SPIRE’01), pages 96–110. IEEE Computer Society, 2001.
  11. S. Inenaga, H. Hoshino, A. Shinohara, M. Takeda, S. Arikawa, G. Mauri, and G. Pavesi. On-line construction of compact directed acyclic word graphs. In A. Amir and G. M. Landau, editors, Proc. 12th Annual Symposium on Combinatorial Pattern Matching (CPM’01), volume 2089 of Lecture Notes in Computer Science, pages 169–180. Springer-Verlag, 2001.
  12. S. Inenaga, A. Shinohara, M. Takeda, H. Bannai, and S. Arikawa. Space-economical construction of index structures for all suffixes of a string. In Proc. 27th International Symposium on Mathematical Foundations of Computer Science (MFCS’02), Lecture Notes in Computer Science. Springer-Verlag, 2002. (to appear).
  13. N. J. Larsson. Extended application of suffix trees to data compression. In Proc. Data Compression Conference ’96 (DCC’96), pages 190–199. IEEE Computer Society, 1996.
  14. N. J. Larsson. Structures of String Matching and Data Compression. PhD thesis, Lund University, 1999.
  15. E. M. McCreight. A space-economical suffix tree construction algorithm. J. ACM, 23(2):262–272, 1976.
  16. A. Moffat. Implementing the PPM data compression scheme. IEEE Trans. Commun., 38(11):1917–1921, 1990.
  17. E. Ukkonen. On-line construction of suffix trees. Algorithmica, 14(3):249–260, 1995.
  18. P. Weiner. Linear pattern matching algorithms. In Proc. 14th Annual Symposium on Switching and Automata Theory, pages 1–11, 1973.

Copyright information

© 2002 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Inenaga, S., Shinohara, A., Takeda, M., Arikawa, S. (2002). Compact Directed Acyclic Word Graphs for a Sliding Window. In: Laender, A.H.F., Oliveira, A.L. (eds) String Processing and Information Retrieval. SPIRE 2002. Lecture Notes in Computer Science, vol 2476. Springer, Berlin, Heidelberg. https://doi.org/10.1007/3-540-45735-6_27

  • DOI: https://doi.org/10.1007/3-540-45735-6_27

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-44158-8

  • Online ISBN: 978-3-540-45735-0

  • eBook Packages: Springer Book Archive
