Dynamic Dictionary Matching in the Online Model

  • Conference paper
Algorithms and Data Structures (WADS 2019)

Part of the book series: Lecture Notes in Computer Science (LNTCS, volume 11646)

Abstract

In the classic dictionary matching problem, the input is a dictionary of patterns \(\mathcal {D}=\{P_1,P_2,\ldots ,P_k\}\) and a text T, and the goal is to report all the occurrences in T of every pattern from \(\mathcal {D}\). In the dynamic version of the dictionary matching problem, patterns may be added to or removed from \(\mathcal {D}\). In the online version of the dictionary matching problem, the characters of T arrive online, one at a time, and the goal is to establish, immediately after every new character arrival, which of the patterns in \(\mathcal {D}\) are suffixes of the current text.

In this paper, we consider the dynamic version of the online dictionary matching problem. For the case where all the patterns have the same length m, we design an algorithm that adds or removes a pattern in \(\mathcal {O}(m\log \log \Vert \mathcal {D}\Vert )\) time and processes a text character in \(\mathcal {O}(\log \log \Vert \mathcal {D}\Vert )\) time, where \(\Vert \mathcal {D}\Vert = \sum _{P\in \mathcal {D}} |P|\). For the general case where patterns may have different lengths, the cost of adding or removing a pattern P is \(\mathcal {O}(|P|\log \log \Vert \mathcal {D}\Vert + \log d /\log \log d)\) while the cost per text character is \(\mathcal {O}(\log \log \Vert \mathcal {D}\Vert + (1+occ)\log d /\log \log d)\), where \(d=|\mathcal {D}|\) is the number of patterns in \(\mathcal {D}\) and occ is the size of the output. These bounds improve on the state of the art for dynamic dictionary matching, while also providing online features. All our algorithms are Las Vegas randomized, and the time bounds hold in the worst case with high probability. A by-product of our work is a solution for the fringed colored ancestor problem, resolving an open question of Breslauer and Italiano [J. Discrete Algorithms, 2013].
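To make the problem statement concrete, the following is a minimal Python sketch of a naive baseline for the dynamic online setting. It is not the algorithm of this paper: it rescans every pattern on each arriving character, so its per-character cost grows with the dictionary size rather than matching the bounds stated above. The class name `NaiveOnlineDictionary` and the method names `add`, `remove`, and `feed` are hypothetical and chosen only for illustration.

```python
class NaiveOnlineDictionary:
    """Naive baseline for dynamic online dictionary matching.

    Illustration only; this is NOT the data structure from the paper.
    All names here are hypothetical.
    """

    def __init__(self):
        self.patterns = set()  # the current dictionary D
        self.text = ""         # the text T received so far

    def add(self, pattern):
        """Insert a pattern into D (empty patterns are rejected)."""
        if not pattern:
            raise ValueError("patterns must be non-empty")
        self.patterns.add(pattern)

    def remove(self, pattern):
        """Delete a pattern from D; a no-op if it is absent."""
        self.patterns.discard(pattern)

    def feed(self, ch):
        """Append one character to T and report, in sorted order,
        every pattern of D that is a suffix of the current text."""
        self.text += ch
        return sorted(p for p in self.patterns if self.text.endswith(p))
```

For example, with patterns `"ab"` and `"b"` in the dictionary, feeding `a` reports nothing and feeding `b` reports both patterns; after removing `"b"`, feeding `a` and then `b` again reports only `"ab"`.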

This research is supported by ISF grants no. 824/17 and 1278/16 and by an ERC grant MPM under the EU’s Horizon 2020 Research and Innovation Programme (grant no. 683064). The authors thank Tatiana Starikovskaya for early discussions on the online dynamic dictionary problem.

Notes

  1. Throughout this paper, an event happens with high probability (whp) if the probability of the event not happening is polynomially small in the size of the input.

  2. Sahinalp and Vishkin [23] claim that their results can be extended to general dictionaries, but the details were left for a full version that was never made available. The missing details are unclear and do not seem to be straightforward.

  3. This is in contrast to the AC failure-links, which lead to the locus of the longest prefix of some pattern in \(\mathcal {D}\) that is also a proper suffix of S(u).

  4. That is, for each prefix of the text, the algorithm finds the longest suffix that is a substring of some pattern in \(\mathcal {D}\). This is in contrast to [2], where the goal is to find, for each suffix of T, the longest prefix of the suffix that is a substring of some pattern in \(\mathcal {D}\). Notice that, in general, these two sets of strings are not necessarily the same.

  5. Theorem 5.1 in [21] is expressed in terms of expected runtimes. However, the only randomization is due to hashing, and the same time bounds hold whp.

  6. Recall that pointers to fragments are one of the standard ways of storing edge labels in a GST.
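The distinction drawn in Note 4 can be checked on a small instance. The brute-force Python sketch below is only an illustration under simple assumptions (it enumerates all substrings of the patterns explicitly, which is not how the paper or [2] proceeds); the function names are hypothetical. For the text `abab` and the single pattern `ab`, the two definitions produce different sets of strings.

```python
def substrings_of(patterns):
    """All substrings of all patterns, including the empty string."""
    subs = {""}
    for p in patterns:
        for i in range(len(p)):
            for j in range(i + 1, len(p) + 1):
                subs.add(p[i:j])
    return subs


def longest_suffix_per_prefix(text, patterns):
    """For each non-empty prefix of text, its longest suffix that is
    a substring of some pattern (the variant described in Note 4)."""
    subs = substrings_of(patterns)
    result = []
    for end in range(1, len(text) + 1):
        prefix = text[:end]
        # Scan suffixes from longest to shortest; "" always matches.
        result.append(next(prefix[s:] for s in range(end + 1)
                           if prefix[s:] in subs))
    return result


def longest_prefix_per_suffix(text, patterns):
    """For each non-empty suffix of text, its longest prefix that is
    a substring of some pattern (the variant attributed to [2])."""
    subs = substrings_of(patterns)
    result = []
    for start in range(len(text)):
        suffix = text[start:]
        # Scan prefixes from longest to shortest; "" always matches.
        result.append(next(suffix[:e] for e in range(len(suffix), -1, -1)
                           if suffix[:e] in subs))
    return result


by_prefix = longest_suffix_per_prefix("abab", ["ab"])
by_suffix = longest_prefix_per_suffix("abab", ["ab"])
```

Here `by_prefix` is `["a", "ab", "a", "ab"]` and `by_suffix` is `["ab", "b", "ab", "b"]`, so the string `a` appears only under the first definition and `b` only under the second.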

References

  1. Aho, A.V., Corasick, M.J.: Efficient string matching: an aid to bibliographic search. Commun. ACM 18(6), 333–340 (1975)

  2. Amir, A., Farach, M., Galil, Z., Giancarlo, R., Park, K.: Dynamic dictionary matching. J. Comput. Syst. Sci. 49(2), 208–222 (1994)

  3. Amir, A., Farach, M., Idury, R.M., Poutré, J.A.L., Schäffer, A.A.: Improved dynamic dictionary matching. Inf. Comput. 119(2), 258–282 (1995)

  4. Amir, A., Kopelowitz, T., Levy, A., Pettie, S., Porat, E., Shalom, B.R.: Mind the gap! Online dictionary matching with one gap. Algorithmica 81(6), 2123–2157 (2019)

  5. Amir, A., Levy, A., Porat, E., Shalom, B.R.: Dictionary matching with one gap. In: Kulikov, A.S., Kuznetsov, S.O., Pevzner, P. (eds.) CPM 2014. LNCS, vol. 8486, pp. 11–20. Springer, Cham (2014). https://doi.org/10.1007/978-3-319-07566-2_2

  6. Amir, A., Levy, A., Porat, E., Shalom, B.R.: Dictionary matching with a few gaps. Theor. Comput. Sci. 589, 34–46 (2015)

  7. Athar, T., et al.: Fast circular dictionary-matching algorithm. Math. Struct. Comput. Sci. 27(2), 143–156 (2017)

  8. Belazzougui, D.: Succinct dictionary matching with no slowdown. In: Amir, A., Parida, L. (eds.) CPM 2010. LNCS, vol. 6129, pp. 88–100. Springer, Heidelberg (2010). https://doi.org/10.1007/978-3-642-13509-5_9

  9. Bremler-Barr, A., Hay, D., Koral, Y.: CompactDFA: scalable pattern matching using longest prefix match solutions. IEEE/ACM Trans. Netw. 22(2), 415–428 (2014)

  10. Breslauer, D., Italiano, G.F.: Near real-time suffix tree construction via the fringe marked ancestor problem. J. Discrete Algorithms 18, 32–48 (2013)

  11. Clifford, R., Fontaine, A., Porat, E., Sach, B., Starikovskaya, T.: Dictionary matching in a stream. In: Bansal, N., Finocchi, I. (eds.) ESA 2015. LNCS, vol. 9294, pp. 361–372. Springer, Heidelberg (2015). https://doi.org/10.1007/978-3-662-48350-3_31

  12. Cole, R., Hariharan, R.: Dynamic LCA queries on trees. SIAM J. Comput. 34(4), 894–923 (2005)

  13. Feigenblat, G., Porat, E., Shiftan, A.: Linear time succinct indexable dictionary construction with applications. In: 2016 Data Compression Conference, DCC, pp. 13–22 (2016)

  14. Feigenblat, G., Porat, E., Shiftan, A.: A grouping approach for succinct dynamic dictionary matching. Algorithmica 77(1), 134–150 (2017)

  15. Fischer, J., Gagie, T., Gawrychowski, P., Kociumaka, T.: Approximating LZ77 via small-space multiple-pattern matching. In: Bansal, N., Finocchi, I. (eds.) ESA 2015. LNCS, vol. 9294, pp. 533–544. Springer, Heidelberg (2015). https://doi.org/10.1007/978-3-662-48350-3_45

  16. Fredman, M.L., Komlós, J., Szemerédi, E.: Storing a sparse table with \(O(1)\) worst case access time. J. ACM 31(3), 538–544 (1984)

  17. Ganguly, A., Hon, W., Sadakane, K., Shah, R., Thankachan, S.V., Yang, Y.: Space-efficient dictionaries for parameterized and order-preserving pattern matching. In: 27th Annual Symposium on Combinatorial Pattern Matching, CPM, LIPIcs, vol. 54, pp. 2:1–2:12 (2016)

  18. Ganguly, A., Hon, W., Shah, R.: A framework for dynamic parameterized dictionary matching. In: 15th Scandinavian Symposium and Workshops on Algorithm Theory, SWAT, LIPIcs, vol. 53, pp. 10:1–10:14 (2016)

  19. Golan, S., Kopelowitz, T., Porat, E.: Towards optimal approximate streaming pattern matching by matching multiple patterns in multiple streams. In: 45th International Colloquium on Automata, Languages, and Programming, ICALP, LIPIcs, vol. 107, pp. 65:1–65:16 (2018)

  20. Golan, S., Porat, E.: Real-time streaming multi-pattern search for constant alphabet. In: 25th Annual European Symposium on Algorithms, ESA, LIPIcs, vol. 87, pp. 41:1–41:15 (2017)

  21. Kopelowitz, T.: On-line indexing for general alphabets via predecessor queries on subsets of an ordered list. In: 53rd Annual IEEE Symposium on Foundations of Computer Science, FOCS, pp. 283–292 (2012)

  22. Kopelowitz, T., Porat, E., Rozen, Y.: Succinct online dictionary matching with improved worst-case guarantees. In: 27th Annual Symposium on Combinatorial Pattern Matching, CPM, LIPIcs, vol. 54, pp. 6:1–6:13 (2016)

  23. Sahinalp, S.C., Vishkin, U.: Efficient approximate and dynamic matching of patterns using a labeling paradigm. In: 37th Annual Symposium on Foundations of Computer Science, FOCS, pp. 320–328 (1996)

  24. Tan, L., Sherwood, T.: A high throughput string matching architecture for intrusion detection and prevention. In: 32nd International Symposium on Computer Architecture, ISCA, pp. 112–122 (2005)

  25. Tuck, N., Sherwood, T., Calder, B., Varghese, G.: Deterministic memory-efficient string matching algorithms for intrusion detection. In: 23rd IEEE International Conference on Computer Communications, INFOCOM, pp. 2628–2639 (2004)

Author information

Corresponding author

Correspondence to Tsvi Kopelowitz.

Copyright information

© 2019 Springer Nature Switzerland AG

About this paper

Cite this paper

Golan, S., Kociumaka, T., Kopelowitz, T., Porat, E. (2019). Dynamic Dictionary Matching in the Online Model. In: Friggstad, Z., Sack, JR., Salavatipour, M. (eds) Algorithms and Data Structures. WADS 2019. Lecture Notes in Computer Science, vol 11646. Springer, Cham. https://doi.org/10.1007/978-3-030-24766-9_30

  • DOI: https://doi.org/10.1007/978-3-030-24766-9_30

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-030-24765-2

  • Online ISBN: 978-3-030-24766-9

  • eBook Packages: Computer Science, Computer Science (R0)
