Abstract
In the classic dictionary matching problem, the input is a dictionary of patterns \(\mathcal {D}=\{P_1,P_2,\ldots ,P_k\}\) and a text T, and the goal is to report all the occurrences in T of every pattern from \(\mathcal {D}\). In the dynamic version of the dictionary matching problem, patterns may be added to or removed from \(\mathcal {D}\). In the online version of the dictionary matching problem, the characters of T arrive online, one at a time, and the goal is to establish, immediately after every new character arrival, which patterns in \(\mathcal {D}\) are suffixes of the current text.
In this paper, we consider the dynamic version of the online dictionary matching problem. For the case where all the patterns have the same length m, we design an algorithm that adds or removes a pattern in \(\mathcal {O}(m\log \log \Vert \mathcal {D}\Vert )\) time and processes a text character in \(\mathcal {O}(\log \log \Vert \mathcal {D}\Vert )\) time, where \(\Vert \mathcal {D}\Vert = \sum _{P\in \mathcal {D}} |P|\). For the general case where patterns may have different lengths, the cost of adding or removing a pattern P is \(\mathcal {O}(|P|\log \log \Vert \mathcal {D}\Vert + \log d /\log \log d)\) while the cost per text character is \(\mathcal {O}(\log \log \Vert \mathcal {D}\Vert + (1+occ)\log d /\log \log d)\), where \(d=|\mathcal {D}|\) is the number of patterns in \(\mathcal {D}\) and occ is the size of the output. These bounds improve on the state of the art for dynamic dictionary matching, while also providing online features. All our algorithms are Las Vegas randomized and the time costs are worst-case with high probability. A by-product of our work is a solution for the fringed colored ancestor problem, resolving an open question of Breslauer and Italiano [J. Discrete Algorithms, 2013].
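To make the problem statement concrete, the interface of the dynamic online setting can be sketched with a brute-force baseline. This is only an illustration of the input/output behaviour, not the algorithm of this paper: it rescans every pattern on each character and comes nowhere near the stated bounds. The class name `NaiveOnlineDictionary` and its method names are hypothetical.

```python
class NaiveOnlineDictionary:
    """Brute-force baseline for dynamic online dictionary matching.

    Illustrative only: each text character costs time proportional to the
    total length of the dictionary, unlike the bounds stated in the abstract.
    """

    def __init__(self):
        self.patterns = set()  # the dictionary D
        self.text = []         # characters of T received so far

    def add(self, pattern):
        """Insert a (non-empty) pattern into D."""
        if not pattern:
            raise ValueError("patterns must be non-empty")
        self.patterns.add(pattern)

    def remove(self, pattern):
        """Delete a pattern from D (no effect if it is absent)."""
        self.patterns.discard(pattern)

    def feed(self, ch):
        """Append one text character and report, in sorted order,
        every pattern of D that is a suffix of the current text."""
        self.text.append(ch)
        current = "".join(self.text)
        return sorted(p for p in self.patterns if current.endswith(p))


matcher = NaiveOnlineDictionary()
matcher.add("ab")
matcher.add("b")
print(matcher.feed("a"))  # text "a":   []
print(matcher.feed("b"))  # text "ab":  ['ab', 'b']
matcher.remove("b")
print(matcher.feed("b"))  # text "abb": []
```

Updates and text characters may be interleaved freely; a report always reflects the dictionary as it stands when the character arrives.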
This research is supported by ISF grants no. 824/17 and 1278/16 and by an ERC grant MPM under the EU’s Horizon 2020 Research and Innovation Programme (grant no. 683064). The authors thank Tatiana Starikovskaya for early discussions on the online dynamic dictionary problem.
Notes
1. Throughout this paper, an event happens with high probability (whp) if the probability of the event not happening is polynomially small in the size of the input.
2. Sahinalp and Vishkin [23] claim that their results can be extended to general dictionaries, but the details were left for a full version that was never made available. The missing details are unclear and do not seem to be straightforward.
3. This is in contrast to the AC failure-links, which lead to the locus of the longest prefix of some pattern in \(\mathcal {D}\) that is also a proper suffix of S(u).
4. That is, for each prefix of the text, the algorithm finds the longest suffix that is a substring of some pattern in \(\mathcal {D}\). This is in contrast to [2] where the goal is to find, for each suffix of T, the longest prefix of the suffix that is a substring of some pattern in \(\mathcal {D}\). Notice that in general the two sets of strings are not necessarily the same.
5. Theorem 5.1 in [21] is expressed in terms of expected runtimes. However, the only randomization is due to hashing, and the same time bounds hold whp.
6. Recall that pointers to fragments are one of the standard ways of storing edge labels in a GST.
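The distinction drawn in note 4 can be checked on a tiny instance. The following brute-force sketch computes both families of strings directly from their definitions; the function names and the example dictionary are hypothetical and chosen only for illustration, and no attempt is made at efficiency.

```python
def occurs_in_some_pattern(s, patterns):
    """True if s is a substring of at least one pattern."""
    return any(s in p for p in patterns)


def longest_suffix_per_prefix(text, patterns):
    """For each prefix of text, the longest suffix of that prefix
    that is a substring of some pattern (empty string if none)."""
    result = []
    for end in range(1, len(text) + 1):
        best = ""
        for start in range(end):  # candidates from longest to shortest
            if occurs_in_some_pattern(text[start:end], patterns):
                best = text[start:end]
                break
        result.append(best)
    return result


def longest_prefix_per_suffix(text, patterns):
    """For each suffix of text, the longest prefix of that suffix
    that is a substring of some pattern (empty string if none)."""
    result = []
    for start in range(len(text)):
        best = ""
        for end in range(len(text), start, -1):  # longest to shortest
            if occurs_in_some_pattern(text[start:end], patterns):
                best = text[start:end]
                break
        result.append(best)
    return result


patterns = ["ab", "bc"]
print(longest_suffix_per_prefix("abc", patterns))  # ['a', 'ab', 'bc']
print(longest_prefix_per_suffix("abc", patterns))  # ['ab', 'bc', 'c']
```

For the text "abc" and the patterns "ab" and "bc", the first family contains "a" but not "c", while the second contains "c" but not "a", so the two sets differ.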
References
Aho, A.V., Corasick, M.J.: Efficient string matching: an aid to bibliographic search. Commun. ACM 18(6), 333–340 (1975)
Amir, A., Farach, M., Galil, Z., Giancarlo, R., Park, K.: Dynamic dictionary matching. J. Comput. Syst. Sci. 49(2), 208–222 (1994)
Amir, A., Farach, M., Idury, R.M., Poutré, J.A.L., Schäffer, A.A.: Improved dynamic dictionary matching. Inf. Comput. 119(2), 258–282 (1995)
Amir, A., Kopelowitz, T., Levy, A., Pettie, S., Porat, E., Shalom, B.R.: Mind the gap! Online dictionary matching with one gap. Algorithmica 81(6), 2123–2157 (2019)
Amir, A., Levy, A., Porat, E., Shalom, B.R.: Dictionary matching with one gap. In: Kulikov, A.S., Kuznetsov, S.O., Pevzner, P. (eds.) CPM 2014. LNCS, vol. 8486, pp. 11–20. Springer, Cham (2014). https://doi.org/10.1007/978-3-319-07566-2_2
Amir, A., Levy, A., Porat, E., Shalom, B.R.: Dictionary matching with a few gaps. Theor. Comput. Sci. 589, 34–46 (2015)
Athar, T., et al.: Fast circular dictionary-matching algorithm. Math. Struct. Comput. Sci. 27(2), 143–156 (2017)
Belazzougui, D.: Succinct dictionary matching with no slowdown. In: Amir, A., Parida, L. (eds.) CPM 2010. LNCS, vol. 6129, pp. 88–100. Springer, Heidelberg (2010). https://doi.org/10.1007/978-3-642-13509-5_9
Bremler-Barr, A., Hay, D., Koral, Y.: CompactDFA: scalable pattern matching using longest prefix match solutions. IEEE/ACM Trans. Netw. 22(2), 415–428 (2014)
Breslauer, D., Italiano, G.F.: Near real-time suffix tree construction via the fringe marked ancestor problem. J. Discrete Algorithms 18, 32–48 (2013)
Clifford, R., Fontaine, A., Porat, E., Sach, B., Starikovskaya, T.: Dictionary matching in a stream. In: Bansal, N., Finocchi, I. (eds.) ESA 2015. LNCS, vol. 9294, pp. 361–372. Springer, Heidelberg (2015). https://doi.org/10.1007/978-3-662-48350-3_31
Cole, R., Hariharan, R.: Dynamic LCA queries on trees. SIAM J. Comput. 34(4), 894–923 (2005)
Feigenblat, G., Porat, E., Shiftan, A.: Linear time succinct indexable dictionary construction with applications. In: 2016 Data Compression Conference, DCC, pp. 13–22 (2016)
Feigenblat, G., Porat, E., Shiftan, A.: A grouping approach for succinct dynamic dictionary matching. Algorithmica 77(1), 134–150 (2017)
Fischer, J., Gagie, T., Gawrychowski, P., Kociumaka, T.: Approximating LZ77 via small-space multiple-pattern matching. In: Bansal, N., Finocchi, I. (eds.) ESA 2015. LNCS, vol. 9294, pp. 533–544. Springer, Heidelberg (2015). https://doi.org/10.1007/978-3-662-48350-3_45
Fredman, M.L., Komlós, J., Szemerédi, E.: Storing a sparse table with \(O(1)\) worst case access time. J. ACM 31(3), 538–544 (1984)
Ganguly, A., Hon, W., Sadakane, K., Shah, R., Thankachan, S.V., Yang, Y.: Space-efficient dictionaries for parameterized and order-preserving pattern matching. In: 27th Annual Symposium on Combinatorial Pattern Matching, CPM, LIPIcs, vol. 54, pp. 2:1–2:12 (2016)
Ganguly, A., Hon, W., Shah, R.: A framework for dynamic parameterized dictionary matching. In: 15th Scandinavian Symposium and Workshops on Algorithm Theory, SWAT, LIPIcs, vol. 53, pp. 10:1–10:14 (2016)
Golan, S., Kopelowitz, T., Porat, E.: Towards optimal approximate streaming pattern matching by matching multiple patterns in multiple streams. In: 45th International Colloquium on Automata, Languages, and Programming, ICALP, LIPIcs, vol. 107, pp. 65:1–65:16 (2018)
Golan, S., Porat, E.: Real-time streaming multi-pattern search for constant alphabet. In: 25th Annual European Symposium on Algorithms, ESA, LIPIcs, vol. 87, pp. 41:1–41:15 (2017)
Kopelowitz, T.: On-line indexing for general alphabets via predecessor queries on subsets of an ordered list. In: 53rd Annual IEEE Symposium on Foundations of Computer Science, FOCS, pp. 283–292 (2012)
Kopelowitz, T., Porat, E., Rozen, Y.: Succinct online dictionary matching with improved worst-case guarantees. In: 27th Annual Symposium on Combinatorial Pattern Matching, CPM, LIPIcs, vol. 54, pp. 6:1–6:13 (2016)
Sahinalp, S.C., Vishkin, U.: Efficient approximate and dynamic matching of patterns using a labeling paradigm. In: 37th Annual Symposium on Foundations of Computer Science, FOCS, pp. 320–328 (1996)
Tan, L., Sherwood, T.: A high throughput string matching architecture for intrusion detection and prevention. In: 32nd International Symposium on Computer Architecture, ISCA, pp. 112–122 (2005)
Tuck, N., Sherwood, T., Calder, B., Varghese, G.: Deterministic memory-efficient string matching algorithms for intrusion detection. In: 23rd IEEE International Conference on Computer Communications, INFOCOM, pp. 2628–2639 (2004)
© 2019 Springer Nature Switzerland AG
About this paper
Cite this paper
Golan, S., Kociumaka, T., Kopelowitz, T., Porat, E. (2019). Dynamic Dictionary Matching in the Online Model. In: Friggstad, Z., Sack, J.-R., Salavatipour, M. (eds) Algorithms and Data Structures. WADS 2019. Lecture Notes in Computer Science, vol 11646. Springer, Cham. https://doi.org/10.1007/978-3-030-24766-9_30
DOI: https://doi.org/10.1007/978-3-030-24766-9_30
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-24765-2
Online ISBN: 978-3-030-24766-9
eBook Packages: Computer Science, Computer Science (R0)