Real-Time String Filtering of Large Databases Implemented Via a Combination of Artificial Neural Networks

  • Conference paper
Adaptive and Natural Computing Algorithms (ICANNGA 2007)

Part of the book series: Lecture Notes in Computer Science (LNTCS, volume 4432)

Abstract

A novel approach to real-time string filtering of large databases is presented. The approach combines two artificial neural networks and operates in two stages. The first stage employs a self-organizing map (SOM) to perform approximate string matching, retrieving those strings of the database that are similar to the query string (i.e., assigned to the same SOM node). The second stage employs a harmony theory network to compare the retrieved strings with the query string in parallel and determine whether an exact match exists. The experimental results demonstrate accurate, fast and database-size-independent string filtering that is robust to database modifications. The approach is put forward for general-purpose applications (directory, catalogue and glossary search) and Internet applications (e-mail blocking, intrusion detection systems, URL and username classification).
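
The two-stage idea can be illustrated with a minimal Python sketch. It is not the paper's implementation: the letter-frequency encoding, the tiny one-dimensional SOM, and every function name (`encode`, `train_som`, `build_index`, `contains`) are assumptions made for illustration, and a plain equality check over the candidate bucket stands in for the harmony theory network of the second stage.

```python
import random

ALPHABET = "abcdefghijklmnopqrstuvwxyz"

def encode(s):
    """Hypothetical encoding (an assumption): normalized letter frequencies."""
    counts = [0.0] * len(ALPHABET)
    for ch in s.lower():
        idx = ALPHABET.find(ch)
        if idx >= 0:
            counts[idx] += 1.0
    total = sum(counts) or 1.0
    return [c / total for c in counts]

def sq_dist(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))

def best_node(weights, vec):
    """Index of the SOM node whose weight vector is closest to vec."""
    return min(range(len(weights)), key=lambda i: sq_dist(weights[i], vec))

def train_som(vectors, n_nodes=4, epochs=20, seed=0):
    """Train a small 1-D SOM with a decaying learning rate and radius."""
    rng = random.Random(seed)
    weights = [list(rng.choice(vectors)) for _ in range(n_nodes)]
    for epoch in range(epochs):
        frac = 1.0 - epoch / epochs
        lr = 0.5 * frac
        radius = int(n_nodes / 2 * frac)
        for vec in vectors:
            win = best_node(weights, vec)
            lo, hi = max(0, win - radius), min(n_nodes, win + radius + 1)
            for i in range(lo, hi):
                # Neighbours of the winner move less than the winner itself.
                influence = lr / (1 + abs(i - win))
                weights[i] = [w + influence * (v - w)
                              for w, v in zip(weights[i], vec)]
    return weights

def build_index(strings, weights):
    """Stage 1 (offline): assign every database string to its SOM node."""
    buckets = {}
    for s in strings:
        buckets.setdefault(best_node(weights, encode(s)), []).append(s)
    return buckets

def contains(query, weights, buckets):
    """Stage 1: fetch the query's bucket. Stage 2: exact comparison.

    The equality test is a stand-in for the harmony theory network.
    """
    candidates = buckets.get(best_node(weights, encode(query)), [])
    return any(c == query for c in candidates)

db = ["alice", "bob", "carol", "dave", "erin", "frank", "grace", "heidi"]
weights = train_som([encode(s) for s in db])
buckets = build_index(db, weights)
print(contains("carol", weights, buckets))    # a string stored in db
print(contains("mallory", weights, buckets))  # a string absent from db
```

Because the encoding and node assignment are deterministic, a stored string always lands in the bucket it was indexed under, so only that bucket needs to be compared against the query rather than the whole database.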

Editor information

Bartlomiej Beliczynski, Andrzej Dzielinski, Marcin Iwanowski, Bernardete Ribeiro

Copyright information

© 2007 Springer Berlin Heidelberg

About this paper

Cite this paper

Tambouratzis, T. (2007). Real-Time String Filtering of Large Databases Implemented Via a Combination of Artificial Neural Networks. In: Beliczynski, B., Dzielinski, A., Iwanowski, M., Ribeiro, B. (eds) Adaptive and Natural Computing Algorithms. ICANNGA 2007. Lecture Notes in Computer Science, vol 4432. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-71629-7_20

  • DOI: https://doi.org/10.1007/978-3-540-71629-7_20

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-71590-0

  • Online ISBN: 978-3-540-71629-7

  • eBook Packages: Computer Science, Computer Science (R0)
