Skip to main content

Speeding-Up the Dynamic Programming Procedure for the Edit Distance of Two Strings

  • Conference paper
  • First Online:
Database and Expert Systems Applications (DEXA 2019)

Part of the book series: Communications in Computer and Information Science ((CCIS,volume 1062))

Included in the following conference series:

  • 569 Accesses

Abstract

We describe a way to compute the edit distance of two strings without having to fill the whole dynamic programming (DP) matrix, through a sequence of increasing guesses on the edit distance. If the strings share a certain degree of similarity, the edit distance can be quite smaller than the value of non-optimal solutions, and a large fraction (up to 80–90%) of the DP matrix cells do not need to be computed. Including the method’s overhead, this translates into a speedup factor from \(3\times \) up to \(30\times \) in the time needed to find the optimal solution for strings of length about 20,000.

This is a preview of subscription content, log in via an institution to check access.

Access this chapter

Chapter
USD 29.95
Price excludes VAT (USA)
  • Available as PDF
  • Read on any device
  • Instant download
  • Own it forever
eBook
USD 39.99
Price excludes VAT (USA)
  • Available as EPUB and PDF
  • Read on any device
  • Instant download
  • Own it forever
Softcover Book
USD 54.99
Price excludes VAT (USA)
  • Compact, lightweight edition
  • Dispatched in 3 to 5 business days
  • Free shipping worldwide - see info

Tax calculation will be finalised at checkout

Purchases are for personal use only

Institutional subscriptions

References

  1. Backurs, A., Indyk, P.: Edit distance cannot be computed in strongly subquadratic time (unless SETH is false). SIAM J. Comput. 47(3), 1087–1097 (2018)

    Article  MathSciNet  Google Scholar 

  2. Gotoh, O.: An improved algorithm for matching biological sequences. J. Mol. Biol. 162(3), 705–708 (1982)

    Article  Google Scholar 

  3. Gusfield, D.: Algorithms on Strings, Trees, and Sequences: Computer Science and Computational Biology, 534 p. Cambridge University Press (1997)

    Google Scholar 

  4. Jones, N.C., Pevzner, P.A.: An Introduction to Bioinformatics Algorithms, 456 p. MIT Press (2004)

    Google Scholar 

  5. Levenshtein, V.I.: Binary codes capable of correcting deletions, insertions, and reversals. Doklady Akademii Nauk SSSR 163(4), 845–848 (1965)

    MathSciNet  MATH  Google Scholar 

  6. Needleman, S.B., Wunsch, C.D.: A general method applicable to the search for similarities in the amino acid sequence of two proteins. J. Mol. Biol. 48(3), 443–453 (1970)

    Article  Google Scholar 

Download references

Author information

Authors and Affiliations

Authors

Corresponding author

Correspondence to Giuseppe Lancia .

Editor information

Editors and Affiliations

Rights and permissions

Reprints and permissions

Copyright information

© 2019 Springer Nature Switzerland AG

About this paper

Check for updates. Verify currency and authenticity via CrossMark

Cite this paper

Lancia, G., Dalpasso, M. (2019). Speeding-Up the Dynamic Programming Procedure for the Edit Distance of Two Strings. In: Anderst-Kotsis, G., et al. Database and Expert Systems Applications. DEXA 2019. Communications in Computer and Information Science, vol 1062. Springer, Cham. https://doi.org/10.1007/978-3-030-27684-3_9

Download citation

  • DOI: https://doi.org/10.1007/978-3-030-27684-3_9

  • Published:

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-030-27683-6

  • Online ISBN: 978-3-030-27684-3

  • eBook Packages: Computer ScienceComputer Science (R0)

Publish with us

Policies and ethics