Overlap digraph: An effective model for finding good spaced seeds for biological sequence local alignment

  • Article
  • Bioinformatics
Chinese Science Bulletin

Abstract

Spaced seeds, introduced by PatternHunter, have been shown to be more sensitive and faster than contiguous seeds, and they are now widely used for local alignment of biological sequences. However, finding optimal spaced seeds is an NP-hard problem. A seed digraph model is proposed that finds good spaced seeds, very close to optimal, by a different but effective route. With this approach, good long spaced seeds can be found that the usual optimal-sensitivity formulas cannot compute because of their exponential complexity.
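To make the terms in the abstract concrete, the following is a minimal Python sketch of what a spaced seed "hit" is and of a naive, exact sensitivity computation. It is not the seed digraph model proposed in the article, which is not described on this page. The function names `seed_hits` and `sensitivity_bruteforce`, the `'1'`/`'*'` seed notation, and the i.i.d. match probability `p` are assumptions chosen for illustration.

```python
from itertools import product


def seed_hits(seed: str, region: str) -> bool:
    """Return True if the seed matches the region at some offset.

    seed:   '1' marks a position that must be a match, '*' is a don't-care.
    region: '1' marks a match column and '0' a mismatch column of an
            ungapped alignment (an illustrative encoding, assumed here).
    """
    required = [i for i, c in enumerate(seed) if c == "1"]
    for start in range(len(region) - len(seed) + 1):
        if all(region[start + i] == "1" for i in required):
            return True
    return False


def sensitivity_bruteforce(seed: str, n: int, p: float) -> float:
    """Probability that the seed hits a random region of length n.

    Each column is assumed to be a match independently with probability p.
    All 2**n regions are enumerated, so the cost grows exponentially with n;
    this is only usable for very short regions.
    """
    total = 0.0
    for bits in product("01", repeat=n):
        region = "".join(bits)
        if seed_hits(seed, region):
            k = region.count("1")
            total += p**k * (1 - p) ** (n - k)
    return total
```

For example, `seed_hits("1*1", "0101")` is true because the seed fits at offset 1, while the contiguous seed `"111"` does not hit the same region. The enumeration over all `2**n` regions in `sensitivity_bruteforce` is exponential only because of this naive formulation; it illustrates why exact sensitivity becomes expensive for long seeds and why heuristic models for finding good seeds are of interest.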

Author information

Corresponding author

Correspondence to Ke Chen.

Additional information

This article is published with open access at Springerlink.com

About this article

Cite this article

Chen, K., She, K. & Zhu, Q. Overlap digraph: An effective model for finding good spaced seeds for biological sequence local alignment. Chin. Sci. Bull. 56, 1100–1107 (2011). https://doi.org/10.1007/s11434-010-4161-9

  • DOI: https://doi.org/10.1007/s11434-010-4161-9
