Controlling Size When Aligning Multiple Genomic Sequences with Duplications

  • Conference paper
Algorithms in Bioinformatics (WABI 2006)

Part of the book series: Lecture Notes in Computer Science (LNBI, volume 4175)

Abstract

For a genomic region containing a tandem gene cluster, a proper set of alignments needs to align only orthologous segments, i.e., those separated by a speciation event. Otherwise, methods for finding regions under evolutionary selection will not perform properly. Conversely, the alignments should indicate every orthologous pair of genes or genomic segments. Attaining this goal in practice requires a technique for avoiding a combinatorial explosion in the number of local alignments. To better understand this process, we model it as a graph problem of finding a minimum cardinality set of cliques that contain all edges. We provide an upper bound for an important class of graphs (the problem is NP-hard and very difficult to approximate in the general case), and use the bound and computer simulations to evaluate two heuristic solutions. An implementation of one of them is evaluated on mammalian sequences from the α-globin gene cluster.
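To make the graph formulation concrete, the following is a minimal sketch of a greedy edge clique cover in Python. It assumes one plausible reading of the abstract: vertices stand for genomic segments, edges for orthologous pairs, and each clique for a group of segments that could share one alignment. The function name `greedy_edge_clique_cover` and the greedy growth rule are illustrative assumptions, not either of the two heuristics evaluated in the paper, and the result is not guaranteed to be of minimum cardinality.

```python
from itertools import combinations


def greedy_edge_clique_cover(vertices, edges):
    """Return a list of cliques (sets of vertices) that together contain every edge.

    Illustrative greedy heuristic only; assumes an undirected simple graph
    (no self-loops) whose vertex labels can be sorted.
    """
    adj = {v: set() for v in vertices}
    for u, v in edges:
        adj[u].add(v)
        adj[v].add(u)

    uncovered = {frozenset(e) for e in edges}
    cover = []
    while uncovered:
        # Seed a new clique with the smallest uncovered edge (deterministic choice).
        u, v = sorted(min(uncovered, key=sorted))
        clique = {u, v}
        # Candidates are the vertices adjacent to every current clique member.
        candidates = adj[u] & adj[v]
        while candidates:
            # Prefer the candidate whose edges into the clique are mostly uncovered.
            w = max(
                sorted(candidates),
                key=lambda c: sum(frozenset((c, x)) in uncovered for x in clique),
            )
            clique.add(w)
            candidates &= adj[w]
        cover.append(clique)
        uncovered -= {frozenset(p) for p in combinations(clique, 2)}
    return cover


# Hypothetical example: segments A, B, C are pairwise orthologous; D is orthologous to C only.
example_edges = [("A", "B"), ("B", "C"), ("A", "C"), ("C", "D")]
example_cover = greedy_edge_clique_cover(["A", "B", "C", "D"], example_edges)
print(example_cover)
```

Each loop iteration covers at least the seed edge, so the procedure terminates after at most one clique per edge. A practical aligner would need a better selection rule, since the abstract notes that the general problem is NP-hard and very difficult to approximate.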

Copyright information

© 2006 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Hou, M., Berman, P., Zhang, L., Miller, W. (2006). Controlling Size When Aligning Multiple Genomic Sequences with Duplications. In: Bücher, P., Moret, B.M.E. (eds) Algorithms in Bioinformatics. WABI 2006. Lecture Notes in Computer Science, vol 4175. Springer, Berlin, Heidelberg. https://doi.org/10.1007/11851561_13

  • DOI: https://doi.org/10.1007/11851561_13

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-39583-6

  • Online ISBN: 978-3-540-39584-3

  • eBook Packages: Computer Science, Computer Science (R0)
