Abstract
Understanding recombination is a central problem in population genetics. In this paper, we address an established problem in computational biology: computing lower bounds on the minimum number of historical recombinations needed to generate a set of sequences [11,13,9,1,2,15]. In particular, we propose a new recombination lower bound: the forest bound. We show that the forest bound can be formulated as the minimum perfect phylogenetic forest problem, a natural extension of the classic binary perfect phylogeny problem, which may be of interest in its own right. We then show that the forest bound is provably higher than the optimal haplotype bound [13], a very good lower bound in practice [15]. We prove that, like several other lower bounds [2], computing the forest bound is NP-hard. Finally, we describe an integer linear programming (ILP) formulation that computes the forest bound exactly for a certain range of data. Simulation results show that the forest bound may be useful in computing lower bounds for low-quality data.
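For orientation, the haplotype bound of [13] that the forest bound is compared against can be sketched in a few lines. The sketch below is illustrative only and is not the method of this paper: it assumes the commonly stated form of the haplotype bound (number of distinct rows, minus number of distinct segregating columns, minus one) and takes the optimal haplotype bound to be the maximum of that quantity over all subsets of sites, found here by brute force. The function names and the string encoding of sequences are assumptions made for the example.

```python
from itertools import combinations


def haplotype_bound(rows, sites):
    """Haplotype bound restricted to the given sites (assumed form: K - S - 1)."""
    # Project every binary sequence onto the chosen sites.
    projected = [tuple(r[j] for j in sites) for r in rows]
    # Collect distinct columns, then keep only segregating ones
    # (columns in which both states occur).
    columns = {tuple(p[i] for p in projected) for i in range(len(sites))}
    columns = {c for c in columns if len(set(c)) > 1}
    distinct_rows = len(set(projected))
    # A lower bound on a count is never negative.
    return max(0, distinct_rows - len(columns) - 1)


def optimal_haplotype_bound(rows):
    """Maximum haplotype bound over all non-empty subsets of sites.

    Brute force, exponential in the number of sites: small inputs only.
    """
    n_sites = len(rows[0])
    best = 0
    for size in range(1, n_sites + 1):
        for sites in combinations(range(n_sites), size):
            best = max(best, haplotype_bound(rows, sites))
    return best


# Hypothetical toy data: four sequences over three binary sites.
sequences = ["000", "011", "101", "110"]
all_sites_bound = haplotype_bound(sequences, range(3))
best_subset_bound = optimal_haplotype_bound(sequences)
```

On this toy input, using all three sites gives a bound of 0, while restricting to the first two sites exposes all four gametes and gives a bound of 1, which is why the maximum is taken over subsets of sites.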
References
1. Bafna, V., Bansal, V.: The number of recombination events in a sample history: conflict graph and lower bounds. IEEE/ACM Trans. on Computational Biology and Bioinformatics 1, 78–90 (2004)
2. Bafna, V., Bansal, V.: Inference about Recombination from Haplotype Data: Lower Bounds and Recombination Hotspots. J. of Comp. Bio. 13, 501–521 (2006)
3. Bordewich, M., Semple, C.: On the computational complexity of the rooted subtree prune and regraft distance. Annals of Combinatorics 8, 409–423 (2004)
4. Foulds, L.R., Graham, R.L.: The Steiner Tree in Phylogeny is NP-complete. Advances in Applied Math. 3 (1982)
5. Garey, M., Johnson, D.: Computers and Intractability. Freeman (1979)
6. Griffiths, R.C., Marjoram, P.: Ancestral inference from samples of DNA sequences with recombination. J. of Comp. Bio. 3, 479–502 (1996)
7. Gusfield, D.: Efficient algorithms for inferring evolutionary history. Networks 21, 19–28 (1991)
8. Gusfield, D., Eddhu, S., Langley, C.: Optimal, efficient reconstruction of phylogenetic networks with constrained recombination. J. Bioinformatics and Computational Biology 2, 173–213 (2004)
9. Gusfield, D., Hickerson, D., Eddhu, S.: An Efficiently-Computed Lower Bound on the Number of Recombinations in Phylogenetic Networks: Theory and Empirical Study. Discrete Applied Math. 155, 806–830 (2007)
10. Hudson, R.: Generating Samples under the Wright-Fisher neutral model of genetic variation. Bioinformatics 18(2), 337–338 (2002)
11. Hudson, R., Kaplan, N.: Statistical properties of the number of recombination events in the history of a sample of DNA sequences. Genetics 111, 147–164 (1985)
12. Myers, S.: The detection of recombination events using DNA sequence data. PhD dissertation, Dept. of Statistics, University of Oxford, Oxford, England (2003)
13. Myers, S.R., Griffiths, R.C.: Bounds on the minimum number of recombination events in a sample history. Genetics 163, 375–394 (2003)
14. Song, Y.S., Ding, Z., Gusfield, D., Langley, C., Wu, Y.: Algorithms to distinguish the role of gene-conversion from single-crossover recombination in the derivations of SNP sequences in populations. In: Apostolico, A., Guerra, C., Istrail, S., Pevzner, P., Waterman, M. (eds.) RECOMB 2006. LNCS (LNBI), vol. 3909, Springer, Heidelberg (2006)
15. Song, Y.S., Wu, Y., Gusfield, D.: Efficient computation of close lower and upper bounds on the minimum number of needed recombinations in the evolution of biological sequences. Bioinformatics 421, i413–i422 (2005). Proceedings of ISMB 2005
16. Wang, L., Zhang, K., Zhang, L.: Perfect Phylogenetic Networks with Recombination. J. of Comp. Bio. 8, 69–78 (2001)
© 2007 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Wu, Y., Gusfield, D. (2007). A New Recombination Lower Bound and the Minimum Perfect Phylogenetic Forest Problem. In: Lin, G. (eds) Computing and Combinatorics. COCOON 2007. Lecture Notes in Computer Science, vol 4598. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-73545-8_5
DOI: https://doi.org/10.1007/978-3-540-73545-8_5
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-73544-1
Online ISBN: 978-3-540-73545-8
eBook Packages: Computer Science, Computer Science (R0)