Abstract
A parallel Divide-and-Conquer Alignment procedure (DCA) for multiple sequence alignment is presented. DCA speeds up alignment by applying the divide-and-conquer paradigm, which is well suited to large-scale problems on multi-core computers: it divides a large alignment problem into smaller, more tractable sub-problems, each of which can be solved by existing algorithms. We assess the execution time and accuracy of our implementation of DCA on an 8-core computer using the classical benchmarks BAliBASE, PREFAB, IRMBase and OXBENCH, together with twenty-eight artificially generated test sets. DCA achieves up to a 111-fold improvement in execution time with comparable accuracy.
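The divide, solve and combine pattern described above can be illustrated with a minimal sketch. The abstract does not say how DCA partitions the input or how it recombines the partial alignments, so everything here is an assumption made for illustration: the function names are hypothetical, the divide step simply cuts the input into contiguous groups, `toy_align` is a stand-in for an existing aligner (it only pads with gaps and is not a real alignment algorithm), and the combine step only pads the blocks to a common width. Threads keep the example short; a CPU-bound implementation would more likely use processes.

```python
from concurrent.futures import ThreadPoolExecutor

GAP = "-"

def split_into_groups(seqs, n_groups):
    """Divide step (assumed rule): cut the input into contiguous groups."""
    n_groups = max(1, min(n_groups, len(seqs)))
    size = -(-len(seqs) // n_groups)  # ceiling division
    return [seqs[i:i + size] for i in range(0, len(seqs), size)]

def toy_align(group):
    """Stand-in for an existing aligner: right-pad every sequence with gaps.

    A real sub-problem solver would be an actual alignment program;
    this placeholder only makes the rows of one group equally long.
    """
    width = max(len(s) for s in group)
    return [s + GAP * (width - len(s)) for s in group]

def merge_alignments(blocks):
    """Combine step (assumed rule): pad all blocks to one width and stack them."""
    width = max(len(row) for block in blocks for row in block)
    return [row + GAP * (width - len(row)) for block in blocks for row in block]

def dca_sketch(seqs, n_groups=4, workers=4):
    """Divide the input, solve the sub-problems concurrently, then combine."""
    if not seqs:
        return []
    groups = split_into_groups(seqs, n_groups)
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # pool.map preserves the order of the groups in its results
        blocks = list(pool.map(toy_align, groups))
    return merge_alignments(blocks)
```

Calling `dca_sketch(["ACGT", "AC", "ACGTAA"], n_groups=2)` returns three rows of equal length in the original order. The point of the sketch is only the control flow: the sub-problems are independent, so they can be handed to separate workers, and the per-group solver can be swapped for any existing alignment method.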
Copyright information
© 2014 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Zhu, X. (2014). A Divide-and-Conquer Method for Multiple Sequence Alignment on Multi-core Computers. In: Li, K., Xiao, Z., Wang, Y., Du, J., Li, K. (eds) Parallel Computational Fluid Dynamics. ParCFD 2013. Communications in Computer and Information Science, vol 405. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-53962-6_41
DOI: https://doi.org/10.1007/978-3-642-53962-6_41
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-53961-9
Online ISBN: 978-3-642-53962-6
eBook Packages: Computer Science (R0)