A Divide-and-Conquer Method for Multiple Sequence Alignment on Multi-core Computers

Zhu, Xiangyuan

doi:10.1007/978-3-642-53962-6_41

Part of the book series: Communications in Computer and Information Science (CCIS, volume 405)

Included in the following conference series:

International Conference on Parallel Computing in Fluid Dynamics

Abstract

A parallel Divide-and-Conquer Alignment procedure (DCA) for multiple sequence alignment is presented. DCA improves alignment speed by applying the divide-and-conquer paradigm, which is well suited to large-scale processing problems on multi-core computers. It divides a large-scale alignment problem into smaller, more tractable sub-problems that can be solved by existing algorithms. We assess the execution time and accuracy of our implementation of DCA on an 8-core computer using the classical benchmarks BAliBASE, PREFAB, IRMBase and OXBENCH, together with twenty-eight artificially generated test sets. DCA achieves up to a 111-fold improvement in execution time with comparable accuracy.
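
The divide, solve and combine pattern described in the abstract can be illustrated with a minimal Python sketch. It is not the paper's implementation: the abstract does not say how DCA chooses its cut positions or which aligner it calls, so the proportional cut in `split_sequences`, the gap-padding stand-in `pad_align`, and the name `dca_sketch` are all assumptions made for illustration only.

```python
from concurrent.futures import ThreadPoolExecutor


def pad_align(block):
    """Placeholder aligner (hypothetical): right-pad each segment with gaps.

    A real pipeline would call an existing alignment tool here instead.
    """
    width = max(len(s) for s in block)
    return [s.ljust(width, "-") for s in block]


def split_sequences(seqs, parts):
    """Cut every sequence into `parts` segments at proportional positions.

    Assumption: proportional cuts are the simplest possible choice; the
    cut strategy used by the paper is not stated in the abstract.
    """
    return [
        [s[len(s) * k // parts: len(s) * (k + 1) // parts] for s in seqs]
        for k in range(parts)
    ]


def dca_sketch(seqs, parts=2, workers=2, align=pad_align):
    """Divide the sequences, align each block independently, concatenate."""
    blocks = split_sequences(seqs, parts)
    # Threads keep the sketch portable; a CPU-bound aligner would more
    # likely use ProcessPoolExecutor so that several cores are occupied.
    with ThreadPoolExecutor(max_workers=workers) as pool:
        aligned = list(pool.map(align, blocks))
    # Row i of the result is the concatenation of row i from every block.
    return ["".join(rows) for rows in zip(*aligned)]


if __name__ == "__main__":
    for row in dca_sketch(["ACGT", "ACG", "AT"]):
        print(row)
```

Because each block is aligned independently, the blocks can be handed to separate workers, which is where a multi-core machine would gain its speed-up. The quality of the final alignment then depends on where the cuts fall, and this toy example makes no attempt to optimise that.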

Author information

Authors and Affiliations

School of Computer Science, Zhaoqing University, Zhaoqing, 526061, China
Xiangyuan Zhu

Editor information

Editors and Affiliations

College of Information Science and Engineering, Hunan University, 410082, Changsha, China
Kenli Li
College of Information Science and Engineering, Hunan University, #2, South Lushan Road, Yuelu District, 410082, Changsha, China
Zheng Xiao & Jiayi Du
College of Information Science and Engineering, Northeastern University, 110004, Shenyang, China
Yan Wang
Hunan University, State University of New York at New Paltz, 12561, New Paltz, NY, USA
Keqin Li

Cite this paper

Zhu, X. (2014). A Divide-and-Conquer Method for Multiple Sequence Alignment on Multi-core Computers. In: Li, K., Xiao, Z., Wang, Y., Du, J., Li, K. (eds) Parallel Computational Fluid Dynamics. ParCFD 2013. Communications in Computer and Information Science, vol 405. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-53962-6_41

DOI: https://doi.org/10.1007/978-3-642-53962-6_41
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-53961-9
Online ISBN: 978-3-642-53962-6
eBook Packages: Computer Science, Computer Science (R0)