iGAPK: Improved GAPK Algorithm for Regulatory DNA Motif Discovery

  • Conference paper
Neural Information Processing. Models and Applications (ICONIP 2010)

Part of the book series: Lecture Notes in Computer Science (LNTCS, volume 6444)

Abstract

Computational DNA motif discovery is a major research area in bioinformatics that helps in understanding the mechanisms of gene regulation. We recently developed a genetic algorithm (GA)-based motif discovery algorithm, named GAPK, which uses some identified transcription factor binding sites extracted from orthologs in algorithm development. Within the GAPK framework, technical improvements to background filtering, evolutionary computation, or model refinement can contribute to better performance. This paper aims to improve the GAPK framework by introducing a new fitness function, termed the relative model mismatch score (RMMS), which characterizes the conservation and rareness properties of DNA motifs simultaneously. Other technical contributions include a rule-based system for filtering background data and a “most one-in-out” (MOIO) strategy for motif model refinement. Comparative studies are carried out on eight benchmark datasets against the original GAPK and two other GA-based motif discovery algorithms, GAME and GALF-P. The results show that the improved GAPK method outperforms the others on the test datasets.
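The abstract does not give the RMMS formula, so the following is only a minimal sketch of the general idea of scoring conservation and rareness together, not the authors' definition. It assumes a simple frequency model, a per-position mismatch of one minus the frequency of the observed base, and a ratio of within-motif mismatch to background mismatch; the function names `position_frequencies`, `mismatch`, `kmers` and `relative_mismatch` are hypothetical.

```python
from typing import Dict, List


def position_frequencies(sites: List[str]) -> List[Dict[str, float]]:
    """Column-wise base frequencies of equal-length binding sites."""
    length = len(sites[0])
    freqs = []
    for i in range(length):
        column = [site[i] for site in sites]
        freqs.append({base: column.count(base) / len(sites) for base in "ACGT"})
    return freqs


def mismatch(freqs: List[Dict[str, float]], kmer: str) -> float:
    """Average per-position mismatch: 1 minus the model frequency of the observed base."""
    return sum(1.0 - f[base] for f, base in zip(freqs, kmer)) / len(kmer)


def kmers(sequence: str, k: int) -> List[str]:
    """All overlapping windows of length k in a sequence."""
    return [sequence[i:i + k] for i in range(len(sequence) - k + 1)]


def relative_mismatch(sites: List[str], background: List[str]) -> float:
    """Illustrative score (an assumption, not the paper's RMMS).

    Numerator: mean mismatch of the model against its own sites (conservation).
    Denominator: mean mismatch of the model against background k-mers (rareness).
    Lower values mean a motif that is both conserved and rare in the background.
    """
    freqs = position_frequencies(sites)
    k = len(sites[0])
    within = sum(mismatch(freqs, s) for s in sites) / len(sites)
    windows = [w for seq in background for w in kmers(seq, k)]
    against = sum(mismatch(freqs, w) for w in windows) / len(windows)
    if against == 0.0:
        raise ValueError("model matches every background k-mer exactly")
    return within / against


if __name__ == "__main__":
    background = ["GCGCGCGCGC"]
    conserved = relative_mismatch(["TATAAT", "TATAAT", "TATGAT"], background)
    degenerate = relative_mismatch(["TATAAT", "GCGCGC", "TATGAT"], background)
    # The conserved, background-rare site set should receive the lower score.
    print(conserved < degenerate)
```

In a GA setting, a score of this kind could serve as the fitness to minimize over candidate site sets; how GAPK or iGAPK actually encodes candidates and applies the fitness is not stated in the abstract.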

Copyright information

© 2010 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Wang, D., Li, X. (2010). iGAPK: Improved GAPK Algorithm for Regulatory DNA Motif Discovery. In: Wong, K.W., Mendis, B.S.U., Bouzerdoum, A. (eds) Neural Information Processing. Models and Applications. ICONIP 2010. Lecture Notes in Computer Science, vol 6444. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-17534-3_27

  • DOI: https://doi.org/10.1007/978-3-642-17534-3_27

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-642-17533-6

  • Online ISBN: 978-3-642-17534-3

  • eBook Packages: Computer Science, Computer Science (R0)
