Machine Learning in Computational Biology

  • Reference work entry
Encyclopedia of Database Systems

Synonyms

Data mining in computational biology; Data mining in bioinformatics; Machine learning in bioinformatics; Machine learning in systems biology; Data mining in systems biology

Definition

Advances in high-throughput sequencing and “omics” technologies, and the resulting exponential growth in macromolecular sequence, structure, and gene expression data, have transformed biology from a data-poor science into an increasingly data-rich one. Despite these advances, biology today remains a largely descriptive science, much as physics was before Newton and Leibniz. Machine learning [6] currently offers some of the most cost-effective tools for building predictive models from biological data, e.g., for annotating new genomic sequences, predicting macromolecular function, identifying functionally important sites in proteins, identifying genetic markers of diseases, and discovering the networks of genetic interactions that...

Recommended Reading

  1. Andorf C., Dobbs D., and Honavar V. Exploring inconsistencies in genome-wide protein function annotations: a machine learning approach. BMC Bioinform., 8:284, 2007.

  2. Ashburner M., Ball C.A., Blake J.A., Botstein D., Butler H., Cherry J.M., Davis A.P., Dolinski K., Dwight S.S., Eppig J.T., Harris M.A., Hill D.P., Issel-Tarver L., Kasarskis A., Lewis S., Matese J.C., Richardson J.E., Ringwald M., Rubin G.M., and Sherlock G. Gene ontology: tool for the unification of biology. Nat. Genet., 25:25–29, 2000.

  3. Baldi P. and Brunak S. Bioinformatics: the machine learning approach. MIT, Cambridge, MA, 2001.

  4. Benson D.A., Karsch-Mizrachi I., Lipman D.J., Ostell J., and Wheeler D.L. GenBank. Nucleic Acids Res., 35(Database issue):D21–D25, 2007.

  5. Berman H.M., Westbrook J., Feng Z., Gilliland G., Bhat T.N., Weissig H., Shindyalov I.N., and Bourne P.E. The protein data bank. Nucleic Acids Res., 28:235–242, 2000.

  6. Bishop C.M. Pattern Recognition and Machine Learning. Springer, Berlin, 2006.

  7. Boutell M.R., Luo J., Shen X., and Brown C.M. Learning multi-label scene classification. Pattern Recogn., 37:1757–1771, 2004.

  8. Bruggeman F.J. and Westerhoff H.V. The nature of systems biology. Trends Microbiol., 15:45–50, 2007.

  9. Caragea C., Sinapov J., Dobbs D., and Honavar V. Assessing the performance of macromolecular sequence classifiers. In Proc. IEEE 7th Int. Symp. on Bioinformatics and Bioengineering, 2007, pp. 320–326.

  10. de Jong H. Modeling and simulation of genetic regulatory systems: a literature review. J. Comput. Biol., 9:67–103, 2002.

  11. Dietterich T.G. Ensemble methods in machine learning. In Proc. 1st Int. Workshop on Multiple Classifier Systems, Springer, Berlin, 2000, pp. 1–15.

  12. Dietterich T.G. Machine learning for sequential data: a review. In Proc. Joint IAPR International Workshop on Structural, Syntactic, and Statistical Pattern Recognition, 2002, pp. 15–30.

  13. El-Manzalawy Y., Dobbs D., and Honavar V. On evaluating MHC-II binding peptide prediction methods. PLoS One, 3(9):e3268, 2008.

  14. El-Manzalawy Y., Dobbs D., and Honavar V. Predicting linear B-cell epitopes using string kernels. J. Mol. Recognit., 21:243–255, 2008.

  15. Friedman N., Linial M., Nachman I., and Pe’er D. Using Bayesian networks to analyze expression data. J. Comput. Biol., 7:601–620, 2000.

  16. Galperin M.Y. The molecular biology database collection: 2008 update. Nucleic Acids Res., 36:D2–D4, 2008.

  17. Guyon I. and Elisseeff A. An introduction to variable and feature selection. J. Mach. Learn. Res., 3:1157–1182, 2003.

  18. Hecker L., Alcon T., Honavar V., and Greenlee H. Querying multiple large-scale gene expression datasets from the developing retina using a seed network to prioritize experimental targets. Bioinform. Biol. Insights, 2:91–102, 2008.

  19. Jeong H., Tombor B., Albert R., Oltvai Z.N., and Barabási A.-L. The large-scale organization of metabolic networks. Nature, 407:651–654, 2000.

  20. Lähdesmäki H., Shmulevich I., and Yli-Harja O. On learning gene regulatory networks under the Boolean network model. Mach. Learn., 52:147–167, 2003.

  21. Terribilini M., Lee J.-H., Yan C., Jernigan R.L., Honavar V., and Dobbs D. Predicting RNA-binding sites from amino acid sequence. RNA, 12:1450–1462, 2006.

  22. Yan C., Terribilini M., Wu F., Jernigan R.L., Dobbs D., and Honavar V. Identifying amino acid residues involved in protein-DNA interactions from sequence. BMC Bioinform., 7:262, 2006.

© 2009 Springer Science+Business Media, LLC

About this entry

Cite this entry

Caragea, C., Honavar, V. (2009). Machine Learning in Computational Biology. In: LIU, L., ÖZSU, M.T. (eds) Encyclopedia of Database Systems. Springer, Boston, MA. https://doi.org/10.1007/978-0-387-39940-9_636
