Belief Networks for Bioinformatics

Donkers, Jeroen H. H. L. M.; Tuyls, Karl

doi:10.1007/978-3-540-76803-6_3

Part of the book series: Studies in Computational Intelligence (SCI, volume 94)

Summary

Recent publications illustrate successful applications of belief networks (BNs) and related probabilistic networks in bioinformatics. Examples include the modeling of gene regulation networks [6, 14, 26], the discovery of metabolic [40, 83] and signalling pathways [94], sequence analysis [9, 10], protein structure [16, 28, 76], and linkage analysis [55]. Belief networks are also applied broadly in health care and medicine, both for diagnosis and as a data mining tool [57, 60, 61]. New developments in learning belief networks from heterogeneous data sources [40, 56, 67, 80, 82, 96] show that belief networks are becoming an important tool for dealing with high-throughput data at a large scale, not only at the genetic and biochemical level but also at the level of systems biology.

In this chapter we introduce belief networks and describe their current use within bioinformatics. The goal of the chapter is to help the reader understand and apply belief networks in this domain. To achieve this, we (1) acquaint the reader with the basic mathematical background of belief networks, (2) introduce algorithms for learning and querying belief networks, (3) describe the current state of the art by discussing several real-world applications in bioinformatics, and (4) discuss freely and commercially available software tools.

The chapter is organized as follows. We start (in Section 3.1) by introducing the concept of belief networks. Then (in Section 3.2) we present some basic algorithms for inference in belief networks and for learning belief networks from data. Section 3.3 is dedicated to a (non-exhaustive) range of extensions to and variants of the standard belief-network concept. We continue (in Section 3.4) by discussing some techniques and guidelines for constructing belief networks from domain knowledge. Section 3.5 reviews some recent applications of belief networks in bioinformatics. In Section 3.6 we discuss a range of tools that are available for constructing, querying, and learning belief networks. Finally (in Section 3.7), we provide a brief guide to the literature on belief networks.
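
As a rough illustration of what "querying" a belief network means, the sketch below builds a three-node chain and computes a posterior by brute-force enumeration of the joint distribution. It is not taken from the chapter: the network (Regulator -> Gene -> Reporter), the variable names, and all probabilities are hypothetical, and enumeration is used only because it is the simplest exact method, not because the chapter recommends it.

```python
from itertools import product

# Hypothetical binary network: Regulator (R) -> Gene (G) -> Reporter (E).
# All numbers below are made up for illustration only.
P_R = {True: 0.3, False: 0.7}            # prior P(R)
P_G_GIVEN_R = {True: 0.9, False: 0.2}    # P(G = on | R)
P_E_GIVEN_G = {True: 0.8, False: 0.1}    # P(E = on | G)


def bernoulli(p_true, value):
    """Probability of a binary outcome given P(outcome = True)."""
    return p_true if value else 1.0 - p_true


def joint(r, g, e):
    """Chain-rule factorization implied by the graph: P(R) P(G|R) P(E|G)."""
    return (P_R[r]
            * bernoulli(P_G_GIVEN_R[r], g)
            * bernoulli(P_E_GIVEN_G[g], e))


def posterior_regulator(e_obs):
    """P(R = True | E = e_obs), summing the joint over the unobserved G."""
    numerator = sum(joint(True, g, e_obs) for g in (True, False))
    evidence = sum(joint(r, g, e_obs)
                   for r, g in product((True, False), repeat=2))
    return numerator / evidence


if __name__ == "__main__":
    print("P(R=on | E=on) =", posterior_regulator(True))
```

Enumeration scales exponentially with the number of variables, which is why dedicated inference algorithms are needed for networks of realistic size.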

References

[1] Silvia Acid and Luis M. de Campos. A hybrid methodology for learning belief networks. Int. J. Approx. Reasoning, 27(3):235–262, 2001.
[2] Cornelis A. Albers, Martijn A.R. Leisink, and Hilbert J. Kappen. The cluster variation method for efficient linkage analysis on extended pedigrees. BMC Bioinformatics, 7 suppl 1:xx, 2006.
[3] Ethem Alpaydin. Introduction to Machine Learning. The MIT Press, 2004.
[4] S.K. Andersen, K.G. Olesen, F.V. Jensen, and F. Jensen. Hugin - a shell for building belief universes for expert systems. In Proceedings IJCAI, pages 1080–1085, 1989.
[5] Pierre Baldi and Søren Brunak. Bioinformatics: The Machine Learning Approach, 2nd edition. MIT Press, Cambridge, MA, 2001.
[6] Katia Basso, Adam A. Margolin, Gustavo Stolovitzky, Ulf Klein, Ricardo Dalla-Favera, and Andrea Califano. Reverse engineering of regulatory networks in human B cells. Nature Genetics, 37(4):382–390, 2005.
[7] Alexis Battle, Eran Segal, and Daphne Koller. Probabilistic discovery of overlapping cellular processes and their regulation. Computational Biology, 12(7):909–927, 2005.
[8] L.E. Baum, T. Petrie, G. Soules, and N. Weiss. A maximization technique occurring in the statistical analysis of probabilistic functions of markov chains. Ann. Math. Statist, 41(1):164–171, 1970.
[9] I. Ben-Gal, A. Shani, A. Gohr, J. Grau, S. Arviv, A. Shmilovici, S. Posch, and I. Grosse. Identification of transcription factor binding sites with variable-order bayesian networks. Bioinformatics, 21(11):2657–2666, 2005.
[10] Joseph Bockhorst, Mark Craven, David Page, Jude Shavlik, and Jeremy Glasner. A bayesian network approach to operon prediction. Bioinformatics, 19(10):1227–1235, 2003.
[11] Craig Boutilier, Nir Friedman, Moises Goldszmidt, and Daphne Koller. Context-specific independence in bayesian networks. In UAI96, pages 115–123, 1996.
[12] J. Boyan. Learning evaluation functions for global optimization. PhD thesis, Carnegie Mellon University, Pittsburgh, PA, 1998.
[13] Mark Chavira, Adnan Darwiche, and Manfred Jaeger. Compiling relational bayesian networks for exact inference. Int. Journal of Approximate Reasoning, 42:4–20, 2006.
[14] Xue-Wen Chen, Gopalakrishna Anantha, and Xinhun Wang. An effective structure learning method for constructing gene networks. Bioinformatics, 22(11):1367–1374, 2006.
[15] David Maxwell Chickering. Learning equivalence classes of bayesian network structures. Journal of Machine Learning Research, 2:445–498, 2002.
[16] Wei Chu and Zoubin Ghahramani. Protein secondary structure prediction using sigmoid belief networks to parameterize segmental semi-markov fields. In Proceedings of the European Symposium on Artificial Neural Networks ESANN’2004, pages 81–86, Bruges, Belgium, 2004.
[17] Gregory F. Cooper. Probabilistic inference using belief networks is NP-hard. Artificial Intelligence, 42:393–405, 1990.
[18] Gregory F. Cooper. A bayesian method for learning belief networks that contain hidden variables. Journal of Intelligent Information Systems, 4:71–88, 1995.
[19] Gregory F. Cooper and E. Herskovits. A bayesian method for the induction of probabilistic networks from data. Machine Learning, 9:309–347, 1992.
[20] Paulo C.G. Costa, Kathryn B. Laskey, and Kenneth J. Laskey. Pr-owl: A framework for probabilistic ontologies. In Proceedings of the Fourth International Conference on Formal Ontology in Information Systems, 2006.
[21] Robert G. Cowell. Conditions under which conditional independence and scoring methods lead to identical selection of bayesian network models. In UAI01, 2001.
[22] Robert G. Cowell, A. Philip Dawid, S.L. Lauritzen, and D.J. Spiegelhalter. Probabilistic Networks and Expert Systems. Springer Verlag, Berlin, New York, 2003.
[23] Adnan Darwiche. A differential approach to inference in bayesian networks. In UAI2000, pages 123–132, 2000.
[24] Adnan Darwiche. New advances in compiling cnf to decomposable negation normal form. In ECAI 2004, 2004.
[25] Denver Dash and Marek J. Druzdzel. A hybrid anytime algorithm for the construction of causal models from sparse data. In UAI99, 1999.
[26] Hidde de Jong. Modeling and simulation of genetic regulatory systems: A literature overview. Computational Biology, 9(1):67–103, 2002.
[27] R. Dechter. Bucket elimination: a unifying framework for probabilistic inference. In UAI96, 1996.
[28] Arthur L. Delcher, Simon Kasif, Harry R. Goldberg, and William H. Hsu. Protein secondary structure modelling with probabilistic networks. In Proceedings of the International Conference on Intelligent Systems and Molecular Biology, pages 109–117, 1993. (extended abstract).
[29] H.H.L.M. Donkers, A.W. Werten, J.W.H.M. Uiterwijk, and H.J. van den Herik. Sequapro: A tool for semi-qualitative decision making. Technical Report CS 01-06, Department of Computer Science, Universiteit Maastricht, 2001.
[30] Richard Durbin, Sean R. Eddy, Anders Krogh, and Graeme Mitchison. Biological Sequence Analysis: Probabilistic Models of Proteins and Nucleic Acids. Cambridge University Press, Cambridge, UK, 1999.
[31] Sean R. Eddy. Profile hidden markov models. Bioinformatics, 14(9):755–763, 1998.
[32] G. Elidan, N. Lotner, N. Friedman, and D. Koller. Discovering hidden variables: A structure-based approach. In Advances in Neural Information Processing Systems (NIPS), 2000.
[33] M. Fichelson and D. Geiger. Exact genetic linkage computations for general pedigrees. Bioinformatics, 18 Suppl. 1:S189–S198, 2002.
[34] Ma’ayan Fichelson and Dan Geiger. Optimizing exact genetic link computations. Journal of Computational Biology, 11(2–3):263–275, 2004.
[35] Nir Friedman. Learning belief networks in the presence of missing values and hidden variables. In Proceedings of the Fourteenth International Conference on Machine Learning, 1997.
[36] Nir Friedman. The bayesian structural em algorithm. In UAI98, 1998.
[37] Nir Friedman. Inferring cellular networks using probabilistic graphical networks. Science, 303:799–805, 2004.
[38] Nir Friedman, Michal Linial, Iftach Nachman, and Dana Pe’er. Using bayesian networks to analyze expression data. Computational Biology, 7(3/4):601–620, 2000.
[39] Nir Friedman, Iftach Nachman, and Dana Pe’er. Learning bayesian network structure from massive datasets: the “sparse candidate” algorithm. In Proc. Fifteenth Conf. on Uncertainty in Artificial Intelligence (UAI), 1999.
[40] Irit Gat-Viks, Amos Tanay, Daniela Raijman, and Ron Shamir. A probabilistic methodology for integrating knowledge and experiments on biological networks. Computational Biology, 13(2):165–181, 2006.
[41] D. Griffeath. Introduction to markov random fields. In Kemeny, Knapp, and Snell, editors, Denumerable Markov Chains. Springer, 1976. 2nd edition.
[42] Chris Harbron. Heuristic algorithms for finding inexpensive elimination schemes. Statistics and Computing, 5:275–287, 1995.
[43] David Heckerman. A tutorial on learning with bayesian networks. Technical report, Microsoft Research, 1995.
[44] Manfred Jaeger. Relational bayesian nets. In D. Geiger and P.P. Shenoy, editors, Uncertainty in Artificial Intelligence (UAI97), pages 266–273, San Francisco, CA, 1997. Morgan Kaufmann Publishers.
[45] Manfred Jaeger. Relational bayesian networks: a survey. Electronic Transactions in Artificial Intelligence, 6:xx, 2002.
[46] Manfred Jaeger. The Primula System: user’s guide, 2006. http://www.cs.aau.dk/~jeager/primula
[47] Claus Skaanning Jensen and Augustine Kong. Blocking gibbs sampling for linkage analysis in large pedigrees with many loops. American Journal of Human Genetics, 65:885–901, 1999.
[48] Finn V. Jensen. Bayesian Networks and Decision Graphs. Springer Verlag, New York, Berlin, 2001.
[49] M.I. Jordan. Learning in Graphical Models. MIT Press, Cambridge, MA, 1998.
[50] Kevin B. Korb and Ann E. Nicholson. Bayesian Artificial Intelligence. Chapman & Hall/CRC, Boca Raton, FL, 2004.
[51] Timo Koski. Hidden Markov Models of Bioinformatics. Springer Verlag, Berlin, New York, 2001.
[52] W. Lam and F. Bacchus. Learning bayesian belief networks: An approach based on the mdl principle. Computational Intelligence, 10:269–293, 1994.
[53] Kathryn B. Laskey. MEBN: A logic for open-world probabilistic reasoning. Technical Report C4I06-01, George Mason University C4I Center, 2006.
[54] S. L. Lauritzen and D. J. Spiegelhalter. Local computations with probabilities on graphical structures and their application to expert systems. Journal of the Royal Statistical Society, Series B, 50(2):157–224, 1988.
[55] Steffen L. Lauritzen and Nuala Sheehan. Graphical models for genetic analysis. Statistical Science, 18(4):489–514, 2003.
[56] Phil Hyoun Lee and Doheon Lee. Modularized learning of genetic interaction networks from biological annotations and mRNA expression data. Bioinformatics, 21(11):2739–2747, 2005.
[57] Sun-Mi Lee and Patricia A. Abbott. Bayesian networks for knowledge discovery in large datasets: basics for nurse researchers. Journal of Biomedical Informatics, 36(4–5):389–399, 2003.
[58] M. Leisink, H.J. Kappen, and H.G. Brunner. Linkage analysis: A bayesian approach. In ICANN 2002, number 2415 in LNCS, pages 595–600, 2002.
[59] Arthur M. Lesk. Introduction to Bioinformatics. Oxford University Press, New York, NY, 2005.
[60] Peter J.F. Lucas. Bayesian analysis, pattern analysis and data mining in health care. Current Opinion in Critical Care, 10:399–403, 2004.
[61] Peter J.F. Lucas. Bayesian networks in biomedicine and health-care. Artificial Intelligence in Medicine, 30:201–214, 2004.
[62] Suzanne M. Mahoney and Kathryn B. Laskey. Network engineering for complex belief networks. In UAI97, 1997.
[63] Kristin Missal, Michael A. Cross, and Dirk Drasdo. Gene network inference from incomplete expression data: Transcriptional control of hematopoietic commitment. Bioinformatics, 22(6):731–738, 2006.
[64] Kevin Murphy. Dynamic Bayesian Networks: Representation, Inference and Learning. PhD thesis, UC Berkeley, Computer Science Division, 2002.
[65] R.E. Neapolitan. Learning Bayesian Networks. Prentice Hall, Pearson Education, Upper Saddle River, NJ, 2003.
[66] R.M. Oliver and J.Q. Smith, editors. Influence Diagrams, Belief Nets and Decision Analysis. John Wiley & Sons, Chichester, UK, 1990.
[67] David Page and Mark Craven. Biological applications of multi-relational data mining. SIGKDD Explorations, 5-1:69–79, 2003.
[68] Judea Pearl. Evidential reasoning using stochastic simulation of causal models. Artificial Intelligence, 32:245–257, 1987.
[69] Judea Pearl. Probabilistic Reasoning in Intelligent Systems: Networks of Plausible Inference. Morgan Kaufmann Publishers, San Mateo, CA, 1988.
[70] Judea Pearl and T. Verma. A theory of inferred causation. In Proceedings of Principles of Knowledge Representation and Reasoning, 1991.
[71] J.M. Peña, J. Björkegren, and J. Tegnér. Growing bayesian network models of gene networks from seed genes. Bioinformatics, 21(Suppl. 2):ii224–ii229, 2005.
[72] Bruno-Edouard Perrin, Liva Ralaivola, Aurélien Mazurie, Samuele Bottani, Jacques Mallet, and Florence d’Alché Buc. Gene networks inference using dynamic bayesian networks. Bioinformatics, 19(Suppl. 2):ii138–ii148, 2003.
[73] Antonio Piccolboni and Dan Gusfield. On the complexity of fundamental computational problems in pedigree analysis. Journal of Computational Biology, 10(5):763–773, 2003.
[74] Rainer Pudimat, Ernst-Günter Schukat-Talamazzini, and Rolf Backofen. A multiple-feature framework for modelling and predicting transcription factor binding sites. Bioinformatics, 21(14):3082–3088, 2005.
[75] Yuan Qi, Alex Rolfe, Kenzie D. MacIsaac, Georg K. Gerber, Dmitry Pokholok, Julia Zeitlinger, Timothy Danford, Robin D. Dowell, Ernest Fraenkel, Tommi S. Jaakkola, Richard A. Young, and David K. Gifford. High-resolution computational models of genome binding events. Nature Biotechnology, 24:963–970, 2006.
[76] A. Raval, Z. Ghahramani, and D.L. Wild. A bayesian network model for protein fold and remote homologue recognition. Bioinformatics, 18(6):788–801, 2002.
[77] Silja Renooij and Linda C. van der Gaag. From qualitative to quantitative probabilistic networks. In UAI02, pages 422–429, 2002.
[78] Carsten Riggelsen. Approximation Methods for Efficient Learning of Bayesian Networks. PhD thesis, Universiteit Utrecht, 2006.
[79] Stuart Russell and Peter Norvig. Artificial Intelligence, a modern approach. Prentice Hall, Pearson Education, Upper Saddle River, NJ, 2nd edition, 2003.
[80] Eran Segal. Rich Probabilistic Models for Genomic Data. PhD thesis, Stanford University, 2004.
[81] Eran Segal, Nir Friedman, Daphne Koller, and Aviv Regev. A module map showing conditional activity of expression modules in cancer. Nature Genetics, 36(10):1090–1098, 2004.
[82] Eran Segal, Ben Taskar, Audrey Gasch, Nir Friedman, and Daphne Koller. Rich probabilistic models for gene expression. Bioinformatics, 17(Suppl. 1):s234–s252, 2001.
[83] Eran Segal, H. Wang, and Daphne Koller. Discovering molecular pathways from protein interaction and gene expression data. Bioinformatics, 19(Suppl. 1):i264–i272, 2003.
[84] G. Shafer and P. Shenoy. Probability propagation. Annals of Mathematics and Artificial Intelligence, 2:327–352, 1990.
[85] A. Siepel and D. Haussler. Phylogenetic hidden markov models. In R. Nielsen, editor, Statistical Methods in Molecular Evolution, pages 325–351, New York, 2005. Springer.
[86] M. Silberstein, A. Tzemach, N. Dovgolevsky, M. Fichelson, A. Shuster, and D. Geiger. Online system for faster multipoint linkage analysis via parallel execution on thousands of personal computers. American Journal of Human Genetics, 78:922–935, 2006.
[87] D.J. Spiegelhalter, P. Dawid, S. Lauritzen, and R. Cowell. Bayesian analysis in expert systems. Statistical Science, 8:219–282, 1993.
[88] P. Spirtes, C. Glymour, and R. Scheines. Causation, Prediction and Search. MIT Press, Cambridge, MA, 2000.
[89] Alun Thomas, Alexander Gutin, Victor Abkevich, and Aruna Bansal. Multilocus linkage analysis by blocked gibbs sampling. Statistics and Computing, 10:259–269, 2000.
[90] Olga G. Troyanskaya, Kara Dolinski, Art B. Owen, Russ B. Altman, and David Botstein. A bayesian framework for combining heterogeneous data sources for gene function prediction in saccharomyces cerevisiae. In PNAS, volume 100, pages 8348–8353, 2003.
[91] Linda C. van der Gaag, Silja Renooij, Cees L.M. Witteman, B.M.P. Aleman, and B.G. Taal. How to elicit many probabilities. In UAI99, 1999.
[92] Claudio J. Verzilli, Nigel Stallard, and John C. Whittaker. Bayesian graphical models for genomewide association studies. American Journal of Human Genetics, 79:100–112, 2006.
[93] Michael P. Wellman. Fundamental concepts of qualitative probabilistic networks. Artificial Intelligence, 44(3):257–303, 1990.
[94] Peter J. Woolf, Wendy Prudhomme, Laurence Daheron, George Q. Daley, and Douglas A. Lauffenburger. Bayesian analysis of signalling networks governing embryonic stem cell fate decisions. Bioinformatics, 21(6):741–753, 2005.
[95] Chen-hsiang Yeang and Tommi Jaakkola. Physical networks and multi-source data integration. In Proceedings of the 7th annual International Conference on Research in Computational Molecular Biology, 2003.
[96] Jing Yu, V. Anne Smith, Paul P. Wang, Alexander J. Hartemink, and Erich D. Jarvis. Advances to bayesian network inference for generating causal networks from observational biological data. Bioinformatics, 20(18):3594–3603, 2004.
[97] Geoffrey Zweig and Stuart J. Russell. Speech recognition with dynamic bayesian networks. In AAAI/IAAI, pages 173–180, 1998.

Author information

Authors and Affiliations

Bio- and medical Informatics Competence Center (BioMICC), MICC, Universiteit Maastricht, PO Box 616, 6200 MD Maastricht, The Netherlands
Jeroen H. H. L. M. Donkers & Karl Tuyls

Editor information

Editors and Affiliations

Department of Neurology, Buffalo Neuroimaging Analysis Center, The Jacobs Neurological Institute, University at Buffalo, The State University of New York, 100 High Street, Buffalo, NY 14203, USA
Arpad Kelemen
Centre for Quantifiable Quality of Service in Communication Systems (Q2S), Centre of Excellence, Norwegian University of Science and Technology, O.S. Bragstads plass 2E, N-7491 Trondheim, Norway
Ajith Abraham
School of Information Science and Engineering, Jinan University, Jiwei Road 106, Jinan 250022, P.R. China
Yuehui Chen

About this chapter

Cite this chapter

Donkers, J.H.H.L.M., Tuyls, K. (2008). Belief Networks for Bioinformatics. In: Kelemen, A., Abraham, A., Chen, Y. (eds) Computational Intelligence in Bioinformatics. Studies in Computational Intelligence, vol 94. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-76803-6_3

DOI: https://doi.org/10.1007/978-3-540-76803-6_3
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-76802-9
Online ISBN: 978-3-540-76803-6
eBook Packages: Engineering, Engineering (R0)
