Summary
Recent publications illustrate successful applications of belief networks (BNs) and related probabilistic networks in bioinformatics. Examples include the modeling of gene regulation networks [6, 14, 26], the discovery of metabolic [40, 83] and signalling pathways [94], sequence analysis [9, 10], protein structure [16, 28, 76], and linkage analysis [55]. Belief networks are also applied broadly in health care and medicine for diagnosis and as a data mining tool [57, 60, 61]. New developments in learning belief networks from heterogeneous data sources [40, 56, 67, 80, 82, 96] show that belief networks are becoming an important tool for dealing with large-scale, high-throughput data, not only at the genetic and biochemical level but also at the level of systems biology.
In this chapter we introduce belief networks and describe their current use within bioinformatics. The goal of the chapter is to help the reader understand and apply belief networks in bioinformatics. To achieve this, we (1) acquaint the reader with the basic mathematical background of belief networks, (2) introduce algorithms for learning and querying belief networks, (3) describe the current state of the art by discussing several real-world applications in bioinformatics, and (4) discuss available software tools, both free and commercial.
The chapter is organized as follows. Section 3.1 introduces the concept of belief networks. Section 3.2 presents some basic algorithms for inference in belief networks and for learning belief networks from data. Section 3.3 is dedicated to a (non-exhaustive) range of extensions to and variants of the standard belief-network concept. Section 3.4 discusses some techniques and guidelines for constructing belief networks from domain knowledge. Section 3.5 reviews some recent applications of belief networks in bioinformatics. Section 3.6 discusses a range of tools that are available for constructing, querying, and learning belief networks. Finally, Section 3.7 provides a brief guide to the literature on belief networks.
References
Silvia Acid and Luis M. de Campos. A hybrid methodology for learning belief networks. International Journal of Approximate Reasoning, 27(3):235–262, 2001.
Cornelis A. Albers, Martijn A.R. Leisink, and Hilbert J. Kappen. The cluster variation method for efficient linkage analysis on extended pedigrees. BMC Bioinformatics, 7 suppl 1:xx, 2006.
Ethem Alpaydin. Introduction to Machine Learning. The MIT Press, 2004.
S.K. Andersen, K.G. Olesen, F.V. Jensen, and F. Jensen. Hugin - a shell for building belief universes for expert systems. In Proceedings IJCAI, pages 1080–1085, 1989.
Pierre Baldi and Søren Brunak. Bioinformatics: The Machine Learning Approach, 2nd edition. MIT Press, Cambridge, MA, 2001.
Katia Basso, Adam A. Margolin, Gustavo Stolovitzky, Ulf Klein, Ricardo Dalla-Favera, and Andrea Califano. Reverse engineering of regulatory networks in human B cells. Nature Genetics, 37(4):382–390, 2005.
Alexis Battle, Eran Segal, and Daphne Koller. Probabilistic discovery of overlapping cellular processes and their regulation. Journal of Computational Biology, 12(7):909–927, 2005.
L.E. Baum, T. Petrie, G. Soules, and N. Weiss. A maximization technique occurring in the statistical analysis of probabilistic functions of Markov chains. Ann. Math. Statist., 41(1):164–171, 1970.
I. Ben-Gal, A. Shani, A. Gohr, J. Grau, S. Arviv, A. Shmilovici, S. Posch, and I. Grosse. Identification of transcription factor binding sites with variable-order Bayesian networks. Bioinformatics, 21(11):2657–2666, 2005.
Joseph Bockhorst, Mark Craven, David Page, Jude Shavlik, and Jeremy Glasner. A Bayesian network approach to operon prediction. Bioinformatics, 19(10):1227–1235, 2003.
Craig Boutilier, Nir Friedman, Moises Goldszmidt, and Daphne Koller. Context-specific independence in Bayesian networks. In UAI96, pages 115–123, 1996.
J. Boyan. Learning evaluation functions for global optimization. PhD thesis, Carnegie Mellon University, Pittsburgh, PA, 1998.
Mark Chavira, Adnan Darwiche, and Manfred Jaeger. Compiling relational Bayesian networks for exact inference. International Journal of Approximate Reasoning, 42:4–20, 2006.
Xue-Wen Chen, Gopalakrishna Anantha, and Xinkun Wang. An effective structure learning method for constructing gene networks. Bioinformatics, 22(11):1367–1374, 2006.
David Maxwell Chickering. Learning equivalence classes of Bayesian network structures. Journal of Machine Learning Research, 2:445–498, 2002.
Wei Chu and Zoubin Ghahramani. Protein secondary structure prediction using sigmoid belief networks to parameterize segmental semi-Markov fields. In Proceedings of the European Symposium on Artificial Neural Networks ESANN’2004, pages 81–86, Bruges, Belgium, 2004.
Gregory F. Cooper. Probabilistic inference using belief networks is NP-hard. Artificial Intelligence, 42:393–405, 1990.
Gregory F. Cooper. A Bayesian method for learning belief networks that contain hidden variables. Journal of Intelligent Information Systems, 4:71–88, 1995.
Gregory F. Cooper and E. Herskovits. A Bayesian method for the induction of probabilistic networks from data. Machine Learning, 9:309–347, 1992.
Paulo C.G. Costa, Kathryn B. Laskey, and Kenneth J. Laskey. PR-OWL: A framework for probabilistic ontologies. In Proceedings of the Fourth International Conference on Formal Ontology in Information Systems, 2006.
Robert G. Cowell. Conditions under which conditional independence and scoring methods lead to identical selection of Bayesian network models. In UAI01, 2001.
Robert G. Cowell, A. Philip Dawid, S.L. Lauritzen, and D.J. Spiegelhalter. Probabilistic Networks and Expert Systems. Springer Verlag, Berlin, New York, 2003.
Adnan Darwiche. A differential approach to inference in Bayesian networks. In UAI2000, pages 123–132, 2000.
Adnan Darwiche. New advances in compiling CNF to decomposable negation normal form. In ECAI 2004, 2004.
Denver Dash and Marek J. Druzdzel. A hybrid anytime algorithm for the construction of causal models from sparse data. In UAI99, 1999.
Hidde de Jong. Modeling and simulation of genetic regulatory systems: A literature overview. Journal of Computational Biology, 9(1):67–103, 2002.
R. Dechter. Bucket elimination: a unifying framework for probabilistic inference. In UAI96, 1996.
Arthur L. Delcher, Simon Kasif, Harry R. Goldberg, and William H. Hsu. Protein secondary structure modelling with probabilistic networks. In Proceedings of the International Conference on Intelligent Systems and Molecular Biology, pages 109–117, 1993. (extended abstract).
H.H.L.M. Donkers, A.W. Werten, J.W.H.M. Uiterwijk, and H.J. van den Herik. Sequapro: A tool for semi-qualitative decision making. Technical Report CS 01-06, Department of Computer Science, Universiteit Maastricht, 2001.
Richard Durbin, Sean R. Eddy, Anders Krogh, and Graeme Mitchison. Biological Sequence Analysis: Probabilistic Models of Proteins and Nucleic Acids. Cambridge University Press, Cambridge, UK, 1999.
Sean R. Eddy. Profile hidden Markov models. Bioinformatics, 14(9):755–763, 1998.
G. Elidan, N. Lotner, N. Friedman, and D. Koller. Discovering hidden variables: A structure-based approach. In Advances in Neural Information Processing Systems (NIPS), 2000.
M. Fichelson and D. Geiger. Exact genetic linkage computations for general pedigrees. Bioinformatics, 18 Suppl. 1:S189–S198, 2002.
Ma’ayan Fichelson and Dan Geiger. Optimizing exact genetic linkage computations. Journal of Computational Biology, 11(2–3):263–275, 2004.
Nir Friedman. Learning belief networks in the presence of missing values and hidden variables. In Proceedings of the Fourteenth International Conference on Machine Learning, 1997.
Nir Friedman. The Bayesian structural EM algorithm. In UAI98, 1998.
Nir Friedman. Inferring cellular networks using probabilistic graphical models. Science, 303:799–805, 2004.
Nir Friedman, Michal Linial, Iftach Nachman, and Dana Pe’er. Using Bayesian networks to analyze expression data. Journal of Computational Biology, 7(3/4):601–620, 2000.
Nir Friedman, Iftach Nachman, and Dana Pe’er. Learning Bayesian network structure from massive datasets: the “sparse candidate” algorithm. In Proc. Fifteenth Conf. on Uncertainty in Artificial Intelligence (UAI), 1999.
Irit Gat-Viks, Amos Tanay, Daniela Raijman, and Ron Shamir. A probabilistic methodology for integrating knowledge and experiments on biological networks. Journal of Computational Biology, 13(2):165–181, 2006.
D. Griffeath. Introduction to Markov random fields. In Kemeny, Knapp, and Snell, editors, Denumerable Markov Chains. Springer, 1976. 2nd edition.
Chris Harbron. Heuristic algorithms for finding inexpensive elimination schemes. Statistics and Computing, 5:275–287, 1995.
David Heckerman. A tutorial on learning with Bayesian networks. Technical report, Microsoft Research, 1995.
Manfred Jaeger. Relational Bayesian nets. In D. Geiger and P.P. Shenoy, editors, Uncertainty in Artificial Intelligence (UAI97), pages 266–273, San Francisco, CA, 1997. Morgan Kaufmann Publishers.
Manfred Jaeger. Relational Bayesian networks: a survey. Electronic Transactions in Artificial Intelligence, 6:xx, 2002.
Manfred Jaeger. The Primula System: user’s guide, 2006. http://www.cs.aau.dk/~jeager/primula.
Claus Skaanning Jensen and Augustine Kong. Blocking Gibbs sampling for linkage analysis in large pedigrees with many loops. American Journal of Human Genetics, 65:885–901, 1999.
Finn V. Jensen. Bayesian Networks and Decision Graphs. Springer Verlag, New York, Berlin, 2001.
M.I. Jordan. Learning in Graphical Models. MIT Press, Cambridge, MA, 1998.
Kevin B. Korb and Ann E. Nicholson. Bayesian Artificial Intelligence. Chapman & Hall/CRC, Boca Raton, FL, 2004.
Timo Koski. Hidden Markov Models of Bioinformatics. Springer Verlag, Berlin, New York, 2001.
W. Lam and F. Bacchus. Learning Bayesian belief networks: An approach based on the MDL principle. Computational Intelligence, 10:269–293, 1994.
Kathryn B. Laskey. MEBN: A logic for open-world probabilistic reasoning. Technical Report C4I06-01, George Mason University C4I Center, 2006.
S. L. Lauritzen and D. J. Spiegelhalter. Local computations with probabilities on graphical structures and their application to expert systems. Journal of the Royal Statistical Society, Series B, 50(2):157–224, 1988.
Steffen L. Lauritzen and Nuala Sheehan. Graphical models for genetic analysis. Statistical Science, 18(4):489–514, 2003.
Phil Hyoun Lee and Doheon Lee. Modularized learning of genetic interaction networks from biological annotations and mRNA expression data. Bioinformatics, 21(11):2739–2747, 2005.
Sun-Mi Lee and Patricia A. Abbott. Bayesian networks for knowledge discovery in large datasets: basics for nurse researchers. Journal of Biomedical Informatics, 36(4–5):389–399, 2003.
M. Leisink, H.J. Kappen, and H.G. Brunner. Linkage analysis: A Bayesian approach. In ICANN 2002, number 2415 in LNCS, pages 595–600, 2002.
Arthur M. Lesk. Introduction to Bioinformatics. Oxford University Press, New York, NY, 2005.
Peter J.F. Lucas. Bayesian analysis, pattern analysis and data mining in health care. Current Opinion in Critical Care, 10:399–403, 2004.
Peter J.F. Lucas. Bayesian networks in biomedicine and health-care. Artificial Intelligence in Medicine, 30:201–214, 2004.
Suzanne M. Mahoney and Kathryn B. Laskey. Network engineering for complex belief networks. In UAI97, 1997.
Kristin Missal, Michael A. Cross, and Dirk Drasdo. Gene network inference from incomplete expression data: Transcriptional control of hematopoietic commitment. Bioinformatics, 22(6):731–738, 2006.
Kevin Murphy. Dynamic Bayesian Networks: Representation, Inference and Learning. PhD thesis, UC Berkeley, Computer Science Division, 2002.
R.E. Neapolitan. Learning Bayesian Networks. Prentice Hall, Pearson Education, Upper Saddle River, NJ, 2003.
R.M. Oliver and J.Q. Smith, editors. Influence Diagrams, Belief Nets and Decision Analysis. John Wiley & Sons, Chichester, UK, 1990.
David Page and Mark Craven. Biological applications of multi-relational data mining. SIGKDD Explorations, 5-1:69–79, 2003.
Judea Pearl. Evidential reasoning using stochastic simulation of causal models. Artificial Intelligence, 32:245–257, 1987.
Judea Pearl. Probabilistic Reasoning in Intelligent Systems: Networks of Plausible Inference. Morgan Kaufmann Publishers, San Mateo, CA, 1988.
Judea Pearl and T. Verma. A theory of inferred causation. In Proceedings of Principles of Knowledge Representation and Reasoning, 1991.
J.M. Peña, J. Björkegren, and J. Tegnér. Growing Bayesian network models of gene networks from seed genes. Bioinformatics, 21(Suppl. 2):ii224–ii229, 2005.
Bruno-Edouard Perrin, Liva Ralaivola, Aurélien Mazurie, Samuele Bottani, Jacques Mallet, and Florence d’Alché-Buc. Gene networks inference using dynamic Bayesian networks. Bioinformatics, 19(Suppl. 2):ii138–ii148, 2003.
Antonio Piccolboni and Dan Gusfield. On the complexity of fundamental computational problems in pedigree analysis. Journal of Computational Biology, 10(5):763–773, 2003.
Rainer Pudimat, Ernst-Günter Schukat-Talamazzini, and Rolf Backofen. A multiple-feature framework for modelling and predicting transcription factor binding sites. Bioinformatics, 21(14):3082–3088, 2005.
Yuan Qi, Alex Rolfe, Kenzie D. MacIsaac, Georg K. Gerber, Dmitry Pokholok, Julia Zeitlinger, Timothy Danford, Robin D. Dowell, Ernest Fraenkel, Tommi S. Jaakkola, Richard A. Young, and David K. Gifford. High-resolution computational models of genome binding events. Nature Biotechnology, 24:963–970, 2006.
A. Raval, Z. Ghahramani, and D.L. Wild. A Bayesian network model for protein fold and remote homologue recognition. Bioinformatics, 18(6):788–801, 2002.
Silja Renooij and Linda C. van der Gaag. From qualitative to quantitative probabilistic networks. In UAI02, pages 422–429, 2002.
Carsten Riggelsen. Approximation Methods for Efficient Learning of Bayesian Networks. PhD thesis, Universiteit Utrecht, 2006.
Stuart Russell and Peter Norvig. Artificial Intelligence: A Modern Approach. Prentice Hall, Pearson Education, Upper Saddle River, NJ, 2nd edition, 2003.
Eran Segal. Rich Probabilistic Models for Genomic Data. PhD thesis, Stanford University, 2004.
Eran Segal, Nir Friedman, Daphne Koller, and Aviv Regev. A module map showing conditional activity of expression modules in cancer. Nature Genetics, 36(10):1090–1098, 2004.
Eran Segal, Ben Taskar, Audrey Gasch, Nir Friedman, and Daphne Koller. Rich probabilistic models for gene expression. Bioinformatics, 17(Suppl. 1):s234–s252, 2001.
Eran Segal, H. Wang, and Daphne Koller. Discovering molecular pathways from protein interaction and gene expression data. Bioinformatics, 19(Suppl. 1):i264–i272, 2003.
G. Shafer and P. Shenoy. Probability propagation. Annals of Mathematics and Artificial Intelligence, 2:327–352, 1990.
A. Siepel and D. Haussler. Phylogenetic hidden Markov models. In R. Nielsen, editor, Statistical Methods in Molecular Evolution, pages 325–351, New York, 2005. Springer.
M. Silberstein, A. Tzemach, N. Dovgolevsky, M. Fichelson, A. Schuster, and D. Geiger. Online system for faster multipoint linkage analysis via parallel execution on thousands of personal computers. American Journal of Human Genetics, 78:922–935, 2006.
D.J. Spiegelhalter, P. Dawid, S. Lauritzen, and R. Cowell. Bayesian analysis in expert systems. Statistical Science, 8:219–282, 1993.
P. Spirtes, C. Glymour, and R. Scheines. Causation, Prediction and Search. MIT Press, Cambridge, MA, 2000.
Alun Thomas, Alexander Gutin, Victor Abkevich, and Aruna Bansal. Multilocus linkage analysis by blocked Gibbs sampling. Statistics and Computing, 10:259–269, 2000.
Olga G. Troyanskaya, Kara Dolinski, Art B. Owen, Russ B. Altman, and David Botstein. A Bayesian framework for combining heterogeneous data sources for gene function prediction in Saccharomyces cerevisiae. PNAS, 100:8348–8353, 2003.
Linda C. van der Gaag, Silja Renooij, Cees L.M. Witteman, B.M.P. Aleman, and B.G. Taal. How to elicit many probabilities. In UAI99, 1999.
Claudio J. Verzilli, Nigel Stallard, and John C. Whittaker. Bayesian graphical models for genomewide association studies. American Journal of Human Genetics, 79:100–112, 2006.
Michael P. Wellman. Fundamental concepts of qualitative probabilistic networks. Artificial Intelligence, 44(3):257–303, 1990.
Peter J. Woolf, Wendy Prudhomme, Laurence Daheron, George Q. Daley, and Douglas A. Lauffenburger. Bayesian analysis of signalling networks governing embryonic stem cell fate decisions. Bioinformatics, 21(6):741–753, 2005.
Chen-hsiang Yeang and Tommi Jaakkola. Physical networks and multi-source data integration. In Proceedings of the 7th annual International Conference on Research in Computational Molecular Biology, 2003.
Jing Yu, V. Anne Smith, Paul P. Wang, Alexander J. Hartemink, and Erich D. Jarvis. Advances to Bayesian network inference for generating causal networks from observational biological data. Bioinformatics, 20(18):3594–3603, 2004.
Geoffrey Zweig and Stuart J. Russell. Speech recognition with dynamic Bayesian networks. In AAAI/IAAI, pages 173–180, 1998.
© 2008 Springer-Verlag Berlin Heidelberg
About this chapter
Cite this chapter
Donkers, J.H.H.L.M., Tuyls, K. (2008). Belief Networks for Bioinformatics. In: Kelemen, A., Abraham, A., Chen, Y. (eds) Computational Intelligence in Bioinformatics. Studies in Computational Intelligence, vol 94. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-76803-6_3
DOI: https://doi.org/10.1007/978-3-540-76803-6_3
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-76802-9
Online ISBN: 978-3-540-76803-6
eBook Packages: Engineering, Engineering (R0)