Abstract
This paper completes a series of consistency theorems for Bayesian learners based on a discrete hypothesis class, a series initiated by Solomonoff’s 1978 work. Specifically, we generalize a performance guarantee for Bayesian stochastic model selection, proven very recently by the author for finite observation spaces, to countable and continuous observation spaces as well as mixtures. This strong result is, to the author’s knowledge, the first of its kind for stochastic model selection. It states almost sure consistency of the learner in the realizable case, that is, when one of the hypotheses/models considered coincides with the truth. Moreover, it implies error bounds on the difference between the predictive distribution and the true one, and even loss bounds with respect to arbitrary loss functions. The set of consistency theorems for the three natural variants of discrete Bayesian prediction, namely marginalization, MAP, and stochastic model selection, is thus completed for general observation spaces. Hence, this is the right time to recapitulate all these results, to present them in a unified context, and to discuss the different situations of Bayesian learning and its different methods.
References
Poland, J.: The missing consistency theorem for Bayesian learning: Stochastic model selection. In: Balcázar, J.L., Long, P.M., Stephan, F. (eds.) ALT 2006. LNCS (LNAI), vol. 4264, pp. 259–273. Springer, Heidelberg (2006)
Blackwell, D., Dubins, L.: Merging of opinions with increasing information. Annals of Mathematical Statistics 33, 882–887 (1962)
Clarke, B.S., Barron, A.R.: Information-theoretic asymptotics of Bayes methods. IEEE Trans. Inform. Theory 36, 453–471 (1990)
Barron, A.R., Rissanen, J.J., Yu, B.: The minimum description length principle in coding and modeling. IEEE Trans. Inform. Theory 44, 2743–2760 (1998)
Wallace, C.S., Boulton, D.M.: An information measure for classification. Computer Journal 11, 185–194 (1968)
Wallace, C.S., Dowe, D.L.: Minimum Message Length and Kolmogorov Complexity. Computer Journal 42, 270–283 (1999)
Rissanen, J.J.: Fisher Information and Stochastic Complexity. IEEE Trans. Inform. Theory 42, 40–47 (1996)
Barron, A.R., Cover, T.M.: Minimum complexity density estimation. IEEE Trans. Inform. Theory 37, 1034–1054 (1991)
Hutter, M.: Universal Artificial Intelligence: Sequential Decisions based on Algorithmic Probability. Springer, Berlin (2004)
Poland, J., Hutter, M.: Asymptotics of discrete MDL for online prediction. IEEE Transactions on Information Theory 51, 3780–3795 (2005)
Grünwald, P., Langford, J.: Suboptimal behaviour of Bayes and MDL in classification under misspecification. In: Shawe-Taylor, J., Singer, Y. (eds.) COLT 2004. LNCS (LNAI), vol. 3120, pp. 331–347. Springer, Heidelberg (2004)
Comley, J.W., Dowe, D.L.: Minimum message length and generalized Bayesian nets with asymmetric languages. In: Grünwald, P., Myung, I.J., Pitt, M.A. (eds.) Advances in Minimum Description Length: Theory and Applications, pp. 265–294 (2005)
Solomonoff, R.J.: Complexity-based induction systems: comparisons and convergence theorems. IEEE Trans. Inform. Theory 24, 422–432 (1978)
Poland, J., Hutter, M.: MDL convergence speed for Bernoulli sequences. Statistics and Computing 16, 161–175 (2006)
Borovkov, A.A., Moullagaliev, A.: Mathematical Statistics. Gordon & Breach, Newark (1998)
Copyright information
© 2007 Springer Berlin Heidelberg
Cite this paper
Poland, J. (2007). On the Consistency of Discrete Bayesian Learning. In: Thomas, W., Weil, P. (eds) STACS 2007. STACS 2007. Lecture Notes in Computer Science, vol 4393. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-70918-3_35
Print ISBN: 978-3-540-70917-6
Online ISBN: 978-3-540-70918-3