Bayesian Averaging is Well-Temperated
Lars Kai Hansen

Abstract  Bayesian predictions are stochastic, just like the predictions of any other inference scheme that generalizes from a finite sample. While a simple variational argument shows that Bayes averaging is generalization optimal given that the prior matches the teacher parameter distribution, the situation is less clear if the teacher distribution is unknown. I define a class of averaging procedures, the temperated likelihoods, including both Bayes averaging with a uniform prior and maximum likelihood estimation as special cases. I show that Bayes is generalization optimal in this family for any teacher distribution for two learning problems that are analytically tractable: learning the mean of a Gaussian and asymptotics of smooth learners.
Type  Conference paper [With referee] 
Conference  Advances in Neural Information Processing Systems 1999 
Editors  S. Solla et al. 
Year  2000 pp. 265-271
Publisher  MIT Press 
Electronic version(s)  [pdf] 
BibTeX data  [bibtex] 
IMM Group(s)  Intelligent Signal Processing 