Bayesian Statistical Methods

Centre of Excellence in Computational Complex Systems Research

 

Bayesian Model Assessment and Selection Using Expected Utilities

Researchers: Aki Vehtari, Jouko Lampinen, Janne Ojanen

The goal of the project is to study theoretically justified and computationally practical methods for Bayesian model assessment, comparison, and selection. A natural way to assess the goodness of a model is to estimate its future predictive capability by estimating expected utilities, that is, the relative values of the consequences of using the model's predictions. We synthesize and extend previous work in several ways. We give a unified presentation from the Bayesian viewpoint, emphasizing the assumptions made, and propose practical methods for obtaining the distributions of the expected utility estimates.
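To make the quantity being estimated concrete, the expected utility and one common way of estimating it can be written as follows. This is only a sketch: the notation (training data D, model M, utility function u, a future observation indexed n+1, and D with the i-th observation left out) is assumed here for illustration, and leave-one-out cross-validation is used as one possible estimator.

```latex
% Expected utility of model M trained on data D, taken over a future observation
\bar{u}_M = \mathrm{E}_{(x^{(n+1)},\, y^{(n+1)})}
            \left[ u\!\left(y^{(n+1)}, x^{(n+1)}, D, M\right) \right]

% Leave-one-out cross-validation estimate from the n observed data points,
% where D^{(\setminus i)} denotes the training data without observation i
\bar{u}_{\mathrm{LOO}} = \frac{1}{n} \sum_{i=1}^{n}
            u\!\left(y^{(i)}, x^{(i)}, D^{(\setminus i)}, M\right)
```

For classification accuracy, for instance, u would be 1 when the prediction for the left-out observation is correct and 0 otherwise.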

The reliability of the estimated expected utility can be assessed by estimating its distribution. The distributions of the expected utilities can also be used to compare models, for example, by computing the probability of one model having a better expected utility than some other model. The expected utilities take into account how the model predictions are going to be used and thus may reveal that even the best model selected may be inadequate or not practically better than the previously used models.
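The idea of comparing models through the distributions of their expected utilities can be illustrated with a short sketch. It assumes that pointwise utilities (for example, 1 for a correct and 0 for an incorrect cross-validated prediction) are already available for each model on the same data points, and it uses a Bayesian bootstrap over those values as one possible way to approximate the distributions. The function names and the synthetic data are hypothetical and chosen only for illustration.

```python
import numpy as np


def bayesian_bootstrap_means(values, n_draws=4000, rng=None):
    """Draw samples of the expected utility using a Bayesian bootstrap.

    Each draw reweights the observed pointwise utilities with weights
    sampled from a flat Dirichlet distribution and returns the weighted mean.
    """
    rng = np.random.default_rng(rng)
    values = np.asarray(values, dtype=float)
    # One weight vector per draw; each row sums to one.
    weights = rng.dirichlet(np.ones(values.size), size=n_draws)
    return weights @ values


def prob_first_better(u1, u2, n_draws=4000, rng=None):
    """Estimate the probability that model 1 has the higher expected utility.

    The utilities must be paired: u1[i] and u2[i] refer to the same data point,
    so the comparison is made on the pointwise differences.
    """
    diff = np.asarray(u1, dtype=float) - np.asarray(u2, dtype=float)
    return float(np.mean(bayesian_bootstrap_means(diff, n_draws, rng) > 0))


if __name__ == "__main__":
    # Synthetic 0/1 utilities (correct / incorrect), for illustration only.
    rng = np.random.default_rng(0)
    u_model1 = (rng.random(500) < 0.90).astype(float)
    u_model2 = (rng.random(500) < 0.89).astype(float)

    draws = bayesian_bootstrap_means(u_model1, rng=1)
    low, high = np.percentile(draws, [2.5, 97.5])
    print(f"Model 1 expected accuracy, 95% interval: [{low:.3f}, {high:.3f}]")
    print("P(model 1 better):", prob_first_better(u_model1, u_model2, rng=2))
```

A probability close to 0.5 would suggest that neither model is practically better than the other, while the width of the interval indicates how reliable the expected utility estimate itself is.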

The developed methods have already been applied successfully to model assessment and selection in a real-world concrete quality modeling case, in cooperation with Lohja Rudus. Using the models, and the conclusions the concrete expert drew from them, it is possible to achieve, for example, 5-15% savings in material costs in a concrete factory.

Below is another example, from a case project in which one subgoal was to classify the pixels of forest scene images into tree and non-tree classes. The main problem in the task was the large variance within the classes. The appearance of tree trunks varies in color and texture due to varying lighting conditions, epiphytes, and species-dependent variations. In the non-tree class the diversity is much larger, containing, for example, terrain, tree branches, and sky. This diversity makes it difficult to choose the optimal features for the classification. The figure compares the expected classification accuracies of two models. Model 1 uses 84 texture and statistical features extracted from the images. Model 2 uses only 18 features, selected from the 84 features of Model 1 using the methods developed in the project. Although Model 2 is simpler, it has practically the same expected accuracy as Model 1.

Figure

An example of Bayesian model assessment and selection using expected utilities in a forest scene classification problem. The figure shows the distributions of the expected classification accuracies for two different models that classify image pixels into tree trunks and background. Each distribution describes how likely the different values of the expected utility are.

See also

Cross-validation vs. DIC using stack loss data

References

  • Vanhatalo, J. and Vehtari, A. (2009). Discussion of 'Approximate Bayesian inference for latent Gaussian models by using integrated nested Laplace approximations' by Rue, H., Martino, S., and Chopin, N. Journal of the Royal Statistical Society, Series B (Statistical Methodology), 71(2):383. (Available online 6 April 2009)
  • Vehtari, A. (2007). Discussion of 'Some Aspects of Bayesian Model Selection for Prediction' by Chakrabarti, A. and Ghosh, J. K. In J. M. Bernardo, et al., editors, Bayesian Statistics 8, pp. 83-84. Oxford University Press.
  • Särkkä, S., Vehtari, A., and Lampinen, J. (2004). Time series prediction by Kalman smoother with cross-validated noise density. In IJCNN'2004: Proceedings of the 2004 International Joint Conference on Neural Networks, Budapest, July 2004. [*] The Winner of Time Series Prediction Competition - The CATS Benchmark
  • Vehtari, A. and Lampinen, J. (2003). Expected utility estimation via cross-validation. In J. M. Bernardo, et al., editors, Bayesian Statistics 7, pp. 701-710. Oxford University Press.
  • Vehtari, A. and Lampinen, J. (2002). Bayesian model assessment and comparison using cross-validation predictive densities. Neural Computation, 14(10):2439-2468.
  • Vehtari, A. (2002). Discussion of 'Bayesian measures of model complexity and fit' by Spiegelhalter, D. J., Best, N. G., Carlin, B. P., and van der Linde, A. Journal of the Royal Statistical Society, Series B (Statistical Methodology), 64(4):620.