Bayesian Methods for Neural Networks

Researchers: Jouko Lampinen, Aki Vehtari, Paula Litkey, Ilkka Kalliomäki, Simo Särkkä, Tomas Martinsen, and Jani Lahtinen

The project is part of the PROMISE (Applications of Probabilistic Modelling and Search Methods) project funded by TEKES and participating industries.

The goal of the project is to study Bayesian methods for neural networks. Neural networks have become very popular in recent years for classification and non-linear function approximation. The main difficulty with neural networks is controlling the complexity of the model. In addition, the tools for analyzing the results (such as confidence intervals) are rather primitive. The Bayesian approach provides efficient tools for model selection and analysis, and is based on a consistent way of doing inference by combining the evidence from the data with prior knowledge about the problem.
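
As a reminder of the machinery involved (the notation below is generic and not tied to any particular model in the project), the posterior distribution of the network weights w combines the likelihood of the data with the prior, and predictions are obtained by integrating over this posterior:

\[
  p(w \mid D) = \frac{p(D \mid w)\, p(w)}{p(D)},
  \qquad
  p(y^{*} \mid x^{*}, D) = \int p(y^{*} \mid x^{*}, w)\, p(w \mid D)\, dw .
\]

In practice the integral is intractable and is approximated, for example by Markov chain Monte Carlo sampling of the weights.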

In the project we develop methods for using more general prior distributions for model parameters and noise models than those currently available. Examples of recent results are non-Gaussian noise models with arbitrary correlations between model outputs, outlier-tolerant noise models with adaptive tail thickness, and model performance assessment and model choice using cross-validation predictive densities and Bayesian bootstrap methods.
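
As an illustration of an outlier-tolerant noise model with adaptive tail thickness (given here only as a generic example; the report does not spell out the exact form used), one can let the residuals follow a Student-t distribution whose degrees of freedom \nu are treated as an unknown parameter:

\[
  y_i = f(x_i, w) + \epsilon_i, \qquad \epsilon_i \sim t_{\nu}(0, \sigma^2).
\]

Small values of \nu give heavy tails that accommodate outliers, while large \nu recovers the Gaussian noise model; \nu and \sigma are given priors and inferred from the data together with the weights.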

We have used Bayesian neural networks in a number of applications. To summarize the results, Bayesian neural networks perform consistently better than other statistical estimation methods in both regression and classification problems. When performance is poor, it is usually easy to identify the statistical assumptions that are invalid in the case under scrutiny and to build suitable prior distributions into the model.

Below is an example from a case project where we developed a model for predicting the quality properties of concrete. The model contained 19 input variables related to the properties of the stone material (natural or crushed, size and shape distributions of the grains, mineralogical composition), the additives, and the amounts of cement and water. The quality variables included various compression strengths, densities, air content, bleeding, spreading, and slump of the concrete. Figure 1 shows the dependence of the slump on one of the grain shape measurements.
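
The predictive quantities shown in Figure 1 can be read off the Monte Carlo samples of the model. The Python sketch below is only an illustration under assumed names and shapes (a one-hidden-layer network and randomly generated placeholder weight samples); in practice the samples would come from an MCMC run, and the intervals are for the model output with the other inputs held fixed.

  import numpy as np

  def mlp_output(x, W1, b1, W2, b2):
      # One-hidden-layer MLP with tanh units; x has shape (n, d_in).
      return np.tanh(x @ W1 + b1) @ W2 + b2

  # Placeholder posterior samples of the weights (random here; in practice
  # they would be drawn by Markov chain Monte Carlo).
  rng = np.random.default_rng(0)
  n_samples, d_in, n_hidden = 200, 19, 10
  samples = [
      (rng.normal(size=(d_in, n_hidden)), rng.normal(size=n_hidden),
       rng.normal(size=(n_hidden, 1)), rng.normal(size=1))
      for _ in range(n_samples)
  ]

  # Vary one input (e.g. the shape measurement) over a grid while the
  # remaining 18 inputs are held at fixed values.
  grid = np.linspace(-2.0, 2.0, 50)
  x = np.zeros((grid.size, d_in))   # fixed values for the other inputs
  x[:, 0] = grid                    # the input being varied

  # Predictive mean and 90% interval from the Monte Carlo samples.
  preds = np.stack([mlp_output(x, *s).ravel() for s in samples])
  mean = preds.mean(axis=0)
  lower, upper = np.percentile(preds, [5, 95], axis=0)

The mean corresponds to the red line in Figure 1 and the 5% and 95% percentiles to the blue 90% limits.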


  
Figure 1: A concrete example of a Bayesian neural model for concrete quality. The figure shows the dependence of the slump on the shape measurement of the 0.8-1.0 mm grains. The 18 input variables not shown are held at fixed values. The red line shows the expected value of the slump given the shape measurement, the blue lines show the 90% confidence limits, and the gray level shows the posterior probability p(slump | shape 0.8-1.0).

