Modelling complex systems accurately enough to predict the outcomes of the studied system under given conditions poses a difficult measurement and estimation task. Purely reductionist modelling of even modestly complex systems is impossible: a litre of ideal gas would require knowing the positions and velocities of the order of 10^22 particles; modelling brain functions on the level of single neurons would require knowledge of the function of each neuron and of all the connections, which is out of reach due to both measurement and computational limitations. In the system-level statistical approach, only probability distributions of the individual elements are considered, and models operate on average values or other statistical quantities. A purely probabilistic treatment of the modelling task leads to the Bayesian approach, where probability distributions are used to represent uncertainty due to stochastic elements (aleatoric probability) and also due to not knowing the actual values (epistemic probability). Consequently, the result of the analysis is the posterior distribution of the end variables, given the observed variables and prior assumptions that represent the knowledge before the data is observed. The approach requires integration over high-dimensional distributions, which has long made it practical only for simple models and distributions. The recent increase in computing power, together with some advances in theoretical and numerical methods, has made this approach feasible in a large set of complex tasks. For example, this approach yields statistical analysis tools for artificial neural networks, which are flexible models but difficult to control and analyze by other statistical or machine learning techniques.
Our research concentrates on hierarchical Bayesian modelling, developing methods for measuring the performance of the models, and developing efficient Monte Carlo techniques. Application areas include statistical modelling problems, object recognition and computer vision, inverse problems in brain imaging, and intelligent human-machine interfaces.
Researchers: Aki Vehtari, Simo Särkkä, Jouko Lampinen, Toni Auranen, Aapo Nummenmaa, Elina Parviainen, Jarno Vanhatalo
The main research areas of the Bayesian methodology group are model assessment and the estimation of predictive performance, the elicitation and inclusion of structural information, and advanced dynamic models (see the following chapters for case examples). Another important methodological research topic is how to elicit expert knowledge and transfer it to a probabilistic model in application problems. Examples of important model concepts used are:
To be able to tackle more challenging scientific problems, it is necessary to research methods for constructing more elaborate models and for eliciting prior knowledge from experts in the applied research area. Complex models may have a large number of unknown parameters, for example thousands in brain signal analysis, which causes difficulties for the computational methods used in Bayesian integration. The Bayesian methodology group supports applied Bayesian research in the laboratory by providing expertise in model construction and computation.
For example, methods developed in the group were used in a concrete quality prediction problem in collaboration with concrete expert Dr.Tech. Hanna Järvenpää (Lohja Rudus Oy). The model assessment methods played an important part in describing the reliability of the predictions. Using Bayesian modeling in this challenging problem produced excellent results. By using the models, and the conclusions the concrete expert drew from them, it is possible to achieve 5–15% savings in a concrete factory. Furthermore, it is possible to reduce the proportion of natural gravel from 50–100% to 5–20% and thus help save non-renewable natural resources.
Researchers: Aki Vehtari and Jouko Lampinen
Statistical and machine learning models are becoming increasingly complex, due to advances in computational methods and computer performance. This emphasizes the importance of methods for estimating the performance of the models, for comparing and choosing the model, and for predicting the usefulness of the model in the target task.
We have been developing methods for estimating and comparing complex Bayesian models, such as neural networks, and assessing (predicting) their practical performance. When dealing with complex phenomena it is reasonable to assume that all models are approximations (there is no "correct" model), in which case the only reasonable way of comparing models is to compare the consequences of using them, that is, their predictive utilities.
The ideal approach is to use external validation, where the model is used to make predictions on future data, and the collected data is then compared with the predictions. Before new data is available, external validation can be approximated using the present data, with three basic approaches: analytic, asymptotic and sample re-use. All of these were proposed decades ago, but their use in Bayesian modeling has not been widespread. Advances in computational methods and computer performance now allow more complex models, and thus there is increased interest in approaches for estimating the predictive performance of a model.
Our research on model assessment is based on cross-validation (sample re-use) approximations of the predictive performance, since this approach has several benefits over the others. For example, it can be used for arbitrary likelihoods and utility functions, and it does not rely on asymptotic approximations. The main contributions so far have been theoretical and methodological advances, which provide a solid framework for assessing the performance of complex models while properly taking into account the associated uncertainties.
An important work in progress is model selection in the case of a large number of models, together with estimation of the selection-induced bias; both are common problems in variable selection. We assume that we have been able to construct the full model, which we think gives the best predictions given the data and our prior beliefs. The proposed method is based on the Kullback-Leibler divergence from the predictive distribution of the full model to the predictive distribution of the reduced submodel. The goal is to find the simplest submodel whose predictive distribution is similar to that of the full model.
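To make the sample re-use idea concrete, the following is a minimal sketch of leave-one-out cross-validation for a deliberately simple model. The model (a Gaussian mean with known noise variance and a conjugate normal prior), the function names and the default prior values are all assumptions chosen for illustration; they are not the models or code used in the research described here.

```python
import math


def normal_logpdf(x, mean, var):
    """Log density of a normal distribution N(mean, var) at x."""
    return -0.5 * (math.log(2.0 * math.pi * var) + (x - mean) ** 2 / var)


def posterior_predictive(train, prior_mean, prior_var, noise_var):
    """Posterior predictive mean and variance for a new observation.

    Illustrative model: y ~ N(mu, noise_var), mu ~ N(prior_mean, prior_var).
    """
    precision = 1.0 / prior_var + len(train) / noise_var
    post_var = 1.0 / precision
    post_mean = post_var * (prior_mean / prior_var + sum(train) / noise_var)
    return post_mean, post_var + noise_var


def loo_log_predictive(data, prior_mean=0.0, prior_var=100.0, noise_var=1.0):
    """Mean leave-one-out log predictive density (higher is better)."""
    total = 0.0
    for i, y in enumerate(data):
        train = data[:i] + data[i + 1:]  # hold out observation i
        mean, var = posterior_predictive(train, prior_mean, prior_var, noise_var)
        total += normal_logpdf(y, mean, var)
    return total / len(data)


if __name__ == "__main__":
    data = [0.9, 1.1, 1.0, 0.8, 1.2]
    # Compare two candidate models that differ only in the assumed noise variance.
    for noise_var in (0.025, 25.0):
        print(noise_var, loo_log_predictive(data, noise_var=noise_var))
```

The log predictive density is only one possible utility; the same loop works with any utility function evaluated on the held-out observation.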
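The submodel search can be sketched as follows, under strong simplifying assumptions: every predictive distribution is taken to be a univariate Gaussian at each prediction point, the divergence is averaged over those points, and a submodel counts as "similar" when the average falls below a user-chosen tolerance. The function names, the tolerance rule and the toy numbers are hypothetical and only illustrate the idea.

```python
import math


def kl_gaussian(mean_full, var_full, mean_sub, var_sub):
    """KL divergence from N(mean_full, var_full) to N(mean_sub, var_sub)."""
    return 0.5 * (math.log(var_sub / var_full)
                  + (var_full + (mean_full - mean_sub) ** 2) / var_sub
                  - 1.0)


def mean_kl(full_preds, sub_preds):
    """Average KL over prediction points; each entry is a (mean, variance) pair."""
    pairs = zip(full_preds, sub_preds)
    return sum(kl_gaussian(mf, vf, ms, vs)
               for (mf, vf), (ms, vs) in pairs) / len(full_preds)


def simplest_close_submodel(full_preds, submodels, tolerance):
    """Return the name of the smallest submodel within the tolerance, or None.

    submodels: list of (number_of_variables, name, predictions) tuples.
    """
    for _size, name, preds in sorted(submodels, key=lambda s: s[0]):
        if mean_kl(full_preds, preds) <= tolerance:
            return name
    return None


if __name__ == "__main__":
    full = [(0.0, 1.0), (1.0, 1.0)]
    candidates = [
        (1, "one variable", [(1.0, 1.0), (2.0, 1.0)]),
        (2, "two variables", [(0.1, 1.0), (1.1, 1.0)]),
    ]
    print(simplest_close_submodel(full, candidates, tolerance=0.01))
```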
The results have direct applications to various industrial problems in numerous projects, and some of the models are currently in use in, for example, the concrete and steel manufacturing industries.
Researchers: Simo Särkkä, Aki Vehtari and Jouko Lampinen
This project is part of the Tekes project Development of Management Systems for Infrastructure Maintenance in the Infra Technology Programme. The goal of this project is to examine the probabilistic roots of Kalman filtering theory and to find ways to go beyond the classical methods by combining the advantages of the classical and modern methodology. The results of the project have applications, for example, in the following fields:
Figure 1 shows example results from the Kalman smoother prediction of the CATS benchmark time series. Figure 2 is a very simplified example of the estimation of a time-varying noise variance (or standard deviation) in a Gaussian signal. The noise process was modeled as the logarithm of a Brownian process with approximately known parameters.
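For readers unfamiliar with the classical machinery, here is a minimal sketch of a scalar Kalman filter followed by a Rauch-Tung-Striebel smoother. It assumes a random-walk state model with process noise variance q and measurement noise variance r, and it treats missing measurements (None) as gaps to be predicted over. This toy model and all names are illustrative assumptions; it is not the model used for the CATS data.

```python
def kalman_filter(ys, q, r, m0=0.0, p0=1.0):
    """Filter a scalar random walk x_k = x_{k-1} + w_k, y_k = x_k + v_k.

    ys may contain None for missing measurements (prediction only).
    Returns the lists of filtered means and variances.
    """
    means, variances = [], []
    m, p = m0, p0
    for y in ys:
        p = p + q                      # prediction step (mean is unchanged)
        if y is not None:
            gain = p / (p + r)         # Kalman gain
            m = m + gain * (y - m)     # update with the measurement
            p = (1.0 - gain) * p
        means.append(m)
        variances.append(p)
    return means, variances


def rts_smoother(means, variances, q):
    """Rauch-Tung-Striebel backward pass for the same random-walk model."""
    sm, sp = means[:], variances[:]
    for k in range(len(means) - 2, -1, -1):
        p_pred = variances[k] + q      # predicted variance for step k + 1
        g = variances[k] / p_pred      # smoother gain
        sm[k] = means[k] + g * (sm[k + 1] - means[k])
        sp[k] = variances[k] + g * g * (sp[k + 1] - p_pred)
    return sm, sp


if __name__ == "__main__":
    ys = [1.0, 1.0, None, None, 1.0, 1.0]   # two missing samples in the middle
    fm, fv = kalman_filter(ys, q=0.1, r=0.5)
    sm, sv = rts_smoother(fm, fv, q=0.1)
    for row in zip(ys, fm, fv, sm, sv):
        print(row)
```

Inside the gap the filtered variance grows with every step, while the smoothed variance is smaller because the smoother also uses the measurements that follow the gap.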
1 Särkkä, S., Vehtari, A., and Lampinen, J. Time series prediction by Kalman smoother with cross-validated noise density. In Proc. IJCNN 2004, 2004
Figure 1: The Kalman smoother prediction of the CATS Time Series Prediction Competition data. The results are shown for the last two gaps at 3981 – 4000 (left) and 4981 – 5000 (right). The gray line is the true signal, the dashed line is the long term prediction result, and the black line is the combined long and short term prediction result.
Figure 2: Example of estimation of time-varying noise. The signal and the estimated quantiles of noise (left) and the true and the estimated standard deviation as function of time (right).
Researchers: Simo Särkkä, Aki Vehtari and Jouko Lampinen
The goal of the project was to develop multiple target tracking methods for tracking an unknown number of targets or signals in a cluttered and noisy multi-sensor environment. The idea of multiple target tracking is to optimally fuse information from sensor measurements and modeled target dynamics to form the best possible estimates of the states of multiple targets (e.g., positions, velocities and attributes) and their uncertainties. The models and methods used in this project were based on Bayesian filtering theory.
The main topic of the project was tracking an unknown number of targets, which is a substantially harder task than tracking a known number of targets. Because the number of visible targets changes over time, the estimation has to be done on-line, and the estimation procedure cannot simply be separated into the subtasks of estimating the number of targets and estimating their states. The developed estimation method forms a joint hybrid Gaussian/MC estimate of the states of an unknown number of targets using a Rao-Blackwellized particle filter.
The multiple target tracking methods also have applications in other fields:
Figure 3 shows the estimation result in the case of an unknown number of 1D signals. Figure 4 shows the corresponding results for a 2D state estimation (tracking) problem.
Figure 3: Filtering (left) result, smoother result (middle) and the estimated number of signals (right) of 1D scenario with an unknown number of signals
Figure 4: Filtering (left) result, smoother result (middle) and the estimated number of signals (right) of 2D tracking scenario with an unknown number of targets
Researchers: Aki Vehtari, Elina Parviainen, Markus Siivola and Jouko Lampinen
The project is part of the FinnWell healthcare technology programme. The focus of the project is to develop systems for analyzing healthcare data. The goal is to create tools that help healthcare agents (e.g. doctors and administration) to produce and evaluate regional healthcare key figures, and to anticipate the expected cost effect of a treatment for a single patient or of a treatment process.
In our part of the project, we develop a method for the analysis of large-scale patient data, in which time-dependent phenomena and various hierarchical levels (a single patient, regional, hospital, healthcare region) can be taken into account. The method will be based on hierarchical temporal models. The pilot projects will be in orthopaedics and special services for the elderly.
We also develop theme map software for representing regional healthcare key figures (e.g. mortality, diagnoses). Key figures are shown in standardized form using color codes, and variation in time is shown as sequential images or movies.
Researchers: Juho Kannala, Jukka Laurila, Sami Brandt, Aki Vehtari and Jouko Lampinen
The project’s goal was to develop methods for the analysis of video sequences recorded by a robot moving in the sewer. The project was done in co-operation with VTT Building and Transport and was funded by Tekes.
The work was divided into two parts: 1) automatic detection of pipe surface defects and pipe joints, and 2) automatic reconstruction of the 3D shape of the pipe. Displaced joints and surface cracks are among the most common types of defects in a sewer pipe. Detecting the cracks is challenging because of the large variation in surface texture. We tested several line detection algorithms for detecting cracks in the pipe surface and joints between pipe sections. Figure 5 shows an example of a pipe segment with crack candidates and joints. Post-processing included thresholding with hysteresis and by feature size, and naïve Bayes classification to combine information from the location and surrounding texture. The approach was compared to crack detection made by an expert. The results showed that the approach based on edge line detectors could detect obvious cracks, and that the results improved on a previous line-detector-based method. However, the expert could also detect faint cracks based on additional information, such as the trail left by water dripping through a crack and the continuation of a crack on the other side of the pipe. Implementing similar expert knowledge and a more holistic approach is, however, not trivial.
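The hysteresis step mentioned above can be sketched generically as follows: pixels whose line-detector response exceeds a high threshold are accepted outright, and pixels above a lower threshold are accepted only if they are connected to an accepted pixel. The 8-connectivity, the function name and the threshold values are assumptions for illustration; the project's actual post-processing is not specified here.

```python
from collections import deque


def hysteresis_threshold(response, low, high):
    """Return a boolean mask of accepted pixels.

    response: 2D list of detector responses (list of rows).
    A pixel is kept if it is >= high, or if it is >= low and 8-connected
    (possibly through other kept pixels) to a pixel that is >= high.
    """
    rows, cols = len(response), len(response[0])
    keep = [[False] * cols for _ in range(rows)]
    queue = deque()

    # Seed the search with the strong pixels.
    for r in range(rows):
        for c in range(cols):
            if response[r][c] >= high:
                keep[r][c] = True
                queue.append((r, c))

    # Grow from the seeds through weak-but-connected pixels.
    while queue:
        r, c = queue.popleft()
        for dr in (-1, 0, 1):
            for dc in (-1, 0, 1):
                nr, nc = r + dr, c + dc
                if (0 <= nr < rows and 0 <= nc < cols
                        and not keep[nr][nc]
                        and response[nr][nc] >= low):
                    keep[nr][nc] = True
                    queue.append((nr, nc))
    return keep


if __name__ == "__main__":
    response = [
        [0.9, 0.5, 0.1, 0.5],
        [0.1, 0.1, 0.1, 0.1],
    ]
    for row in hysteresis_threshold(response, low=0.4, high=0.8):
        print(row)
```

Compared with a single threshold, this keeps faint continuations of a strong crack candidate while rejecting isolated weak responses caused by texture.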
Figure 5: Crack detection results. (a) Original image of ’unwrapped’ pipe surface. (b) Detected cracks and joints.
Information on the shape of the sewer pipe is important, because bends and compressions may indicate an upcoming failure. In order to obtain 3D information from the video, the imaging geometry of the fish-eye lens camera must be determined. We developed an accurate and easy-to-use method for the calibration of fish-eye lenses. The calibration is possible using only one view of a planar calibration object, as Figure 6 illustrates. After solving the calibration problem, we were able to use known multiple-view techniques to track points through the image sequence and to make a 3D reconstruction of the sewer pipe.
Figure 6: Fish-eye lens calibration using only one view. (a) Original image. (b) The image corrected to follow the pinhole model. Straight lines are straight as they should be.
Researchers: Toni Tamminen and Jouko Lampinen
The goal of the project is to develop a system that can locate and recognize objects in a natural scene. In our approach we study model based methods using full Bayesian inference. The objects in a scene are defined by prior models that are learned from example images.
We have developed a distortion tolerant feature matching method based on probability distributions of Gabor filter responses. An object is defined as a set of locations, with associated Gabor features, and a learned prior model that defines the variations of the feature locations. The appearance and shape models are combined to produce the posterior distribution of feature locations.
For exploring the posterior distribution, we have constructed efficient MCMC samplers for drawing samples from the posterior distributions of the feature locations, mainly using Gibbs and Metropolis sampling. We have also developed a sequential Monte Carlo approach which handles multimodal posterior distributions better than MCMC samplers. This is especially important when some of the object features are occluded, as is often the case in real matching situations. Currently we are extending the matching model to multiple resolutions, which would allow the matching of objects of greatly varying sizes. Figure 7 shows an example of the sequential matching process and Figure 8 illustrates matching when the target objects are occluded.
Figure 7: Sequential feature matching. The black circles mark the drawn locations of the current feature, while the green circles are the previously drawn features. The shape (yellow lines) represents the mean of the shape prior.
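As a reminder of what Metropolis sampling does, the following is a minimal random-walk Metropolis sketch for a one-dimensional target density. The target used in the example (a Gaussian centred at 2), the step size and the function names are illustrative assumptions and have nothing to do with the feature-location posterior of the matching model.

```python
import math
import random


def metropolis(log_target, x0, n_samples, step=1.0, seed=0):
    """Random-walk Metropolis sampler for a one-dimensional density.

    log_target: function returning the unnormalized log density at x.
    Returns the list of states visited, including repeated states
    that result from rejected proposals.
    """
    rng = random.Random(seed)
    x, log_p = x0, log_target(x0)
    samples = []
    for _ in range(n_samples):
        proposal = x + rng.gauss(0.0, step)       # symmetric proposal
        log_p_new = log_target(proposal)
        # Accept with probability min(1, p(proposal) / p(x)).
        if rng.random() < math.exp(min(0.0, log_p_new - log_p)):
            x, log_p = proposal, log_p_new
        samples.append(x)
    return samples


if __name__ == "__main__":
    # Illustrative target: unnormalized log density of N(2, 1).
    draws = metropolis(lambda x: -0.5 * (x - 2.0) ** 2, x0=0.0, n_samples=20000)
    kept = draws[1000:]                            # discard burn-in
    print(sum(kept) / len(kept))
```

A sampler of this kind moves by small local steps, which is one reason it can get stuck in a single mode when the posterior is multimodal.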
Figure 8: Matching in the presence of occlusion. Even though the target objects are heavily occluded, the system is able to find the approximate locations of the features.
Researchers: Timo Kostiainen and Jouko Lampinen
The goal of this work is to develop computationally efficient techniques for the division of natural colour images into meaningful segments. The results can be applied to further processing of the image, for example object recognition.
Our approach is based on statistical models for the textures of the segments. We use a probabilistic Markov chain Monte Carlo (MCMC) algorithm to determine how to divide a given image into segments. The MCMC approach requires the processing of a large number of different sample segmentations. The computational cost depends critically on the quality of these samples. In this work we develop efficient methods to generate samples for the algorithm by taking advantage of many types of relevant image information such as edges and homogeneous areas.
Advantages of this model-based, probabilistic approach are numerous. The model allows the inclusion of different texture models as well as new methods for producing proposal samples. The model provides a flexible framework for scene analysis, since it gives a probabilistic explanation to each part of the image.
We have applied the results to a robot navigation problem, where a vision system is used for real-time path planning. We are also extending the work to take into account properties of specific types of objects.
Figure 9: Examples of segmentation results.
Gabor filters are information-theoretically optimal oriented bandpass filters which have traditionally been used in pattern recognition as a generic framework for the representation of visual images. Gabor-based features are widely used in face recognition, for example. Neurological studies have found Gabor-type structures in the visual cortex of mammals. This fact suggests that the Gabor representation is an efficient one in pattern recognition tasks.
Steerable filters are another variety of 2D oriented filters. While non-optimal in terms of the joint space-frequency uncertainty principle, steerable filters have other desirable properties as oriented feature detectors. The most notable of these is the ability to compute filter responses in an arbitrary orientation by weighting the responses of a fixed filter bank with a handful of different orientations.
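The steering property is easiest to see with the textbook first-derivative-of-Gaussian filter, which is exactly steerable from two basis kernels: the kernel oriented at angle theta equals cos(theta) times the x-derivative kernel plus sin(theta) times the y-derivative kernel. The sketch below checks this numerically. It uses this standard example rather than Gabor filters, and the kernel size, sigma and function names are assumptions for illustration.

```python
import math


def gaussian_derivative_basis(half_size, sigma):
    """Return the two basis kernels: x- and y-derivatives of a 2D Gaussian."""
    gx, gy = [], []
    for y in range(-half_size, half_size + 1):
        row_x, row_y = [], []
        for x in range(-half_size, half_size + 1):
            g = math.exp(-(x * x + y * y) / (2.0 * sigma ** 2))
            row_x.append(-x / sigma ** 2 * g)
            row_y.append(-y / sigma ** 2 * g)
        gx.append(row_x)
        gy.append(row_y)
    return gx, gy


def steer(gx, gy, theta):
    """Synthesize the kernel at orientation theta from the two basis kernels."""
    c, s = math.cos(theta), math.sin(theta)
    return [[c * a + s * b for a, b in zip(row_x, row_y)]
            for row_x, row_y in zip(gx, gy)]


def rotated_kernel(half_size, sigma, theta):
    """Directly evaluate the derivative-of-Gaussian kernel oriented at theta."""
    c, s = math.cos(theta), math.sin(theta)
    kernel = []
    for y in range(-half_size, half_size + 1):
        row = []
        for x in range(-half_size, half_size + 1):
            u = c * x + s * y                      # coordinate along theta
            g = math.exp(-(x * x + y * y) / (2.0 * sigma ** 2))
            row.append(-u / sigma ** 2 * g)
        kernel.append(row)
    return kernel


if __name__ == "__main__":
    gx, gy = gaussian_derivative_basis(3, 1.5)
    steered = steer(gx, gy, 0.7)
    direct = rotated_kernel(3, 1.5, 0.7)
    print(max(abs(a - b) for ra, rb in zip(steered, direct) for a, b in zip(ra, rb)))
```

Because filtering is linear, the same weights applied to the two basis filter responses give the response at orientation theta without filtering the image again.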
We have derived analytical steering equations for Gabor filters, which enable Gabor filters to be used as steerable filters. Some families of steerable filters are quite close to Gabor filters in terms of impulse responses, and the steering performance of Gabor filters can be understood via this connection.
Steerable filters provide a computationally efficient way to implement rotation-invariant feature detection. With suitable parameters, Gabor filters offer approximate steerability while also having good feature localization capability and angular resolution. The design of a filter bank for feature detection is, in general, a compromise between feature specificity and genericity, but the properties of the filters and their spatial arrangement have a large effect on detection performance.
Figure 10: Feature similarity functions in spatial and orientation dimensions obtained using three different filter banks. Left: Original images, with the matching local maxima marked with red crosses. Right, top row: a) Inner product similarity using regular Gabor filters. b) Similarity using nearest neighbor rotation invariance with the same filters. The feature loses specificity and localization becomes worse. c) Same filters and rotation invariance method using a different spatial arrangement of the filters. Localization is greatly improved. Bottom row: Corresponding similarities in the orientation dimension evaluated at the spatial local maxima of the similarity functions.
Researchers: Toni Auranen, Iiro P. Jääskeläinen, Jouko Lampinen, Aapo Nummenmaa, Mikko Sams, Aki Vehtari
The statistical brain signal analysis project is a multidisciplinary project combining the expertise of the Bayesian Methodology group and the Cognitive Science and Technology group.
Localizing the neural currents indicating brain activity based on noninvasive MEG and EEG measurements (i.e., solving the electromagnetic inverse problem) is most naturally formulated in probabilistic terms and thus becomes a problem of statistical inference. Because of the ill-posedness of the inverse problem, reliable inference cannot be made based on the data only. Some additional a priori information must be provided to obtain sensible results, necessitating a Bayesian treatment of the problem.
Our aim is to apply the methods of Bayesian data analysis to the study of cognitive brain functions as revealed by MEG, EEG and fMRI. Our focus is especially on the computationally more intensive methods such as Markov chain Monte Carlo (MCMC). By using a state-of-the-art data simulation model, we have studied generalizations of previously proposed MEG/EEG data-analysis methods in collaboration with Massachusetts General Hospital–Harvard Medical School NMR Center (Dr. John W. Belliveau and Dr. Matti S. Hämäläinen).
We performed a Bayesian analysis of the MEG inverse problem with $\ell_p$-norm priors. Our model contains as special cases the minimum-norm estimate (MNE; $\ell_2$-norm prior) and the minimum-current estimate (MCE; $\ell_1$-norm prior), which are both widely used in practice. With our method, the joint posterior distribution of all the model parameters can be obtained, making it possible to investigate the uncertainties of almost equally probable solution estimates rather than only a maximum a posteriori (MAP) estimate. Furthermore, instead of making an arbitrary choice between the $\ell_1$- and $\ell_2$-norm priors, the norm order is inferred from the data by introducing a continuous parameter p between the limiting cases of MCE and MNE. The method is automatic yet mathematically straightforward, enabling the addition of almost any kind of feasible prior information to improve the source localization. The results with real somatomotor MEG data look promising (Fig. 11).
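As a rough illustration of how one continuous parameter can interpolate between the two estimates, a norm-based prior on the source currents can be written in the following generic form. The symbols $s_i$ (source current amplitudes), $\lambda$ (a scale parameter) and $N$ (the number of sources) are notation assumed here for illustration, and the exact parameterization used in the work may differ.

```latex
p(\mathbf{s} \mid p, \lambda) \;\propto\; \exp\!\Big( -\lambda \sum_{i=1}^{N} |s_i|^{p} \Big),
\qquad 1 \le p \le 2 .
```

For p = 2 this is a Gaussian prior, whose MAP estimate corresponds to MNE; for p = 1 it is a Laplace (double-exponential) prior, corresponding to MCE. Treating p as an unknown with its own prior lets the data inform the choice between the two.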
Figure 11: a) Measured data plotted on the MEG helmet sensor locations with red color denoting positive and blue color negative values. b) The solution estimate (red line) is plotted against the original data (blue dots) with one sigma error bars. c) Right index finger lifting produces activation contralaterally on the left hemisphere somatomotor hand areas (plotted on inflated brain surface). d) Posterior distribution of parameter p defining the $\ell_p$-norm order.
We also proposed an alternative hierarchical extension of the model corresponding to the minimum norm estimate. Instead of assuming a single Gaussian prior for the neural currents, we built a hierarchical structure to the model by imposing individual Gaussian priors with a common hyperprior distribution. This method is applicable to full spatiotemporal datasets without significant increase in computational burden. We have made tentative comparisons of the approach with a recently proposed similar variational Bayesian method.
Researchers: Sami Brandt
The field of computer vision is aimed at the development of intelligent artificial vision systems, and at research on image understanding, image analysis and related areas. The geometric branch of computer vision has been focusing on geometry-related problems such as autonomous motion detection, motion estimation, imaging geometry estimation, and 3D reconstruction of the scene. Since the solutions must deal with data corrupted by both measurement noise and outliers, a statistical approach can be seen as the most natural one.
Our aim has thus been to approach geometric problems from a purely statistical viewpoint. In our recent research in this area, we have introduced the probabilistic epipolar constraint (Fig. 12) and developed affine structure-from-motion algorithms by revisiting the problems of affine reconstruction and affine autocalibration. Some of the results have been successfully applied to the image alignment problem in electron tomography.
Figure 12: Illustration of the probabilistic epipolar constraint. (Left) Five points selected on the left image. (Right) One thousand independent samples drawn from each of the epipolar pdfs, which are the probabilistic form of the epipolar lines. The colour indicates the distribution of the probability mass.
Researchers: Jukka Heikkonen, Pasi Jylänki, Laura Kauhanen, Janne Lehtonen, Tommi Nykopp and Mikko Sams
Brain Computer Interfaces (BCIs) enable motor-disabled and healthy persons to operate electrical devices and computers directly with their brain activity. Our approach is based on an artificial neural network that recognizes and classifies different brain activation patterns associated with real and attempted movements. We aim to develop a robust classifier with a short classification time and a low misclassification rate. Figure 13 demonstrates our BCI in use.
Figure 13: The user has an EEG cap on. By thinking about left and right hand movement the user controls the movement of the ball.
We are especially interested in the neurophysiological basis of BCIs. Before the signals can be classified they need to be understood. We concentrate on activity of the motor cortex.
Like most other BCI groups, we measure the electric activity of the brain using electroencephalography (EEG). In addition, we measure the magnetic activity of the brain with magnetoencephalography (MEG, Fig. 14). MEG signals are more local than EEG signals. This facilitates, e.g., the separation of activities generated in the left and right sensorimotor cortices.
Figure 14: Subject is being prepared for MEG experiment.
We have been working on an online EEG-based BCI. EEG signals related to finger movements are used as input to the classifier. The exact timing of the movements is measured with light ports. An LED provides a cue to move. We are now testing the system with healthy subjects and will modify it so that it can also be used by paralyzed users.
Our research is funded by the EU (MAIA project), the Academy of Finland, the Graduate School in Electronics, Telecommunications and Automation and the Finnish Cultural Foundation.
One of the highlights of 2004 was that Laura Kauhanen won the first prize in the poster competition at the BCI 2004 Workshop and Training Course in Graz, Austria.