Researchers: Aki Vehtari, Simo Särkkä and Jouko Lampinen
The main research areas of the Bayesian methodology group are model assessment and the estimation of predictive performance, the elicitation and inclusion of structural information, and advanced dynamic models (see the following chapters for case examples).
Model assessment research is based on the decision-theoretic approach of examining predictive distributions and the relative values of the consequences of using the models. The main emphasis is on how to estimate the predictive performance of complex models and the uncertainty associated with these estimates. Our research is based on cross-validation ideas, because cross-validation has several benefits over other approaches. For example, it can be used with arbitrary likelihoods and utility functions, and it does not rely on asymptotic approximations. The main contributions have been theoretical and methodological advances that provide a solid framework for assessing the performance of complex models while properly taking the associated uncertainties into account. Important work in progress concerns model selection when the number of candidate models is large.
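As a minimal sketch of the cross-validation idea (not the group's own method), the following estimates the expected utility of a model from held-out data and attaches a rough standard error to the estimate. The linear Gaussian plug-in model, the log-density utility, the synthetic data and all function names are assumptions chosen for illustration; any `fit` and `utility` pair could be substituted, which is the flexibility referred to above.

```python
import numpy as np

def kfold_utilities(x, y, fit, utility, k=10, seed=0):
    """Return one held-out utility value per observation via k-fold CV."""
    rng = np.random.default_rng(seed)
    idx = rng.permutation(len(y))
    u = np.empty(len(y))
    for test in np.array_split(idx, k):
        train = np.setdiff1d(idx, test)        # all indices not in this fold
        model = fit(x[train], y[train])        # refit without the held-out data
        u[test] = utility(model, x[test], y[test])
    return u

def fit_linear(x, y):
    """Least-squares line plus residual variance (a plug-in stand-in model)."""
    X = np.column_stack([np.ones_like(x), x])
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ beta
    return beta, resid @ resid / (len(y) - 2)

def log_density(model, x, y):
    """Log density of the held-out y under the fitted Gaussian model."""
    beta, sigma2 = model
    mu = beta[0] + beta[1] * x
    return -0.5 * np.log(2 * np.pi * sigma2) - 0.5 * (y - mu) ** 2 / sigma2

# Synthetic data, for illustration only.
rng = np.random.default_rng(1)
x = rng.uniform(-2, 2, size=100)
y = 1.0 + 0.5 * x + rng.normal(scale=0.3, size=100)

u = kfold_utilities(x, y, fit_linear, log_density)
estimate = u.mean()                           # CV estimate of expected utility
std_error = u.std(ddof=1) / np.sqrt(len(u))   # rough uncertainty of the estimate
print(f"expected log predictive density: {estimate:.3f} +/- {std_error:.3f}")
```

The standard error here treats the per-observation utilities as independent, which is only an approximation because the folds share training data.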
Another important methodological research topic is how to elicit expert knowledge and transfer it to a probabilistic model in application problems. Examples of important model concepts used are:
To tackle more challenging scientific problems, it is necessary to research methods for constructing more elaborate models and for eliciting prior knowledge from experts in the applied research area. Complex models may have a large number of unknown parameters (thousands in brain signal analysis, for example), which causes difficulties for the computational methods used in Bayesian integration. The Bayesian methodology group supports applied Bayesian research in the laboratory by providing expertise in model construction and computation.
For example, methods developed in the group were used in a concrete quality prediction problem in collaboration with the concrete expert Dr.Tech Hanna Järvenpää (Lohja Rudus Oy). The model assessment methods played an important part in describing the reliability of the predictions. Using Bayesian modeling in this challenging problem produced excellent results. By using the models, and the conclusions the concrete expert drew from them, it is possible to achieve savings of 5–15% in a concrete factory. Furthermore, it is possible to reduce the proportion of natural gravel from 50–100% to 5–20% and thus help save non-renewable natural resources.
Researchers: Toni Tamminen and Jouko Lampinen
The goal of the project is to develop a system that can locate and recognize objects in a natural scene. In our approach we study model-based methods using full Bayesian inference. The objects in a scene are defined by prior models that are learned from example images.
We have developed a distortion-tolerant feature matching method based on probability distributions of Gabor filter responses. An object is defined as a set of locations with associated Gabor features, together with a prior model that defines the variations of the feature locations. The eigenshape model corresponds to determining the covariance matrix of the feature locations, which is learned in bootstrap fashion by matching a large number of images using simpler prior models.
We have also constructed efficient MCMC samplers for drawing samples from the posterior distributions of the feature locations, mainly using Gibbs and Metropolis sampling. Currently we are studying a sequential Monte Carlo approach, which will greatly increase the speed of the sampling process. In sequential matching we match the features one after another, using the locations of the previously matched features to aid in the matching of new features. Figure 1 illustrates the object shape model by showing the first few eigenshapes, and Figure 2 shows an example of the sequential matching process.
Figure 1: Leading eigenshapes of faces, learnt from a set of training images. The face on the left has been morphed according to the eigenshapes, in the positive direction (upper row) and the negative direction (lower row). It can be seen that components 2 and 3 are related to rotations of the head, while components 1, 4, and 5 are shape-related.
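The link between the covariance matrix of the feature locations and the eigenshapes can be sketched as a principal component analysis. The code below is an illustration under assumptions, not the project's implementation: the feature locations are synthetic, the array layout `(n_images, n_features, 2)` and the function names `eigenshapes` and `morph` are hypothetical, and the bootstrap matching step that would produce the locations is not shown.

```python
import numpy as np

def eigenshapes(shapes):
    """shapes: array (n_images, n_features, 2) of matched feature locations."""
    flat = shapes.reshape(shapes.shape[0], -1)     # one row per image
    mean = flat.mean(axis=0)
    cov = np.cov(flat, rowvar=False)               # covariance of the locations
    evals, evecs = np.linalg.eigh(cov)             # eigenvalues in ascending order
    order = np.argsort(evals)[::-1]                # largest variance first
    return mean, np.maximum(evals[order], 0.0), evecs[:, order]

def morph(mean, evals, evecs, component, n_std):
    """Move the mean shape n_std standard deviations along one eigenshape."""
    flat = mean + n_std * np.sqrt(evals[component]) * evecs[:, component]
    return flat.reshape(-1, 2)

# Synthetic data: 5 features, one dominant mode of variation plus small noise.
rng = np.random.default_rng(0)
mean_shape = np.array([[0, 0], [2, 0], [2, 2], [0, 2], [1, 1]], dtype=float)
direction = rng.normal(size=10)
direction /= np.linalg.norm(direction)
coeffs = rng.normal(scale=3.0, size=(500, 1))
noise = rng.normal(scale=0.1, size=(500, 10))
shapes = (mean_shape.ravel() + coeffs * direction + noise).reshape(500, 5, 2)

mean, evals, evecs = eigenshapes(shapes)
upper = morph(mean, evals, evecs, component=0, n_std=+2.0)  # positive direction
lower = morph(mean, evals, evecs, component=0, n_std=-2.0)  # negative direction
```

Morphing the mean shape in the positive and negative direction of a component is the operation that Figure 1 visualizes for faces.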
Figure 2: Sequential feature matching. The red dots mark the drawn locations of the current feature, while the green dots are the previously drawn features. The shape (yellow lines) is computed from the object model given the drawn features.
Researchers: Timo Kostiainen, Jouko Lampinen
The goal of this work is to develop computationally efficient techniques for the division of natural colour images into meaningful segments. The results can be applied in further processing of the image, for example in object recognition.
The use of a probabilistic approach together with numerical Markov chain Monte Carlo (MCMC) methods has recently produced promising results in image segmentation. The approach is based on defining a statistical texture model for the image. The stochastic MCMC algorithm is a top-down process in which a very large number of proposal samples are generated and their likelihoods are evaluated against the texture model. Evaluation of the proposals is computationally intensive, and the complexity depends on the quality of the proposal samples.
We have developed methods for producing efficient proposal samples to reduce the computational complexity. We do this by utilizing the bottom-up information that the image probability model provides, as well as cues such as edges. In Figure 3, the advantage is illustrated by comparing the method to the case where no bottom-up information is used.
Figure 3: Efficient proposals vs. random proposals. Left: result of our segmentation method after 179 samples. Center: result of segmentation without bottom-up information in the same processing time (365 samples). Right: evolution of the posterior probabilities in the MCMC chains with (continuous) and without (dashed) bottom-up information.
The results of the MCMC algorithm are in the form of a large number of weighted samples from the posterior distribution. In many statistical analysis problems the distribution can be easily interpreted in terms of descriptive statistics. In the case of image processing, the result is a set of different segmentations of the image, which is awkward for visualization and further processing. Analysis of the posterior distribution is another part of this work.
Researchers: Ilkka Kalliomäki, Jouko Lampinen
Gabor filters are information-theoretically optimal oriented bandpass filters that have traditionally been used in pattern recognition as a generic framework for the representation of visual images. Gabor-based features are widely used in face recognition, for example. Neurological studies have found Gabor-type structures in the visual cortex of mammals. This fact suggests that the Gabor representation is an efficient one in pattern recognition tasks.
Steerable filters are another variety of 2D oriented filters. They have applications in a wide variety of early vision tasks, including edge detection, orientation analysis, texture analysis and stereo vision. While not optimal in terms of the joint space-frequency uncertainty principle, steerable filters have other desirable properties as oriented feature detectors. The most notable of these is the ability to compute filter responses in an arbitrary orientation by weighting the responses of a fixed filter bank with a handful of different orientations. This property is known as 'steerability'.
We have derived analytical steering equations for Gabor filters, which enable Gabor filters to be used as steerable filters. Some families of steerable filters are quite close to Gabor filters in terms of impulse responses, and the steering performance of Gabor filters can be understood via this connection. However, the steerability of Gabor filters is only approximate, and the accuracy depends heavily on the parameters and the number of different orientations in the bank.
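To make the steering property concrete, the sketch below uses the textbook example of an exactly steerable filter, the first derivative of a Gaussian, rather than the Gabor steering equations derived in this work (which are not reproduced here). A filter at any angle is obtained as a weighted sum of just two basis filters, and by linearity of convolution the same weights apply to the filter responses. The kernel size, `sigma`, and function names are illustrative assumptions.

```python
import numpy as np

def gaussian_derivative_basis(size=21, sigma=3.0):
    """Return the sampling grid, the Gaussian, and its x and y derivatives."""
    half = size // 2
    y, x = np.mgrid[-half:half + 1, -half:half + 1].astype(float)
    g = np.exp(-(x ** 2 + y ** 2) / (2 * sigma ** 2))
    gx = -x / sigma ** 2 * g        # basis filter oriented at 0 degrees
    gy = -y / sigma ** 2 * g        # basis filter oriented at 90 degrees
    return x, y, g, gx, gy

def steer(gx, gy, theta):
    """Synthesize the filter at angle theta from the two basis filters."""
    return np.cos(theta) * gx + np.sin(theta) * gy

def direct(x, y, g, sigma, theta):
    """Build the same oriented filter directly, for comparison."""
    u = x * np.cos(theta) + y * np.sin(theta)   # coordinate along direction theta
    return -u / sigma ** 2 * g

sigma = 3.0
x, y, g, gx, gy = gaussian_derivative_basis(size=21, sigma=sigma)
thetas = np.deg2rad([0, 18, 36, 54, 72, 90])
max_errors = [np.abs(steer(gx, gy, t) - direct(x, y, g, sigma, t)).max()
              for t in thetas]
print(max_errors)   # differences are at floating-point rounding level
```

For this filter family the steering is exact; for Gabor filters, as noted above, it is only approximate and depends on the filter parameters and the number of orientations in the bank.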
We intend to apply the results to pose-invariant face recognition, where the variability of features due to orientation is a central problem. Using steerability, a simple and computationally efficient measure of feature similarity which is invariant to rotations in the image plane may be developed.
Figure 4: Steering a Gabor filter bank into arbitrary orientations. The first and second rows show Gabor filters in the spatial domain, and the third and fourth rows show the same filters in the frequency domain. The leftmost and rightmost columns correspond to known orientations in a bank of five different orientations. The four intermediate filters have been computed via analytical steering, and they approximate the exact Gabor filters in the corresponding orientations very well in both the spatial and frequency domains.
Researchers: Simo Särkkä, Toni Tamminen, Aki Vehtari and Jouko Lampinen
The goal of the project was to develop Bayesian sequential Monte Carlo based algorithms for multiple target tracking in a multi-sensor environment. The idea of multiple target tracking is to optimally fuse information from sensor measurements and modeled target dynamics to form the best possible estimates of the states of multiple targets (e.g., positions and velocities) and their uncertainties. The models and methods used in this project were based on Bayesian filtering theory.
The main topic of the project was data association, which makes multiple target tracking a much harder task than single target tracking and also rules out the use of basic Kalman and extended Kalman filters. In multiple target tracking, the algorithm has to estimate which of the targets produced each measurement before it can use the measurements in the actual tracking. In this project we developed a new Rao-Blackwellized Monte Carlo data association algorithm, which efficiently solves the joint data association and tracking problem.
The secondary topic of the project was the modeling of negative information. In a target tracking system, physical sensors report measurements only when they receive some kind of signal that can be further processed into a measurement. In the single-sensor case, all we can do is use these signal-induced measurements. However, when multiple sensors measure signals from the same origin (e.g., the radar of a target), and some sensors can detect the signal while others cannot, our information is increased by knowing that some sensors could not detect the target. This is called negative information.
Figure 5 shows an example of the classical bearings-only multiple target tracking problem, which frequently arises in the context of passive sensor tracking. The particles in the figures are used to visualize the distribution: they are a random sample drawn from the posterior distribution estimate. The actual posterior distribution estimate is a mixture of Gaussians, which is hard to visualize directly. The prior distribution is deliberately selected such that all four crossings of the measurements from the two sensors contain some probability mass, and the distributions of the targets are bimodal, as can be seen in Figure 5. This causes the so-called ghost phenomenon, that is, false detections.
Figure 5: Initially (left), half of the prior probability mass is located on the ghost sensor measurement crossings, and at the beginning of tracking (middle) the multimodality of the posterior distribution can be clearly seen. After a while (right), the posterior distribution becomes unimodal due to the restrictions set by the dynamic model.
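The general idea behind Rao-Blackwellized data association can be sketched in a heavily simplified setting; this is an illustration under assumptions, not the algorithm developed in the project. Each particle samples which target produced the measurement, and conditional on that choice the target state is updated in closed form with a Kalman filter. The scalar static states, the uniform association prior, and the function name `rbmcda_step` are all assumptions, and the dynamic prediction and resampling steps are omitted.

```python
import numpy as np

def gaussian_pdf(y, mean, var):
    return np.exp(-0.5 * (y - mean) ** 2 / var) / np.sqrt(2 * np.pi * var)

def rbmcda_step(means, variances, weights, y, meas_var, rng):
    """One measurement update. means, variances: (n_particles, n_targets)."""
    means, variances = means.copy(), variances.copy()
    n_particles, n_targets = means.shape
    new_weights = np.empty(n_particles)
    assoc = np.empty(n_particles, dtype=int)
    prior = np.full(n_targets, 1.0 / n_targets)     # assumed uniform prior
    for i in range(n_particles):
        # Marginal likelihood of y under each candidate association.
        lik = gaussian_pdf(y, means[i], variances[i] + meas_var)
        post = lik * prior
        total = post.sum()
        j = rng.choice(n_targets, p=post / total)   # sample the association
        # Kalman update of the chosen target only (closed form).
        gain = variances[i, j] / (variances[i, j] + meas_var)
        means[i, j] += gain * (y - means[i, j])
        variances[i, j] *= 1.0 - gain
        new_weights[i] = weights[i] * total         # importance weight update
        assoc[i] = j
    return means, variances, new_weights / new_weights.sum(), assoc

rng = np.random.default_rng(0)
n = 50
means = np.tile([0.0, 10.0], (n, 1))    # two well-separated targets
variances = np.ones((n, 2))
weights = np.full(n, 1.0 / n)
new_means, new_vars, new_weights, assoc = rbmcda_step(
    means, variances, weights, y=9.7, meas_var=0.5, rng=rng)
```

In this example the measurement lies close to the second target, so the sampled associations concentrate on it and only that target's mean and variance change.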
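A small numerical sketch may help show why a missing detection is informative. It assumes a one-dimensional position grid and a hypothetical detection probability that decays with distance from the sensor; neither the detection model nor the parameter values come from the project. A sensor that detects the signal multiplies the prior by its detection probability, while a silent sensor multiplies it by one minus that probability.

```python
import numpy as np

grid = np.linspace(0.0, 100.0, 201)               # candidate target positions
prior = np.full(grid.size, 1.0 / grid.size)       # flat prior over the grid

def detection_prob(sensor_pos, x, p_max=0.9, range_scale=15.0):
    """Hypothetical detection model: most sensitive near the sensor."""
    return p_max * np.exp(-0.5 * ((x - sensor_pos) / range_scale) ** 2)

def update(belief, p_detect, detected):
    """Bayes update with a detection (positive) or a miss (negative)."""
    likelihood = p_detect if detected else 1.0 - p_detect
    posterior = belief * likelihood
    return posterior / posterior.sum()

p_a = detection_prob(20.0, grid)   # sensor A at position 20 heard the signal
p_b = detection_prob(80.0, grid)   # sensor B at position 80 heard nothing

post_positive_only = update(prior, p_a, detected=True)
post_with_negative = update(post_positive_only, p_b, detected=False)

near_b = grid > 60.0
print(post_positive_only[near_b].sum(), post_with_negative[near_b].sum())
```

Including the silent sensor removes probability mass from the region it covers, which is information that the positive detection alone does not provide.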
Researchers: Juho Kannala, Jukka Laurila, Sami Brandt, Aki Vehtari and Jouko Lampinen
The aim of the project is to develop methods for the analysis of video sequences recorded by a robot moving in a sewer. The project is carried out in co-operation with VTT Building and Transport and is funded by TEKES. The work can be divided into two parts: 1) automatic detection of pipe surface defects and pipe joints, and 2) automatic reconstruction of the 3D shape of the pipe. Displaced joints and surface cracks are among the most common types of defects in a sewer pipe.
Detecting the cracks is especially challenging because of the large variation in surface texture. We have applied several line detection algorithms for the detection of cracks in the pipe surface and joints between pipe sections. The method illustrated in Figure 6 is based on forming an approximate Hessian for each pixel in the image. The Hessian has a large positive eigenvalue where there is a dark line in the image; these are crack candidates. Post-processing includes thresholding with hysteresis and thresholding by feature size.
Figure 6: Crack detection results. (a) Original image of ’unwrapped’ pipe surface. (b) Detected cracks and joints.
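The Hessian-based line detection described above can be sketched as follows. This is a simplified illustration, not the project's implementation: the Hessian is approximated with plain finite differences on a synthetic image, whereas real sewer images would need smoothing first, and the hysteresis and feature-size post-processing is replaced by a single fixed threshold chosen for this example.

```python
import numpy as np

def hessian_max_eigenvalue(image):
    """Largest eigenvalue of a finite-difference Hessian at every pixel."""
    gy, gx = np.gradient(image.astype(float))   # derivatives along rows, columns
    hyy, hyx = np.gradient(gy)
    hxy, hxx = np.gradient(gx)
    hxy = 0.5 * (hxy + hyx)                     # symmetrize the mixed derivative
    mean = 0.5 * (hxx + hyy)
    radius = np.sqrt((0.5 * (hxx - hyy)) ** 2 + hxy ** 2)
    return mean + radius                        # closed form for a 2x2 matrix

# Synthetic test image: bright surface with one dark vertical line.
image = np.ones((32, 32))
image[:, 16] = 0.0

lam = hessian_max_eigenvalue(image)
candidates = lam > 0.25          # illustrative threshold, not from the source
print(np.unique(np.nonzero(candidates)[1]))   # columns flagged as crack candidates
```

Across a dark line the intensity has a local minimum, so the second derivative in that direction is positive, which is why the large positive eigenvalue marks the crack candidates.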
Information about the shape of the sewer pipe is important, because bends and compressions may indicate an upcoming failure. In order to obtain 3D information from the video, the imaging geometry of the fish-eye lens camera must be determined. In the project we have developed an accurate and easy-to-use method for the calibration of fish-eye lenses. Calibration is possible using only one view of a planar calibration object, as Figure 7 illustrates. With the calibration problem solved, we will be able to use known multiple-view techniques to track points through the image sequence and to reconstruct them. However, there are still many practical and computational problems to be solved before a reliable reconstruction is obtained.
Figure 7: Fish-eye lens calibration using only one view. (a) Original image. (b) The image corrected to follow the pinhole model. Straight lines are straight as they should be.
Researchers: Sami Brandt and Jukka Heikkonen
The field of computer vision aims at the development of intelligent artificial vision systems and at research on image understanding, image analysis, and related areas. The geometric branch of computer vision focuses on geometry-related problems such as autonomous motion detection, motion estimation, imaging geometry estimation, and 3D reconstruction of the scene. Since the solutions must deal with data corrupted by both measurement noise and outliers, a statistical approach can be seen as the most natural one.
Our aim has thus been to approach geometric problems from a purely statistical viewpoint. Our contributions include a robust estimator that has been proved optimal in the sense of consistency under assumptions similar to those of the ordinary maximum likelihood estimator. The estimator has been applied, for instance, to two-view geometry and its uncertainty estimation with both affine and projective camera models. Other contributions include novel statistical reconstruction algorithms and a probabilistic formulation of the two-view epipolar constraint (Fig. 8).
Figure 8: A point in the left corner of the mouth is selected in the left image of a stereo image pair (not shown here). (a) The maximum likelihood robust estimate for the epipolar line (dashed) and the estimated, conventional confidence intervals of the epipolar line. (b) Probability distribution characterising the probability that any point in the second image is on the true epipolar line. This is, in fact, a probabilistic representation for the epipolar line where the uncertainty of the estimated epipolar geometry has been taken into account. (c) One thousand independent samples drawn from the probability distribution. (Original image copyright belongs to INRIA-Syntim.)
Researchers: Timo Koskela, Jukka Heikkonen, and Kimmo Kaski
Web caching is a technique where Web objects requested by clients are stored in a cache which is located near the clients. Subsequent requests for the same object are then served from the cache, improving the response time for the end users, reducing the overall network traffic, and reducing the load on the server.
Figure 9 (on the left) shows how the requests from the clients are routed through the proxy, which fetches the objects and stores them in the cache. Since the storage space of the cache is limited, an important problem in optimizing the cache's operation is deciding which strategy to use for replacing cached objects. Typically, heuristic rules such as LRU (replace the least recently used object) are used for this purpose. Our proposed model predicts the popularity of each object by utilizing syntactic features collected from the HTTP responses and from the HTML structure of the document. The replacement strategy can then be optimized by taking the predicted object popularities into account.
Figure 9: Left: Proxy cache fetches and stores the objects requested by the clients. Right: Hit rate for LRU and LRU-C (50% and 100% accuracy of the classifier) as a function of the cache size.
In a case study, about 50,000 HTML documents were classified according to their popularity using linear and nonlinear models. The results showed that the linear model could not find a correlation between the features and document popularity. The nonlinear model gave better results, yielding mean classification percentages of 64 and 74 for the documents to be stored in or removed from the Web cache, respectively.
The gained performance improvement was demonstrated by simulations with two common heuristic replacement rules, the LRU and the GDS. Object requests were generated with an analytical model, in order to take different object popularity and object size distributions into account. Figure 9 (on the right) shows the average hit rate (portion of the objects delivered straight from the cache) in the LRU case. The LRU-C utilizes the classification results for the cache objects, and provides a significant improvement in hit rate for all realistic cache sizes.
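The kind of simulation described above can be sketched with a plain LRU cache and a synthetic request stream. This is an illustration under assumptions: the Zipf-like popularity distribution, the unit object sizes, and the cache capacities are not taken from the study, and the classifier-assisted LRU-C variant is not implemented.

```python
from collections import OrderedDict
import numpy as np

class LRUCache:
    """Fixed-capacity cache that evicts the least recently used object."""
    def __init__(self, capacity):
        self.capacity = capacity
        self.items = OrderedDict()

    def request(self, key):
        """Return True on a hit; on a miss, fetch and store the object."""
        if key in self.items:
            self.items.move_to_end(key)         # mark as most recently used
            return True
        self.items[key] = None
        if len(self.items) > self.capacity:
            self.items.popitem(last=False)      # evict least recently used
        return False

def hit_rate(requests, capacity):
    """Fraction of requests served straight from the cache."""
    cache = LRUCache(capacity)
    hits = sum(cache.request(int(r)) for r in requests)
    return hits / len(requests)

# Synthetic workload: object popularity proportional to 1 / rank (assumed).
rng = np.random.default_rng(0)
n_objects = 1000
popularity = 1.0 / np.arange(1, n_objects + 1)
popularity /= popularity.sum()
requests = rng.choice(n_objects, size=20000, p=popularity)

rates = {capacity: hit_rate(requests, capacity) for capacity in (10, 100, 500)}
print(rates)
```

A popularity-aware policy would change only the eviction decision in `request`, for example by preferring to evict objects that a classifier predicts to be unpopular.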
Researchers: Jukka Heikkonen, Laura Laitinen, Janne Lehtonen, Tommi Nykopp and Mikko Sams
Brain Computer Interfaces (BCIs) are intended to enable both severely motor-disabled and healthy people to operate electrical devices and applications directly with brain activity. Our approach is based on an artificial neural network that recognizes and classifies different brain activation patterns associated with motor tasks. By this means we aim to develop a robust classifier with a short classification time and, most importantly, a low rate of false positives (i.e., wrong classifications). Figure 10 demonstrates a BCI in use.
Our group is especially interested in the neurophysiological basis of BCIs. We believe that before the signals can be classified they need to be well understood. We are especially interested in the activation of the motor cortex. Like most BCI groups, we measure the electric activity of the brain using electroencephalography (EEG). In addition to EEG, we measure the magnetic activity of the brain with magnetoencephalography (MEG). MEG signals are more localized than EEG signals and thus give us more accurate information about the brain activity related to, e.g., finger movements. We study the signals, e.g., using time frequency representations (TFRs) and pick out important features from them. Figure 11 shows an example of a TFR.
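One common way to compute a time frequency representation is the short-time Fourier transform, sketched below. The method actually used for the TFRs in this work is not stated, so this is only an assumed stand-in; the sampling rate, window length, and the synthetic signal with a 20 Hz burst are likewise illustrative.

```python
import numpy as np

def stft_power(signal, fs, win_len, hop):
    """Power spectrogram: rows are time frames, columns are frequencies."""
    window = np.hanning(win_len)
    starts = np.arange(0, len(signal) - win_len + 1, hop)
    frames = np.stack([signal[s:s + win_len] * window for s in starts])
    power = np.abs(np.fft.rfft(frames, axis=1)) ** 2
    freqs = np.fft.rfftfreq(win_len, d=1.0 / fs)      # frequency axis in Hz
    times = (starts + win_len / 2) / fs               # frame centres in seconds
    return times, freqs, power

# Synthetic signal: weak noise, with a 20 Hz oscillation from 2 s onwards.
fs = 200
t = np.arange(800) / fs
rng = np.random.default_rng(0)
signal = 0.1 * rng.normal(size=t.size)
signal[t >= 2.0] += np.sin(2 * np.pi * 20.0 * t[t >= 2.0])

times, freqs, power = stft_power(signal, fs, win_len=100, hop=50)
peak_freq_late = freqs[power[-1].argmax()]    # dominant frequency in last frame
print(peak_freq_late)
```

Features for a classifier could then be taken, for example, as the power in a chosen frequency band over a chosen time interval.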
During 2003 we collected data on attempted finger and hand movements from quadriplegic people using combined MEG and EEG. The research was done in the Low Temperature Laboratory in collaboration with the Käpylä Rehabilitation Centre (Käpylän kuntoutuskeskus). We have developed a Matlab-based BCI platform for offline as well as online use. The platform does not depend on the measuring device or the operating system. It can be used as a fast prototyping tool for testing different BCI signal analysis and feedback methods. We intend to build an online EEG-BCI using this platform, and later an MEG-BCI. In the field of signal analysis, sequential classification was introduced. Preliminary results show significant improvement in comparison to many previously used methods.
Figure 10: The user has an EEG cap on. By thinking about left and right hand movement the user controls the virtual keyboard with her brain activity.
Figure 11: TFR of an MEG sensor over the motor cortex. The activation of the brain is plotted with the time information on the x-axis and the frequency information on the y-axis. The subject began to move his right finger at time zero. Strong activation in the 10–30 Hz range can be detected after the movement has ended.