Researchers: Tobias Andersen, Kalle Korhonen, Christina Krause, Riikka Möttönen, Klaus Riederer, Mikko Sams, Kaisa Tiippana
Our main interest is the mechanisms of audiovisual speech perception: we conduct psychophysical experiments to determine how seeing a talking face affects the speech percept. We also test and develop computational models to account for the results. This research will enhance our understanding of the basic mechanisms of human speech perception, and has applications in communication technology, for example in the development of automatic audiovisual speech recognizers.
In a typical experiment, an observer is presented with audiovisual speech, i.e. speech sounds through headphones or loudspeakers together with a talking face on a computer monitor, and the observer reports what he or she heard. By analysing the responses, we can infer mechanisms of speech perception. One of our main research tools is the McGurk effect, in which discordant auditory and visual stimuli are presented. For example, an auditory syllable /pa/ is dubbed onto a visual syllable /ka/; most people perceive this combination as /ta/. The McGurk effect is used to study audiovisual integration because its strength reflects the extent of integration.
Our main projects are summarized below.
We have shown that distracting visual attention weakens the McGurk effect, which indicates that audiovisual integration is not an entirely automatic process. Decreasing the auditory signal-to-noise ratio strengthens the influence of visual speech in a similar way regardless of whether the ratio is manipulated by adding different levels of noise to the speech signal or by reducing the signal intensity. The Fuzzy Logical Model of Perception (FLMP) has been tested on the latter, extensive experimental results, and it accounts for them remarkably well.
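The FLMP combines the auditory and visual evidence for each response alternative multiplicatively and normalizes over the alternatives. A minimal sketch of this response rule follows; the support values are invented for illustration, not fitted experimental data.

```python
# Sketch of the FLMP (Fuzzy Logical Model of Perception) response rule:
# the probability of each response alternative is the product of the
# auditory and visual support values, normalized over all alternatives.
# The support values below are illustrative, not fitted experimental data.

def flmp(auditory_support, visual_support):
    """Return response probabilities given per-alternative support in [0, 1]."""
    combined = {r: auditory_support[r] * visual_support[r]
                for r in auditory_support}
    total = sum(combined.values())
    return {r: s / total for r, s in combined.items()}

# Hypothetical McGurk-style stimulus: the audio mainly supports /pa/, the
# video mainly supports /ka/, and /ta/ gets moderate support from both.
audio = {"pa": 0.8, "ka": 0.1, "ta": 0.5}
video = {"pa": 0.1, "ka": 0.8, "ta": 0.5}

probs = flmp(audio, video)
print(max(probs, key=probs.get))  # prints "ta": the fused McGurk percept
```

The multiplicative combination is what lets moderate support from both modalities (/ta/) outweigh strong support from only one modality, reproducing the McGurk fusion.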
We have also developed a software package for conducting behavioural studies. It enables the controlled presentation of dynamic audiovisual stimuli such as video clips, and records subjects' responses through a keyboard, mouse or separate response pad. The software is flexible, allowing variations in the experimental set-up such as how response alternatives are determined (e.g. set by the experimenter or free-choice by the subject), the number and order of stimulus presentations, the grouping of stimuli, and which responses are recorded. It makes it easy to conduct more extensive and complex experiments than was previously possible.
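The core of such an experiment driver can be sketched as follows; the function names and structure are hypothetical illustrations, not the actual package, and stimulus playback and response collection are stubbed out.

```python
import random

# Hypothetical sketch of a behavioural experiment trial loop: each stimulus
# is presented a set number of times in randomized order, and the response
# to each trial is logged. The real package would present audiovisual clips
# and read a keyboard, mouse or response pad; here those are stub callbacks.

def run_block(stimuli, repetitions, present, collect_response, seed=None):
    """Present each stimulus `repetitions` times in random order; log responses."""
    trials = stimuli * repetitions
    random.Random(seed).shuffle(trials)
    log = []
    for stimulus in trials:
        present(stimulus)                           # e.g. play a video clip
        log.append((stimulus, collect_response()))  # e.g. read the response pad
    return log

# Usage with stub callbacks:
log = run_block(["pa", "ka", "ta"], repetitions=2,
                present=lambda s: None,
                collect_response=lambda: "ta",
                seed=1)
print(len(log))  # prints 6: three stimuli, two presentations each
```

Separating presentation and response collection into callbacks is one way to support the kinds of variation described above, such as experimenter-set versus free-choice responses.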
Research in three-dimensional auditory perception has been carried out using both behavioural and computational approaches. The key issue is to apply digital filters to the measured (individual) head-related transfer functions (HRTFs) that form the basis of human spatial hearing. In the method used, a single computed, perceptually justified value reveals the deviations from the common HRTF database. This basic investigation shows the high baseline quality of the HRTFs used in the perceptual experiments. The accuracy of the measurements has been verified by extensive replicability measures. We have also demonstrated the effects of clothing and hairstyle on the measured HRTF data.
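As background, HRTF-based spatialization amounts to filtering a source signal with the left- and right-ear head-related impulse responses (HRIRs), the time-domain counterparts of the HRTFs. A minimal sketch follows; the filter coefficients are toy values, not measured data.

```python
# Minimal sketch of HRTF-based spatialization: convolve a mono source with
# the left- and right-ear head-related impulse responses (HRIRs).
# The coefficients below are toy values, not measured HRTF data.

def convolve(signal, impulse_response):
    """Direct-form FIR filtering (discrete convolution)."""
    out = [0.0] * (len(signal) + len(impulse_response) - 1)
    for n, x in enumerate(signal):
        for k, h in enumerate(impulse_response):
            out[n + k] += x * h
    return out

# Toy HRIRs for a source on the listener's left: the left ear receives the
# sound earlier and louder than the head-shadowed right ear.
hrir_left = [0.9, 0.3]
hrir_right = [0.0, 0.0, 0.4, 0.1]   # leading zeros model the interaural delay

mono = [1.0, 0.0, 0.0, 0.0]         # unit impulse as the source signal
left = convolve(mono, hrir_left)    # louder, earlier left-ear output
right = convolve(mono, hrir_right)  # quieter, delayed right-ear output
```

The interaural differences in level and delay that the two filters impose are the principal cues the perceptual experiments probe, which is why the baseline quality of the measured HRTFs matters.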