Researchers: Kaisa Tiippana, Mikko Sams, Riikka Möttönen, and Toni Sihvonen
Our main interest is to study the mechanisms of audiovisual speech perception. In other words, our aim is to determine how seeing the talking face affects the speech percept. This will be done by conducting various psychophysical experiments. Computational models will be developed to account for the results. The research will enhance our understanding of the basic mechanisms of human speech perception, and will have applications in communication technology, for example in the development of automatic audiovisual speech recognizers.
In a typical experiment, an observer is presented with audiovisual speech, i.e. speech sounds through headphones or loudspeakers together with a talking face on a computer monitor, and reports what he or she heard. The mechanisms of speech perception can then be inferred from the responses. One of our main research tools is the McGurk effect, in which discordant auditory and visual stimuli are presented: for example, an auditory syllable /pa/ is dubbed onto a visual syllable /ka/. Most people perceive this combination as /ta/. The McGurk effect is used to study audiovisual integration because its strength is directly related to the extent of integration.
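As a minimal illustration of this paradigm, the strength of the McGurk effect can be operationalized as the proportion of fused (/ta/) responses to incongruent auditory /pa/ + visual /ka/ trials. The sketch below is illustrative only; the function name and trial data are hypothetical and not part of the group's actual analysis code.

```python
# Hypothetical sketch: quantify McGurk-effect strength as the proportion
# of fusion ("ta") responses among an observer's categorizations of
# incongruent audio /pa/ + visual /ka/ stimuli.
from collections import Counter

def mcgurk_strength(responses):
    """Return the fraction of 'ta' (fusion) responses, or 0.0 for no data."""
    counts = Counter(responses)
    total = sum(counts.values())
    return counts["ta"] / total if total else 0.0

# Example: 10 incongruent trials; 7 fusion responses gives a strength of 0.7.
trials = ["ta", "ta", "pa", "ta", "ta", "ka", "ta", "ta", "ta", "pa"]
print(mcgurk_strength(trials))  # 0.7
```

A higher proportion of fusion responses indicates stronger audiovisual integration, which is how a manipulation such as distracting visual attention can be shown to weaken the effect.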
The main projects thus far have been (1) attentional influences on audiovisual integration of speech, (2) the effect of articulation on speech perception, and (3) an intelligibility study of our audiovisual speech synthesizer. We have shown that distracting visual attention weakens the McGurk effect, which indicates that audiovisual integration is not an entirely automatic process. We have also demonstrated that speech perception is affected when the observer watches his or her own face in a mirror, and, moreover, that the observer's own silent articulation changes the perception of an external auditory speech signal. The latter result evinces the importance of motor signals from the articulators for speech perception. The intelligibility study confirmed that our audiovisual speech synthesizer significantly enhances the recognition of speech in noise, although there is still room for improvement, since recognition rates were even higher with a natural face. The changes indicated by the study are now being made to the synthesizer, which will then be re-tested.