Psychophysical Research

Researchers: Kaisa Tiippana, Mikko Sams, Tobias Andersen, Christina Krause,
Riikka Möttönen, Klaus Riederer, Jukka Saari, Jyrki Tuomainen

We study the mechanisms of audiovisual speech perception, i.e. how seeing a talking face affects the speech percept, by conducting psychophysical experiments. We also develop and test computational models to account for the results. In a typical experiment, an observer is presented with audiovisual speech, i.e. speech sounds through headphones or loudspeakers together with a talking face on a computer monitor, and reports what he or she heard. One of our main research tools is the McGurk effect. For example, when an auditory syllable /pa/ is dubbed onto a visual syllable /ka/, the combination is typically perceived as /ta/. The McGurk effect is used to study audiovisual integration because its strength reflects the extent of visual influence in speech perception. Our current main research topics are presented below:

Attentional influences on audiovisual speech perception.
We have shown in two different experimental set-ups that distraction of visual attention weakens the McGurk effect, implying that audiovisual integration is not an entirely automatic process.

Audiovisual speech perception in children.
Young children show less visual influence in speech perception than adults. We have shown that children reach adult performance between 8 and 14 years of age. In collaboration with the Auditory Neuroscience Laboratory at Northwestern University, IL, U.S., we are also investigating audiovisual speech perception in children with learning problems who have deficits in the categorization of auditory speech and in phonological awareness.

Modelling of audiovisual speech perception.
The above results and others have been modelled with the Fuzzy Logical Model of Perception developed by D. Massaro and co-workers, which accounts for the results well. The model has also been extended to account for the effect of the auditory signal-to-noise ratio.

Figure 39: The Fuzzy Logical Model of Perception has been modified to account for the effect of auditory signal-to-noise ratio (-12 to 6 dB) in audiovisual speech perception. The new model accounts for both auditory (top left) and audiovisual (bottom) response distributions rather well.
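
As an illustration of the modelling approach, the basic integration rule of the Fuzzy Logical Model of Perception can be sketched as follows: each response category receives a degree of support from the auditory and the visual signal, the two are multiplied, and the products are normalized over all categories. This is a minimal sketch only. The function name and the support values are hypothetical, chosen to mimic a McGurk stimulus (auditory /pa/, visual /ka/); they are not fitted to our data, and the signal-to-noise extension is not shown.

```python
def flmp_response_probabilities(auditory_support, visual_support):
    """Combine per-category auditory and visual support values (each in 0..1)
    with the basic FLMP rule: multiply, then normalize across categories."""
    if auditory_support.keys() != visual_support.keys():
        raise ValueError("both modalities must cover the same categories")
    products = {
        category: auditory_support[category] * visual_support[category]
        for category in auditory_support
    }
    total = sum(products.values())
    if total == 0:
        raise ValueError("at least one category needs non-zero support")
    return {category: value / total for category, value in products.items()}


# Hypothetical support values (assumptions for illustration, not measured data):
# the sound favours /pa/, the face favours /ka/, and /ta/ is moderately
# compatible with both signals.
auditory = {"pa": 0.80, "ta": 0.50, "ka": 0.10}
visual = {"pa": 0.05, "ta": 0.60, "ka": 0.70}

probs = flmp_response_probabilities(auditory, visual)
for category, probability in probs.items():
    print(f"/{category}/: {probability:.2f}")
```

With these illustrative values, /ta/ receives the highest predicted response probability although neither modality favours it on its own, which is the qualitative pattern of the McGurk effect.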

The effect of virtual sound location on audiovisual speech perception.
We vary the virtual sound location by means of head-related transfer functions in order to investigate whether sound location influences audiovisual speech perception.
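
The principle behind virtual sound location can be sketched as follows: a mono signal is filtered separately for each ear with the time-domain counterpart of a head-related transfer function (a head-related impulse response), and the two results are played to the left and right ear. This is a minimal sketch under stated assumptions, not a description of our stimulus preparation. The function names are hypothetical, and the impulse responses are toy values that only mimic an interaural delay and level difference; real responses are measured and much longer.

```python
def convolve(signal, impulse_response):
    """Direct-form linear convolution of two sample sequences."""
    out = [0.0] * (len(signal) + len(impulse_response) - 1)
    for i, sample in enumerate(signal):
        for j, tap in enumerate(impulse_response):
            out[i + j] += sample * tap
    return out


def spatialize(mono, hrir_left, hrir_right):
    """Filter a mono signal with one impulse response per ear and
    return the (left, right) channel pair."""
    return convolve(mono, hrir_left), convolve(mono, hrir_right)


# Toy impulse responses (assumptions for illustration): a source on the
# listener's left reaches the left ear directly, and the right ear two
# samples later and attenuated.
hrir_left = [1.0, 0.0, 0.0, 0.0]
hrir_right = [0.0, 0.0, 0.6, 0.0]

mono = [1.0, 0.5, -0.25]
left, right = spatialize(mono, hrir_left, hrir_right)
print("left: ", left)
print("right:", right)
```

Changing the pair of impulse responses changes the apparent direction of the sound over headphones while the speech signal itself stays the same.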

Articulatory influences on speech perception.
We have found that silent articulation of a vowel influences the subsequent classification of a target vowel continuum: ambiguous vowels at the phoneme boundary are assimilated to the category of the articulated vowel. However, if the context vowel is presented auditorily, a contrast effect is obtained instead, so that the two vowels are perceived as further apart in the vowel space. We are now investigating the origin of the two effects.