Cognitive technology applies knowledge of the neurocognitive mechanisms of communication and cognition to information technology. Such knowledge is crucial, e.g., in constructing advanced user interfaces. Human-machine interaction must take into account the limitations of, e.g., memory and attention. One of the big challenges is to create interfaces that enable interaction natural to human beings: computers should recognize and produce natural speech. In the future, brain activity measured, e.g., by EEG, may be used to control computers directly.
Knowledge of the perceptual systems can be used to improve the perceptual quality of compressed audio signals and images. On the other hand, understanding the basic principles of neurocognitive communication mechanisms makes it possible to develop advanced aids for disabled people. Examples include hearing aids, substitute ways of delivering visual information to blind people, and intelligent prostheses.
Cognitive science and technology research in the Laboratory of Computational Engineering concentrates on basic research and on modelling the integration of multisensory, and especially audiovisual, information. Human information processing is inherently multimodal: separate sensory inflows are integrated in the brain for object perception, which makes information processing more effective. However, in contrast to the individual sensory systems, very little is known about the neurocognitive mechanisms of audiovisual integration. This is somewhat surprising, because applications of multimodal and audiovisual technology are fast-growing branches of information technology.
In social communication, speech perception is audiovisual. In addition to hearing the speaker, we also see her face and body. Visual information gives us non-linguistic information about, e.g., the speaker's identity, age, emotions, and spatial location. However, facial gestures during speech also carry highly relevant linguistic information. The advantage of visual information is especially evident under noisy conditions: for example, the improvement in speech recognition brought by seeing the speaker may correspond to a 15-20 dB increase in the acoustic signal-to-noise ratio. Speech features that are difficult to distinguish auditorily, and that are commonly masked by noise, are often easily separated by vision, and vice versa. Like humans, automatic speech recognizers also gain from visual information about articulatory movements. By including facial animation, speech synthesisers can be made audiovisual; this improves intelligibility especially in noisy surroundings and can be very helpful for hard-of-hearing people.
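The benefit that automatic recognizers draw from vision can be illustrated with a minimal sketch of one common approach, decision-level (late) fusion, in which the per-class scores of separate audio and visual classifiers are combined. The class labels, probabilities, and weighting below are illustrative assumptions, not the laboratory's actual method.

```python
import numpy as np

def fuse_av(audio_logp, visual_logp, audio_weight=0.7):
    """Late audiovisual fusion: a weighted sum of per-class
    log-probabilities from independent audio and visual classifiers.
    Returns the index of the winning class."""
    fused = audio_weight * np.asarray(audio_logp) \
        + (1.0 - audio_weight) * np.asarray(visual_logp)
    return int(np.argmax(fused))

# Hypothetical example: classes are the consonants /b/, /d/, /g/.
# In noise, the audio classifier slightly prefers the wrong class /d/,
# but the lip shape is visually distinctive and clearly indicates /b/.
audio_logp = np.log([0.35, 0.40, 0.25])   # noisy audio favours /d/
visual_logp = np.log([0.70, 0.15, 0.15])  # vision favours /b/

print(fuse_av(audio_logp, visual_logp, audio_weight=0.5))  # class 0, i.e. /b/
```

With equal weights this is equivalent to multiplying the two probability distributions, so the visually unambiguous cue overrides the noise-corrupted acoustic one, mirroring the complementarity described above.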