Audiovisual Speech Inversion
We focus on recovering aspects of the vocal tract's geometry and dynamics from the speech signal, a problem referred to as speech inversion.
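The core of speech inversion is a regression from acoustic features to articulatory parameters. Below is a minimal, illustrative sketch of such a frame-wise mapping using ridge regression on synthetic data; the feature dimensions (13 acoustic coefficients, 6 tract parameters) and the linear model are assumptions for illustration, not the project's actual system.

```python
import numpy as np

# Synthetic stand-in for real data: X would hold acoustic features
# (e.g. cepstral coefficients) per frame, Y articulatory parameters
# (e.g. articulator positions) recorded in parallel.
rng = np.random.default_rng(0)
n_frames, n_acoustic, n_artic = 500, 13, 6
W_true = rng.normal(size=(n_acoustic, n_artic))
X = rng.normal(size=(n_frames, n_acoustic))                   # acoustic features
Y = X @ W_true + 0.1 * rng.normal(size=(n_frames, n_artic))   # articulatory targets

# Ridge regression: W = (X'X + lam*I)^-1 X'Y
lam = 1e-2
W = np.linalg.solve(X.T @ X + lam * np.eye(n_acoustic), X.T @ Y)
Y_hat = X @ W                                                 # recovered trajectories
rmse = float(np.sqrt(np.mean((Y_hat - Y) ** 2)))
```

In practice the mapping is nonlinear and context-dependent, so richer models replace the single linear map, but the input/output structure stays the same.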
Audiovisual Speech Recognition
We have developed highly adaptive multimodal fusion rules based on uncertainty compensation that are compatible with both synchronous and asynchronous multimodal interaction architectures.
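One simple instance of uncertainty-driven fusion is to weight each stream's class scores by its estimated reliability, so a noisy stream contributes less to the decision. The sketch below illustrates this idea with inverse-uncertainty weights on two streams; the weighting scheme and the toy likelihoods are assumptions for illustration, not the specific rules developed in the project.

```python
import numpy as np

def fuse_streams(log_likes, uncertainties):
    """Combine per-stream class log-likelihoods with weights inversely
    proportional to each stream's estimated uncertainty."""
    u = np.asarray(uncertainties, dtype=float)
    w = (1.0 / u) / np.sum(1.0 / u)        # normalised reliability weights
    scores = sum(wi * np.asarray(ll) for wi, ll in zip(w, log_likes))
    return int(np.argmax(scores)), w

# Toy example: audio is confident about class 0, video is near-uniform.
audio = np.log([0.7, 0.2, 0.1])
video = np.log([0.3, 0.4, 0.3])
label, w = fuse_streams([audio, video], uncertainties=[0.2, 1.0])
```

With a reliable audio stream (low uncertainty) the audio scores dominate; if the uncertainties are swapped, the same rule lets the video stream drive the decision instead.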
Movie Summarization
Detection of perceptually important video events is formulated on the basis of saliency models for the audio, visual, and textual information conveyed in a video stream.
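A common way to turn per-modality saliency into a summary is to fuse the saliency curves and keep the most salient frames. The sketch below uses a simple linear fusion with fixed weights on synthetic curves; the weights, curves, and selection rule are illustrative assumptions, not the project's actual saliency models.

```python
import numpy as np

def summarize(saliency_streams, weights, keep_frac=0.2):
    """Linearly fuse per-modality saliency curves and return the indices
    of the most salient frames, in temporal order."""
    fused = sum(w * np.asarray(s, dtype=float)
                for w, s in zip(weights, saliency_streams))
    k = max(1, int(keep_frac * len(fused)))
    keep = np.sort(np.argpartition(fused, -k)[-k:])   # top-k frames, time-ordered
    return keep, fused

# Toy curves: an audio event peaks near frame 30, a visual event near frame 70.
t = np.arange(100)
audio = np.exp(-0.5 * ((t - 30) / 5.0) ** 2)
visual = np.exp(-0.5 * ((t - 70) / 5.0) ** 2)
keep, fused = summarize([audio, visual], weights=[0.6, 0.4])
```

Both event peaks survive into the 20% of frames that are kept, which is the behaviour a saliency-based summarizer aims for.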
Sign Language Recognition
We have developed a visual processing framework for hand and head tracking and for feature extraction from sign language (SL) and gesture videos.
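A classic first step in hand/head tracking is segmenting skin-coloured regions and tracking their centroids across frames. The sketch below shows this on a synthetic frame using normalised-RGB thresholds; the thresholds and the centroid tracker are generic illustrative choices, not the framework's actual detectors.

```python
import numpy as np

def skin_mask(frame):
    """Crude skin detection in normalised-RGB space (illustrative
    thresholds; real systems learn these from data)."""
    rgb = frame.astype(float)
    s = rgb.sum(axis=2) + 1e-9            # avoid division by zero
    r = rgb[..., 0] / s
    g = rgb[..., 1] / s
    return (r > 0.35) & (g > 0.28) & (g < 0.36)

def centroid(mask):
    """Centroid of the detected region: a minimal per-frame tracking feature."""
    ys, xs = np.nonzero(mask)
    if len(xs) == 0:
        return None
    return float(xs.mean()), float(ys.mean())

# Synthetic frame: blue background with a skin-coloured 20x20 patch.
frame = np.zeros((100, 100, 3))
frame[..., 2] = 255.0
frame[40:60, 40:60] = [200.0, 140.0, 100.0]
cx, cy = centroid(skin_mask(frame))
```

Per-frame centroids (and region shape descriptors) then feed the recognition stage as trajectory features.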
Microphone Array Speech Processing
We are working on microphone array processing and distant speech recognition, aiming to create hands-free, voice-enabled interfaces for home automation control.
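The simplest microphone-array technique for enhancing distant speech is delay-and-sum beamforming: delay each channel so a chosen direction adds coherently, then average. The sketch below uses an integer-sample approximation on a linear array; the geometry, sample rate, and steering convention (angle measured from the array axis) are illustrative assumptions.

```python
import numpy as np

def delay_and_sum(signals, mic_positions, doa_deg, fs, c=343.0):
    """Steer a linear array toward doa_deg (degrees from the array axis)
    by delaying each channel and averaging.

    signals:       (n_mics, n_samples) array of time-aligned recordings
    mic_positions: 1-D array of mic positions along the array axis (metres)
    """
    theta = np.deg2rad(doa_deg)
    delays = mic_positions * np.cos(theta) / c   # propagation delays (seconds)
    delays -= delays.min()                       # make all delays non-negative
    n = signals.shape[1]
    out = np.zeros(n)
    for sig, d in zip(signals, delays):
        s = int(round(d * fs))                   # integer-sample approximation
        out[s:] += sig[:n - s] if s else sig
    return out / len(signals)

# 4-mic linear array, 5 cm spacing, broadside source (90 degrees).
fs = 16000
t = np.arange(1024) / fs
tone = np.sin(2 * np.pi * 500 * t)
x = np.tile(tone, (4, 1))                        # identical channels: already aligned
y = delay_and_sum(x, np.array([0.0, 0.05, 0.10, 0.15]), 90.0, fs)
```

For a broadside source the channels need no relative delay, so the beamformer output reproduces the signal; steering elsewhere attenuates sources off the look direction, which is what makes hands-free far-field capture usable.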