Speech and Audio Signal Processing

The Institute of Communication Acoustics is actively engaged in the area of speech and audio signal processing. We welcome students who wish to do their student projects or their Bachelor's and Master's theses with us, and we offer the opportunity to participate in and contribute to advancing the technologies in this field.

Speech is not only the most important means of communication between humans; it also plays an important role in speech-controlled human-machine interfaces.

Speech signal processing encompasses a wide range of topics, e.g.,

Figure 1 shows the interconnection between these various aspects of speech signal processing. The acoustic signal is picked up by a single microphone or an array of microphones. Using arrays enables spatial selectivity of the signals and requires, in practice, a priori knowledge of the speaker positions, automatic speaker localization algorithms, or so-called 'blind' approaches.
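The spatial selectivity of a microphone array can be illustrated with a delay-and-sum beamformer, which covers the case where the speaker position is known a priori. The following is a minimal sketch, not the institute's implementation: it assumes the arrival delays are whole numbers of samples and already known, and the names delay_and_sum, speaker, delays and channels are chosen here for illustration only.

```python
import math


def delay_and_sum(channels, delays):
    """Average the microphone channels after undoing each channel's delay.

    channels: one list of float samples per microphone (same sampling rate).
    delays:   assumed-known arrival delay of the speaker at each microphone,
              in whole samples (non-negative integers).
    """
    if len(channels) != len(delays):
        raise ValueError("need exactly one delay per channel")
    # Keep only output samples for which every shifted channel has data.
    n_out = min(len(ch) - d for ch, d in zip(channels, delays))
    if n_out <= 0:
        raise ValueError("delays exceed the channel length")
    return [
        sum(ch[n + d] for ch, d in zip(channels, delays)) / len(channels)
        for n in range(n_out)
    ]


# Toy scene (assumption): one speaker whose signal reaches four
# microphones 0, 3, 6 and 9 samples late.
speaker = [math.sin(2 * math.pi * 0.05 * n) for n in range(200)]
delays = [0, 3, 6, 9]
channels = [[0.0] * d + speaker for d in delays]

aligned = delay_and_sum(channels, delays)      # steered at the speaker
unsteered = delay_and_sum(channels, [0] * 4)   # no steering applied
```

With the correct delays the channels add coherently and the speaker signal is recovered; with mismatched delays the same signal is partly cancelled, which is the effect that makes the array spatially selective. When the delays are not known in advance, they would have to come from a speaker localization algorithm or a blind approach, as described above.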

Figure 1: Algorithms for speech signal processing


Often, noise reduction (the separation of information-bearing signals from interference, based on statistical features) and echo compensation are also included at this stage. The signal thus obtained is compressed and then either transmitted (e.g., over a mobile transmission channel) after error-control coding, or fed into a speech-controlled human-machine interface.
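Echo compensation is commonly realized with an adaptive filter that estimates the echo path from the loudspeaker signal and subtracts the estimated echo from the microphone signal. The text does not name a specific algorithm, so the sketch below uses the normalized LMS (NLMS) algorithm as an assumed, typical choice. The function name nlms_echo_canceller, the filter length, the step size and the three-tap echo path are illustrative assumptions.

```python
import random


def nlms_echo_canceller(far_end, microphone, num_taps=8, step=0.5, eps=1e-8):
    """Adapt an FIR echo-path estimate with NLMS; return (residual, weights)."""
    weights = [0.0] * num_taps
    history = [0.0] * num_taps  # most recent far-end samples, newest first
    residual = []
    for x, d in zip(far_end, microphone):
        history = [x] + history[:-1]
        echo_estimate = sum(w * h for w, h in zip(weights, history))
        error = d - echo_estimate  # microphone signal with the echo removed
        norm = sum(h * h for h in history) + eps
        # NLMS update: gradient step normalized by the input power.
        weights = [w + step * error * h / norm for w, h in zip(weights, history)]
        residual.append(error)
    return residual, weights


# Toy setup (assumption): white far-end signal and a short, noise-free echo path.
rng = random.Random(0)
far_end = [rng.gauss(0.0, 1.0) for _ in range(2000)]
echo_path = [0.6, -0.3, 0.1]
microphone = [
    sum(h * far_end[n - k] for k, h in enumerate(echo_path) if n - k >= 0)
    for n in range(len(far_end))
]

residual, weights = nlms_echo_canceller(far_end, microphone)
```

In this idealized setting the filter weights approach the echo path and the residual decays towards zero. With real speech, near-end talkers and background noise, the adaptation would need additional control, which is outside the scope of this sketch.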

The principle of such an interface is illustrated in Figure 2. The processed signal from the previous step is the input to a speech recognizer. The speech recognizer passes the recognized words to an appropriate language-based module, which in turn communicates with a dialog manager. The response from the dialog manager can then be converted back into an acoustic signal by a speech synthesizer.
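The chain of modules described above can be sketched as a simple pipeline. This only illustrates the data flow between the stages; every component below is a hypothetical placeholder (the toy functions do no real recognition, language processing or synthesis), and the names SpeechInterface, recognize, understand, manage_dialog and synthesize are assumptions made for this example.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass
class SpeechInterface:
    """Chains recognizer, language module, dialog manager and synthesizer."""
    recognize: Callable[[List[float]], List[str]]
    understand: Callable[[List[str]], Dict[str, str]]
    manage_dialog: Callable[[Dict[str, str]], str]
    synthesize: Callable[[str], List[float]]

    def respond(self, signal: List[float]) -> List[float]:
        words = self.recognize(signal)       # acoustic signal -> words
        meaning = self.understand(words)     # words -> structured meaning
        reply = self.manage_dialog(meaning)  # meaning -> response text
        return self.synthesize(reply)        # response text -> acoustic signal


# Placeholder components, for illustration only.
def toy_recognizer(signal):
    return ["open", "door"] if signal else []


def toy_language_module(words):
    return {"action": words[0], "object": words[1]} if len(words) == 2 else {}


def toy_dialog_manager(meaning):
    if meaning.get("action") == "open":
        return f"Opening the {meaning['object']}."
    return "Please repeat."


def toy_synthesizer(text):
    return [float(ord(c)) for c in text]  # stand-in for a waveform


interface = SpeechInterface(
    toy_recognizer, toy_language_module, toy_dialog_manager, toy_synthesizer
)
reply_signal = interface.respond([0.1, 0.2, 0.3])
```

Keeping each stage behind its own callable means that any single module can be exchanged without touching the others.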

At the Institute of Communication Acoustics, research is conducted in the area of multi-modal human-machine interaction as applied to, among other areas, virtual environments. One current result of this research is an interactive environment that can be loaded and executed in a web browser.