Speech and Audio Signal Processing
The Institute of Communication Acoustics is actively engaged in the area of speech and audio signal processing. We welcome students who wish to do their student projects or their Diplom, Bachelor's, or Master's theses with us, and we offer the opportunity to participate in and contribute to advancing the technologies in this field.
Speech is not only the most important means of communication between humans; it also plays an important role in speech-controlled human-machine interfaces.
Speech signal processing encompasses a wide range of topics, e.g.,
- Speech Enhancement
- Statistical Models of Speech and Audio Signals
- Noise Power Estimation
- Source Localization and Separation
- Robust Speech Recognition
- Robust Speech Transmission and Voice over IP
- Selective Temporal Cepstrum Smoothing for Speech Enhancement
- Auditory Virtual Environments
Figure 1 shows how these aspects of speech signal processing are interconnected. The acoustic signal is picked up by a single microphone or by an array of microphones. An array makes spatially selective pickup possible, which in practice requires one of the following: a priori knowledge of the speaker positions, automatic speaker localization algorithms, or so-called 'blind' approaches.
Figure 1: Algorithm for speech-signal processing
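As an illustration of the spatial selectivity that a microphone array provides, the following is a minimal delay-and-sum sketch in Python. It assumes that the arrival delay at each microphone is already known as a whole number of samples (for example, from a priori knowledge of the speaker position). The function name `delay_and_sum` and the test signal are hypothetical and are not taken from the institute's own algorithms.

```python
import math

def delay_and_sum(channels, delays):
    """Align each channel by its known sample delay, then average.

    channels: list of equal-length sample lists, one per microphone.
    delays:   non-negative integer arrival delay (in samples) per microphone,
              assumed to be known in advance.
    """
    n = len(channels[0])
    out = [0.0] * n
    for signal, d in zip(channels, delays):
        for t in range(n - d):
            # Advance the channel by d samples to undo its arrival delay.
            out[t] += signal[t + d]
    return [v / len(channels) for v in out]

# Hypothetical test case: one source reaching three microphones
# with delays of 0, 3, and 6 samples.
source = [math.sin(2 * math.pi * 0.05 * t) for t in range(200)]
delays = [0, 3, 6]
channels = [[0.0] * d + source[: len(source) - d] for d in delays]
enhanced = delay_and_sum(channels, delays)
```

Signals arriving from the steered direction add coherently, whereas sound from other directions stays misaligned and is partially averaged out.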
Often, noise reduction (the separation of information-bearing signals from interference on the basis of statistical features) and echo compensation are also included at this stage. The resulting signal is compressed and then either transmitted (e.g., over a mobile transmission channel) after error-control coding, or passed to a speech-controlled human-machine interface.
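To make the noise reduction step more concrete, here is a minimal sketch of a spectral-subtraction-style gain computed per frequency bin. It assumes that a noise power estimate for each bin is already available. The function name `suppression_gains` and the `floor` value are hypothetical choices for illustration, not the institute's method.

```python
def suppression_gains(noisy_power, noise_power, floor=0.1):
    """Return one gain per frequency bin: max(1 - noise/noisy, floor).

    noisy_power: power of the noisy signal in each bin.
    noise_power: estimated noise power in each bin (assumed given).
    floor:       lower limit that avoids negative or zero gains.
    """
    gains = []
    for p_y, p_n in zip(noisy_power, noise_power):
        # Fraction of the bin's power attributed to the desired signal.
        gain = 1.0 - p_n / p_y if p_y > 0 else floor
        gains.append(max(gain, floor))
    return gains

gains = suppression_gains([4.0, 1.0, 0.5], [1.0, 1.0, 1.0])
print(gains)  # [0.75, 0.1, 0.1]
```

Bins dominated by the desired signal keep a gain close to 1, while bins at or below the estimated noise level are attenuated down to the floor.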
The principle of such an interface is illustrated in Figure 2. The processed signal from the previous step is the input to a speech recognizer. The speech recognizer passes the recognized words to an appropriate language-based module, which in turn communicates with a dialog manager. With the help of a speech synthesizer, the response from the dialog manager can be resynthesized into an acoustic signal.
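The chain of modules described above can be sketched as a toy pipeline. Every stage here is a hypothetical stand-in: the "signal" is simply a text transcript, the language module is a keyword lookup, and the synthesizer returns a tagged string instead of audio. The sketch shows only the data flow between the stages, not a real recognizer or dialog system.

```python
def recognize(signal):
    # Stand-in for a speech recognizer: the "signal" is already a transcript.
    return signal.lower().split()

def interpret(words):
    # Stand-in language module: map the recognized words to an intent label.
    if "light" in words and "on" in words:
        return "light_on"
    return "unknown"

def dialog_manager(intent):
    # Choose a textual response for the given intent.
    responses = {"light_on": "Turning the light on."}
    return responses.get(intent, "Sorry, I did not understand.")

def synthesize(text):
    # Stand-in synthesizer: return a tagged string rather than audio.
    return f"<speech>{text}</speech>"

def run_interface(signal):
    # Recognizer -> language module -> dialog manager -> synthesizer.
    return synthesize(dialog_manager(interpret(recognize(signal))))

print(run_interface("Turn the light ON"))
```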
At the Institute of Communication Acoustics, research is conducted in the area of multi-modal human-machine interaction as applied to, among other areas, virtual environments. One current result of this research is an interactive environment that can be loaded and executed in a web browser.