Speech Enhancement
Speech enhancement is one of our central research topics. Its many applications include mobile voice communications, hearing aids, and human-machine interfaces, and a correspondingly wide range of methods exists. We focus on noise reduction with the goal of improving listener comfort and increasing the intelligibility of the acoustic signal. We employ methods based on single-microphone signals as well as on multiple-microphone signals (microphone arrays). Developing speech enhancement methods requires a blend of physical modeling and statistical signal processing techniques. Most of our enhancement techniques operate in the spectral domain. Typically, the noisy speech signal is segmented into short frames, and each frame is transformed to the spectral domain, enhanced, and inverse transformed; the enhanced frames are then overlap-added to reconstruct the enhanced signal (see Figure). The benefits of spectral processing are
- a concentration of speech energy in few spectral parameters (especially for voiced speech),
- a simpler statistical description than in the time domain, and
- the possibility of applying psychoacoustic principles.
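The frame-based processing chain described above (segment, transform, enhance, inverse transform, overlap-add) can be sketched in a few lines of NumPy. This is only a minimal illustration under stated assumptions, not the method of the cited papers: the function name `enhance_overlap_add`, the square-root Hann window with 50% overlap, the noise estimate taken from the first few frames, and the simple power-subtraction gain with a floor are all hypothetical choices made for the example.

```python
import numpy as np


def enhance_overlap_add(noisy, frame_len=512, noise_frames=5, gain_floor=0.1):
    """Illustrative spectral-domain noise reduction with overlap-add.

    Assumptions (not taken from the source text): frame_len is even, frames
    overlap by 50%, and the first `noise_frames` frames contain noise only.
    """
    noisy = np.asarray(noisy, dtype=float)
    hop = frame_len // 2

    # Square-root periodic Hann window, used for analysis and synthesis.
    # The product of both is a periodic Hann window, which sums to one
    # at 50% overlap, so an all-ones gain reproduces the input.
    n = np.arange(frame_len)
    window = np.sqrt(0.5 - 0.5 * np.cos(2.0 * np.pi * n / frame_len))

    # 1) Segment the signal into short, overlapping, windowed frames.
    n_frames = 1 + (len(noisy) - frame_len) // hop
    frames = np.stack(
        [noisy[i * hop:i * hop + frame_len] * window for i in range(n_frames)]
    )

    # 2) Transform each frame to the spectral domain.
    spectra = np.fft.rfft(frames, axis=1)
    power = np.abs(spectra) ** 2

    # 3) Enhance: estimate the noise power per frequency bin from the
    #    leading frames and apply a floored power-subtraction gain.
    noise_psd = power[:noise_frames].mean(axis=0)
    gain = np.maximum(1.0 - noise_psd / np.maximum(power, 1e-12), gain_floor)

    # 4) Inverse transform and apply the synthesis window.
    enhanced = np.fft.irfft(gain * spectra, n=frame_len, axis=1) * window

    # 5) Overlap-add the enhanced frames to rebuild the time signal.
    out = np.zeros(len(noisy))
    for i in range(n_frames):
        out[i * hop:i * hop + frame_len] += enhanced[i]
    return out
```

Calling `enhance_overlap_add(x)` on a one-dimensional array `x` returns an array of the same length. The first and last half frame are covered by only one window and are therefore attenuated; a practical system would pad the signal or treat the edges separately, and would track the noise power over time rather than assume a noise-only lead-in.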
The block diagram of a typical system is shown below [Martin et al., 2004].
References:
Martin, R.: Statistical Methods for the Enhancement of Noisy Speech. In: Speech Enhancement, J. Benesty et al. (eds.), Springer-Verlag, 2005.
Martin, R.; Malah, D.; Cox, R. V.; Accardi, A. J.: A Noise Reduction Preprocessor for Mobile Voice Communication. JASP, No. 8, pp. 1046-1058, 2004.
Martin, R.: Statistical Methods for the Enhancement of Noisy Speech. Proc. Intl. Workshop Acoustic Echo and Noise Control (IWAENC), pp. 1-6, 2003.
Breithaupt, C.; Martin, R.: MMSE Estimation of Magnitude-Squared DFT Coefficients with Supergaussian Priors. Proc. IEEE Intl. Conf. Acoustics, Speech, Signal Processing (ICASSP), 2003.

