Low-Latency and Low-Complexity Single-Channel Speech Enhancement


Master Thesis

Background and Motivation
In single-channel speech enhancement (SE), a monaural noisy signal is denoised to estimate the underlying clean speech signal, using either model-based spectral filtering or deep neural network (DNN)-based approaches. Typical applications include assistive hearing devices such as hearing aids and cochlear implants, where low latency is crucial because unprocessed sound can leak directly into the enhanced signal. In addition, embedded platforms such as head-mounted devices constrain computational complexity through their limited battery capacity, memory access, and processing power. This thesis aims to systematically analyze and improve recent low-latency, low-complexity speech enhancement approaches.

Task Description

The thesis will initially focus on the following fundamental tasks:
Review of low-latency and low-complexity single-channel noise reduction (NR) literature
Building a training and test data set from public databases for experiments
Building a processing framework into which model-based and DNN-based noise reduction can be inserted
Implementation and assessment of symmetric and asymmetric analysis-synthesis (AS) windows
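
As an illustration of the last two tasks, the following is a minimal sketch of an overlap-add processing framework with an asymmetric AS window pair. It is not the method prescribed for the thesis. The function names (asymmetric_window_pair, enhance), the gain_fn hook, and all parameter values are hypothetical. The construction assumed here pairs a long analysis window with a short synthesis window whose product is a periodic Hann window, so that overlap-add at a hop of half the synthesis length reconstructs the input when no gain is applied.

```python
import numpy as np


def asymmetric_window_pair(frame_len, synth_len):
    """Return (analysis, synthesis) windows of length frame_len.

    Assumed construction: the product of both windows over the last
    synth_len samples is a periodic Hann window, which sums to one
    under overlap-add with hop = synth_len // 2.
    """
    half = synth_len // 2
    rise_len = frame_len - half
    # Long rising slope of the analysis window (square-root Hann shape).
    n = np.arange(rise_len)
    rise = np.sqrt(0.5 - 0.5 * np.cos(np.pi * n / rise_len))
    # Short periodic Hann window: the target product of both windows.
    m = np.arange(synth_len)
    hann = 0.5 - 0.5 * np.cos(2.0 * np.pi * m / synth_len)
    # Short falling slope of the analysis window.
    fall = np.sqrt(hann[half:])
    analysis = np.concatenate([rise, fall])
    # The synthesis window is non-zero only on the last synth_len samples.
    tail = analysis[-synth_len:]
    synthesis = np.zeros(frame_len)
    synthesis[-synth_len:] = np.divide(
        hann, tail, out=np.zeros(synth_len), where=tail > 0
    )
    return analysis, synthesis


def enhance(noisy, analysis, synthesis, hop, gain_fn=None):
    """Frame-wise spectral processing with overlap-add.

    gain_fn is a hypothetical hook: it maps a complex spectrum to a
    per-bin gain and could wrap a model-based or a DNN-based estimator.
    """
    frame_len = len(analysis)
    out = np.zeros(len(noisy))
    for start in range(0, len(noisy) - frame_len + 1, hop):
        frame = noisy[start:start + frame_len]
        spec = np.fft.rfft(analysis * frame)
        if gain_fn is not None:
            spec = gain_fn(spec) * spec
        out[start:start + frame_len] += synthesis * np.fft.irfft(spec, n=frame_len)
    return out


# Identity processing (no gain) should reproduce the input away from the edges.
analysis, synthesis = asymmetric_window_pair(frame_len=512, synth_len=64)
signal = np.random.default_rng(0).standard_normal(4096)
output = enhance(signal, analysis, synthesis, hop=32)
print("max reconstruction error:", np.max(np.abs(output[512:-512] - signal[512:-512])))
```

In this sketch the long analysis window sets the spectral resolution, while the overlap-add delay is governed by the short synthesis window. At an assumed sampling rate of 16 kHz, 64 samples correspond to 4 ms and 512 samples to 32 ms. Setting synth_len equal to frame_len yields the symmetric square-root Hann pair as a special case.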

Furthermore, the thesis will pursue the following targeted developments:

Implementation and assessment of DNN-based "learned" asymmetric AS windows
Experimenting with the DNN structure to reduce model size, and with embedding model-based NR into DNN training via backpropagation

Requirements

Knowledge and experience in the field of speech enhancement
Basic programming skills (Python / MATLAB) and experience in signal processing

Contact

Stefan Thaleiser, M.Sc.

Room: ID 2/261
Phone: +49 234 32 - 18597
E-Mail

Prof. Dr.-Ing. Rainer Martin

E-Mail