Reconstruction of High-Dimensional Audio Features with Generative Neural Networks

Master Thesis

Content
Despite the rapid development of modern computers, crucial applications still run on embedded devices (EDs) with strict limitations on memory and power. Performing complex tasks, such as acoustic scene classification (ASC), on these EDs is usually only possible with low-dimensional audio features. To extract reliable information from such limited features (e.g., after transmission to a cloud-based server), the goal of this work is to reconstruct log-mel spectrograms from the low-dimensional feature vectors using generative deep neural networks (DNNs), such as diffusion models.
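To illustrate the two representations involved, the following minimal numpy-only sketch computes a log-mel spectrogram from a waveform and then pools it down to a low-dimensional feature vector per frame. All parameter values (sample rate, FFT size, number of mel bands, output dimension) are illustrative assumptions, not the setup of this thesis, and the average-pooling step is only a crude stand-in for whatever feature extractor an actual ED would use.

```python
import numpy as np

def hz_to_mel(f):
    return 2595.0 * np.log10(1.0 + f / 700.0)

def mel_to_hz(m):
    return 700.0 * (10.0 ** (m / 2595.0) - 1.0)

def mel_filterbank(sr, n_fft, n_mels):
    # Triangular mel filters spanning 0 Hz .. Nyquist.
    mels = np.linspace(hz_to_mel(0.0), hz_to_mel(sr / 2.0), n_mels + 2)
    bins = np.floor((n_fft + 1) * mel_to_hz(mels) / sr).astype(int)
    fb = np.zeros((n_mels, n_fft // 2 + 1))
    for i in range(n_mels):
        l, c, r = bins[i], bins[i + 1], bins[i + 2]
        if c > l:
            fb[i, l:c] = (np.arange(l, c) - l) / (c - l)   # rising edge
        if r > c:
            fb[i, c:r] = (r - np.arange(c, r)) / (r - c)   # falling edge
    return fb

def log_mel_spectrogram(x, sr=16000, n_fft=512, hop=256, n_mels=64):
    # Frame the signal, apply a Hann window, take magnitude spectra.
    n_frames = 1 + (len(x) - n_fft) // hop
    win = np.hanning(n_fft)
    frames = np.stack([x[i * hop:i * hop + n_fft] * win for i in range(n_frames)])
    power = np.abs(np.fft.rfft(frames, axis=1)) ** 2
    mel = power @ mel_filterbank(sr, n_fft, n_mels).T
    return np.log(mel + 1e-10)  # shape: (n_frames, n_mels)

def low_dim_features(logmel, n_out=8):
    # Crude ED surrogate: average-pool the mel bands down to n_out values per frame.
    n_frames, n_mels = logmel.shape
    return logmel[:, : n_mels - n_mels % n_out].reshape(n_frames, n_out, -1).mean(axis=2)

rng = np.random.default_rng(0)
x = rng.standard_normal(16000)      # 1 s of noise as a stand-in for real audio
S = log_mel_spectrogram(x)          # high-dimensional target: (61, 64)
z = low_dim_features(S)             # low-dimensional input:  (61, 8)
print(S.shape, z.shape)
```

The reconstruction task then amounts to learning the inverse mapping, i.e., generating a plausible `S` given only `z`.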

Task Description
Start by implementing and evaluating existing baseline DNN architectures. Afterwards, build upon the baselines and develop a set of adaptations (e.g., to the loss function or architecture) aiming to improve model performance. Finally, evaluate the generative capability of the model using low-dimensional input features and test the performance of the generated spectrograms in an ASC task.
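Since diffusion models are named as one candidate model family, here is a minimal numpy sketch of the closed-form forward noising process they are built on, assuming the standard linear beta schedule; the schedule length and data shape are illustrative assumptions. A trained model would learn to invert this process, conditioned on the low-dimensional features.

```python
import numpy as np

def forward_diffuse(x0, t, betas, rng):
    # Closed-form q(x_t | x_0): sqrt(alpha_bar_t) * x0 + sqrt(1 - alpha_bar_t) * eps
    alpha_bar = np.cumprod(1.0 - betas)[t]
    eps = rng.standard_normal(x0.shape)
    return np.sqrt(alpha_bar) * x0 + np.sqrt(1.0 - alpha_bar) * eps, eps

rng = np.random.default_rng(0)
betas = np.linspace(1e-4, 0.02, 1000)   # common linear schedule over 1000 steps
x0 = rng.standard_normal((64, 64))      # stand-in for a log-mel spectrogram patch
xT, eps = forward_diffuse(x0, 999, betas, rng)
# At the final step, alpha_bar is tiny, so x_T is close to pure Gaussian noise.
print(xT.shape, float(np.cumprod(1.0 - betas)[-1]))
```

During training, the network typically receives `x_t`, the step index `t`, and the conditioning features, and is optimized to predict the noise `eps`.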

Requirements

  • Strong Python (or comparable) programming skills
  • Interest in acoustic signal processing and DNNs
  • Experience in training DNNs is helpful
  • The thesis can be written in English or German
Contact