Master Thesis
Content
Despite the rapid development of modern computers, there remain crucial applications on embedded devices (EDs) with strict limitations on memory and power. Performing complex tasks such as acoustic scene classification (ASC) on such EDs is usually only feasible with low-dimensional audio features. To extract reliable information from these limited features (e.g., after transmission to a cloud-based server), the goal of this work is to reconstruct log-mel spectrograms from the low-dimensional feature vectors using generative deep neural networks (DNNs), such as diffusion models.
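As background for the reconstruction target, the sketch below computes a log-mel spectrogram from a raw waveform in pure NumPy (STFT followed by a triangular mel filterbank and a log compression). All parameter values (sample rate, FFT size, hop length, number of mel bands) are illustrative assumptions, not values prescribed by this thesis topic.

```python
import numpy as np

def hz_to_mel(f):
    return 2595.0 * np.log10(1.0 + f / 700.0)

def mel_to_hz(m):
    return 700.0 * (10.0 ** (m / 2595.0) - 1.0)

def mel_filterbank(sr, n_fft, n_mels):
    # Triangular filters spaced evenly on the mel scale.
    mel_pts = np.linspace(hz_to_mel(0.0), hz_to_mel(sr / 2), n_mels + 2)
    bins = np.floor((n_fft + 1) * mel_to_hz(mel_pts) / sr).astype(int)
    fb = np.zeros((n_mels, n_fft // 2 + 1))
    for i in range(1, n_mels + 1):
        l, c, r = bins[i - 1], bins[i], bins[i + 1]
        for k in range(l, c):
            fb[i - 1, k] = (k - l) / max(c - l, 1)
        for k in range(c, r):
            fb[i - 1, k] = (r - k) / max(r - c, 1)
    return fb

def log_mel_spectrogram(x, sr=16000, n_fft=512, hop=256, n_mels=40):
    # Frame the signal, apply a Hann window, take the power spectrum.
    n_frames = 1 + (len(x) - n_fft) // hop
    win = np.hanning(n_fft)
    frames = np.stack([x[i * hop:i * hop + n_fft] * win
                       for i in range(n_frames)])
    power = np.abs(np.fft.rfft(frames, n_fft)) ** 2
    # Project onto mel bands and log-compress (small floor avoids log(0)).
    mel = power @ mel_filterbank(sr, n_fft, n_mels).T
    return np.log(mel + 1e-10)

# Example: 1 s test tone at 1 kHz -> (frames, mel bands) matrix.
sr = 16000
t = np.arange(sr) / sr
spec = log_mel_spectrogram(np.sin(2 * np.pi * 1000.0 * t), sr=sr)
print(spec.shape)
```

In practice a library such as librosa would typically be used for this step; the point here is only to make the structure of the reconstruction target concrete.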
Task Description
Begin by implementing and evaluating existing baseline DNN architectures. Afterwards, build upon these baselines and develop a set of adaptations (e.g., to the loss function or the architecture) aimed at improving model performance. Finally, evaluate the generative capability of the model using low-dimensional input features, and test the performance of the generated spectrograms in an ASC task.
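For orientation on the diffusion-model baseline, the following sketch shows the standard DDPM forward-noising step and the simple noise-prediction training target that a conditional spectrogram-reconstruction model would commonly optimize. The spectrogram shape, noise schedule values, and the `model(...)` placeholder are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical target: a log-mel spectrogram with 40 mel bands, 64 frames.
x0 = rng.standard_normal((40, 64))

# Linear beta schedule; alpha_bar[t] is the cumulative signal fraction.
T = 1000
betas = np.linspace(1e-4, 0.02, T)
alpha_bar = np.cumprod(1.0 - betas)

def forward_diffuse(x0, t, alpha_bar, rng):
    """Sample x_t ~ q(x_t | x_0) in closed form."""
    eps = rng.standard_normal(x0.shape)
    xt = np.sqrt(alpha_bar[t]) * x0 + np.sqrt(1.0 - alpha_bar[t]) * eps
    return xt, eps

t = 500
xt, eps = forward_diffuse(x0, t, alpha_bar, rng)
# A denoiser eps_hat = model(xt, t, conditioning) would be trained with the
# simple loss np.mean((eps_hat - eps) ** 2), where `conditioning` is the
# low-dimensional feature vector the spectrogram is reconstructed from.
```

Adapting this loss (e.g., reweighting across noise levels or adding spectrogram-domain terms) is one natural place for the modifications the task description mentions.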