arXiv Open Access 2025

Microphone Array Signal Processing and Deep Learning for Speech Enhancement

Reinhold Haeb-Umbach, Tomohiro Nakatani, Marc Delcroix, Christoph Boeddeker, Tsubasa Ochiai

Abstract

Multi-channel acoustic signal processing is a well-established and powerful tool for exploiting the spatial diversity between a target signal and non-target or noise sources for signal enhancement. However, the textbook solutions for optimal data-dependent spatial filtering rest on knowledge of the second-order statistical moments of the signals, which have traditionally been difficult to acquire. In this contribution, we compare model-based, purely data-driven, and hybrid approaches to parameter estimation and filtering; the hybrid approaches try to combine the benefits of model-based signal processing and data-driven deep learning to overcome their individual deficiencies. We illustrate the underlying design principles with examples from noise reduction, source separation, and dereverberation.

Topics & Keywords

eess.AS cs.SD eess.SP

Authors (5)

Reinhold Haeb-Umbach

Tomohiro Nakatani

Marc Delcroix

Christoph Boeddeker

Tsubasa Ochiai

Citation Format

Haeb-Umbach, R., Nakatani, T., Delcroix, M., Boeddeker, C., & Ochiai, T. (2025). Microphone Array Signal Processing and Deep Learning for Speech Enhancement. https://arxiv.org/abs/2501.07215

Journal Information

Publication Year: 2025
Language: en
Source Database: arXiv
Access: Open Access ✓