arXiv Open Access 2025

Microphone Array Signal Processing and Deep Learning for Speech Enhancement

Reinhold Haeb-Umbach, Tomohiro Nakatani, Marc Delcroix, Christoph Boeddeker, Tsubasa Ochiai

Abstract

Multi-channel acoustic signal processing is a well-established and powerful tool to exploit the spatial diversity between a target signal and non-target or noise sources for signal enhancement. However, the textbook solutions for optimal data-dependent spatial filtering rest on the knowledge of second-order statistical moments of the signals, which have traditionally been difficult to acquire. In this contribution, we compare model-based, purely data-driven, and hybrid approaches to parameter estimation and filtering, where the latter tries to combine the benefits of model-based signal processing and data-driven deep learning to overcome their individual deficiencies. We illustrate the underlying design principles with examples from noise reduction, source separation, and dereverberation.
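
As an illustration of what "textbook" data-dependent spatial filtering based on second-order statistics can look like, the following is a minimal sketch of a minimum variance distortionless response (MVDR) beamformer for a single frequency bin. It is not taken from the paper: the array size, the steering vector, and the synthetic noise are all hypothetical values chosen for the example, and the noise spatial covariance matrix is assumed to be available from noise-only frames, which is exactly the quantity the abstract describes as difficult to acquire in practice.

```python
import numpy as np


def mvdr_weights(noise_cov, steering):
    """Return MVDR weights w = R^{-1} d / (d^H R^{-1} d) for one frequency bin.

    noise_cov: (M, M) Hermitian noise spatial covariance matrix R (assumed known).
    steering:  (M,) steering vector d toward the target (assumed known).
    """
    rinv_d = np.linalg.solve(noise_cov, steering)  # R^{-1} d without explicit inverse
    return rinv_d / (steering.conj() @ rinv_d)


# Hypothetical setup: 4 microphones, synthetic spatially white noise.
rng = np.random.default_rng(0)
num_mics, num_frames = 4, 2000
steering = np.exp(-1j * np.pi * 0.3 * np.arange(num_mics))  # illustrative far-field phases

noise = (rng.standard_normal((num_mics, num_frames))
         + 1j * rng.standard_normal((num_mics, num_frames))) / np.sqrt(2)

# Second-order statistics: sample estimate of the noise spatial covariance matrix.
noise_cov = noise @ noise.conj().T / num_frames

w = mvdr_weights(noise_cov, steering)

# Distortionless constraint: the response toward the target, w^H d, should be 1.
distortionless = w.conj() @ steering

# Residual noise power at the beamformer output, w^H R w.
out_noise_power = np.real(w.conj() @ noise_cov @ w)
in_noise_power = np.real(noise_cov[0, 0])  # noise power at the first microphone
```

In this sketch the filter is fully determined by the covariance estimate and the steering vector; the hybrid approaches mentioned in the abstract would, under this reading, replace or assist the estimation of such statistics with a learned component rather than change the filter formula itself.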

Authors (5)

Reinhold Haeb-Umbach

Tomohiro Nakatani

Marc Delcroix

Christoph Boeddeker

Tsubasa Ochiai

Citation Format

Haeb-Umbach, R., Nakatani, T., Delcroix, M., Boeddeker, C., Ochiai, T. (2025). Microphone Array Signal Processing and Deep Learning for Speech Enhancement. https://arxiv.org/abs/2501.07215

Journal Information
Publication Year: 2025
Language: en
Database Source: arXiv
Access: Open Access