DOAJ Open Access 2024

Analysis of spatial filtering in neural spatiospectral filters and its dependence on training target characteristics

Annika Briegleb, Walter Kellermann

Abstract

Mask-based multichannel speech enhancement methods based on artificial neural networks estimate a mask that is applied to the multichannel input signal or a reference channel to obtain the estimated desired signal. For the estimation, both spectral and spatial cues from the multichannel input can be used. However, the interplay of the two inside the neural network is typically unknown. In this contribution, we propose a framework to analyze neural spatiospectral filters (NSSFs) with respect to their capabilities to extract and represent spatial information. We explicitly take the characteristics of the training target signal into account and analyze its effect on the functionality of the NSSF. Using two conceptually different NSSFs as examples, we show that not all NSSFs use spatial information under all circumstances and that the training target signal has a significant influence on the spatial filtering behavior of an NSSF. These insights help to assess the signal processing capabilities of neural networks and support informed decisions when configuring, training, and deploying NSSFs.

Authors (2)

Annika Briegleb

Walter Kellermann

Citation Format

Briegleb, A., Kellermann, W. (2024). Analysis of spatial filtering in neural spatiospectral filters and its dependence on training target characteristics. https://doi.org/10.1186/s13636-024-00381-3

Journal Information

Publication Year: 2024
Source Database: DOAJ
DOI: 10.1186/s13636-024-00381-3
Access: Open Access