arXiv Open Access 2021

Improving Polyphonic Sound Event Detection on Multichannel Recordings with the Sørensen-Dice Coefficient Loss and Transfer Learning

Karn N. Watcharasupat, Thi Ngoc Tho Nguyen, Ngoc Khanh Nguyen, Zhen Jian Lee, Douglas L. Jones, +1 more
Abstract

The Sørensen-Dice coefficient has recently gained popularity as a loss function (known as the Dice loss) because of its robustness in tasks where negative samples significantly outnumber positive samples, such as semantic segmentation, natural language processing, and sound event detection. Conventional training of polyphonic sound event detection systems with binary cross-entropy loss often results in suboptimal detection performance because the training is overwhelmed by updates from negative samples. In this paper, we investigated the effects of the Dice loss, intra- and inter-modal transfer learning, data augmentation, and recording formats on the performance of polyphonic sound event detection systems with multichannel inputs. Our analysis showed that systems trained with the Dice loss consistently outperformed those trained with cross-entropy loss across different training settings and recording formats in terms of F1 score and error rate. We achieved further performance gains through transfer learning and an appropriate combination of data augmentation techniques.
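
The contrast between the two losses under class imbalance can be illustrated with a minimal sketch. It assumes a commonly used "soft" Dice formulation, 1 - 2·Σ(p·y) / (Σp + Σy), which may differ from the exact variant used in the paper. The function names, the smoothing constant `eps`, and the toy data (10 active entries out of 1000) are hypothetical and chosen only for illustration.

```python
import math


def soft_dice_loss(probs, targets, eps=1e-7):
    """Soft Dice loss over flattened predictions (assumed formulation)."""
    intersection = sum(p * y for p, y in zip(probs, targets))
    denom = sum(probs) + sum(targets)
    # eps only guards against division by zero when both sums are zero.
    return 1.0 - (2.0 * intersection + eps) / (denom + eps)


def bce_loss(probs, targets, eps=1e-7):
    """Mean binary cross-entropy over flattened predictions."""
    total = 0.0
    for p, y in zip(probs, targets):
        p = min(max(p, eps), 1.0 - eps)  # clamp to keep log() finite
        total -= y * math.log(p) + (1.0 - y) * math.log(1.0 - p)
    return total / len(probs)


# Toy imbalanced labels: 10 active (frame, class) entries out of 1000.
targets = [1.0] * 10 + [0.0] * 990

# A degenerate model that predicts "inactive" almost everywhere.
all_negative = [0.01] * 1000

dice_value = soft_dice_loss(all_negative, targets)
bce_value = bce_loss(all_negative, targets)
print(f"Dice loss: {dice_value:.3f}, BCE loss: {bce_value:.3f}")
```

In this toy case the mean cross-entropy is small even though every active event is missed, because the 990 negative entries dominate the average. The Dice loss stays close to 1, since it depends on the overlap between predictions and positive labels.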

Authors (6)

Karn N. Watcharasupat

Thi Ngoc Tho Nguyen

Ngoc Khanh Nguyen

Zhen Jian Lee

Douglas L. Jones

Woon Seng Gan

Citation Format

Watcharasupat, K.N., Nguyen, T.N.T., Nguyen, N.K., Lee, Z.J., Jones, D.L., Gan, W.S. (2021). Improving Polyphonic Sound Event Detection on Multichannel Recordings with the Sørensen-Dice Coefficient Loss and Transfer Learning. https://arxiv.org/abs/2107.10471

Journal Information
Year Published: 2021
Language: en
Database Source: arXiv
Access: Open Access