arXiv Open Access 2023

Detecting False Alarms and Misses in Audio Captions

Rehana Mahfuz, Yinyi Guo, Arvind Krishna Sridhar, Erik Visser

Abstract

Metrics for evaluating audio captions typically report only a score, with little explanation of what may be wrong when the score is low. Finding the shortcomings of a caption then requires manual human intervention. In this work, we introduce a metric that automatically identifies the shortcomings of an audio caption by detecting the misses and false alarms in a candidate caption with respect to a reference caption, and reports the recall, precision, and F-score. Such a metric is useful for profiling the deficiencies of an audio captioning model, which is a milestone towards improving the quality of audio captions.
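The abstract does not describe how sound events are extracted from a caption or how they are matched against the reference, so the following is only a minimal sketch of the reporting step. It assumes the events in each caption are already available as sets of labels and that two events match only when their labels are identical; the names `CaptionDiagnosis` and `diagnose_caption` are hypothetical and do not come from the paper.

```python
from dataclasses import dataclass


@dataclass
class CaptionDiagnosis:
    misses: set          # events in the reference that the candidate omits
    false_alarms: set    # events in the candidate that the reference lacks
    precision: float
    recall: float
    f_score: float


def diagnose_caption(reference_events, candidate_events):
    """Compare two sets of event labels (assumed pre-extracted).

    Exact label equality is an assumption made for illustration; the
    matching procedure used in the paper is not stated in the abstract.
    """
    reference = set(reference_events)
    candidate = set(candidate_events)

    matched = reference & candidate
    misses = reference - candidate
    false_alarms = candidate - reference

    # Guard against empty captions to avoid division by zero.
    precision = len(matched) / len(candidate) if candidate else 0.0
    recall = len(matched) / len(reference) if reference else 0.0
    denom = precision + recall
    f_score = 2 * precision * recall / denom if denom else 0.0

    return CaptionDiagnosis(misses, false_alarms, precision, recall, f_score)


# Hypothetical example: one shared event, one miss, one false alarm.
result = diagnose_caption(
    reference_events={"dog barking", "car passing"},
    candidate_events={"dog barking", "people talking"},
)
print(result.misses, result.false_alarms)
print(result.precision, result.recall, result.f_score)
```

Under these assumptions, a low score can be traced to specific events: the `misses` set lists what the candidate caption left out, and the `false_alarms` set lists what it mentioned without support in the reference.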

Authors (4)

Rehana Mahfuz

Yinyi Guo

Arvind Krishna Sridhar

Erik Visser

Citation Format

Mahfuz, R., Guo, Y., Sridhar, A.K., Visser, E. (2023). Detecting False Alarms and Misses in Audio Captions. https://arxiv.org/abs/2309.03326

Journal Information
Publication Year: 2023
Language: en
Source Database: arXiv
Access: Open Access