arXiv Open Access 2022

An Environmental Feature Representation in I-vector Space for Room Verification and Metadata Estimation

Desmond Caulley
Abstract

This paper investigates the application of environmental feature representations to room verification and acoustic metadata estimation. Audio recordings contain both speaker and non-speaker information. We refer to the non-speaker-related information, including channel and other environmental factors, as e-vectors. I-vectors, commonly used in speaker identification, are extracted in the total variability space and capture both speaker and channel-environment information without discrimination. Accordingly, e-vectors can be extracted from i-vectors using methods such as linear discriminant analysis. In this paper, we first demonstrate that e-vectors can be applied to room verification with a low equal error rate. Second, we propose two methods for estimating metadata, namely signal-to-noise ratio (SNR) and reverberation time (T60), from these e-vectors. Compared with contemporary global SNR estimation methods, our system performs favorably in accuracy, even with low-dimensional i-vectors. Lastly, we show that room verification can be improved if e-vectors are augmented with the extracted metadata.
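
The abstract names linear discriminant analysis (LDA) as one way to pull e-vectors out of i-vectors, and equal error rate (EER) as the room verification metric. The following is a minimal sketch of that idea, not the paper's pipeline. Everything in it is an assumption made for illustration: the "i-vectors" are synthetic (a room offset plus a speaker offset plus noise) rather than extracted from audio, the function names are hypothetical, and trials are scored with negative Euclidean distance, which the abstract does not specify.

```python
import numpy as np

def fit_lda(x, labels, n_components):
    """Projection (dim x n_components) maximizing between-room scatter
    relative to within-room scatter. Labels here are room identities."""
    dim = x.shape[1]
    mean_all = x.mean(axis=0)
    s_w = np.zeros((dim, dim))  # within-room scatter
    s_b = np.zeros((dim, dim))  # between-room scatter
    for room in np.unique(labels):
        x_r = x[labels == room]
        mean_r = x_r.mean(axis=0)
        centered = x_r - mean_r
        s_w += centered.T @ centered
        diff = (mean_r - mean_all)[:, None]
        s_b += len(x_r) * (diff @ diff.T)
    s_w += 1e-6 * np.eye(dim)  # small ridge so s_w is invertible
    # Whiten with s_w, then eigen-decompose s_b in the whitened space.
    evals_w, evecs_w = np.linalg.eigh(s_w)
    whiten = evecs_w / np.sqrt(evals_w)  # scales each eigenvector column
    evals_b, evecs_b = np.linalg.eigh(whiten.T @ s_b @ whiten)
    order = np.argsort(evals_b)[::-1][:n_components]
    return whiten @ evecs_b[:, order]

def equal_error_rate(target, nontarget):
    """Operating point where false-accept and false-reject rates meet."""
    thresholds = np.sort(np.concatenate([target, nontarget]))
    # Fraction of same-room trials scoring below each threshold.
    frr = np.searchsorted(np.sort(target), thresholds, side="left") / len(target)
    # Fraction of different-room trials scoring at or above each threshold.
    far = 1.0 - np.searchsorted(np.sort(nontarget), thresholds, side="left") / len(nontarget)
    idx = np.argmin(np.abs(far - frr))
    return 0.5 * (far[idx] + frr[idx])

def room_verification_eer(x, rooms):
    """Score every pair of vectors; a trial is a target if rooms match."""
    scores = -np.linalg.norm(x[:, None, :] - x[None, :, :], axis=2)
    i, j = np.triu_indices(len(x), k=1)
    same_room = rooms[i] == rooms[j]
    return equal_error_rate(scores[i, j][same_room], scores[i, j][~same_room])

# Synthetic stand-in for i-vectors: room offset + speaker offset + noise.
rng = np.random.default_rng(0)
n_rooms, n_speakers, dim, per_pair = 4, 10, 20, 10
room_means = rng.normal(0.0, 1.0, (n_rooms, dim))
speaker_means = rng.normal(0.0, 2.0, (n_speakers, dim))
rooms = np.repeat(np.arange(n_rooms), n_speakers * per_pair)
speakers = np.tile(np.repeat(np.arange(n_speakers), per_pair), n_rooms)
ivectors = room_means[rooms] + speaker_means[speakers] + rng.normal(0.0, 0.5, (len(rooms), dim))

# Alternate samples between a training half and a held-out half.
train = np.arange(len(rooms)) % 2 == 0
projection = fit_lda(ivectors[train], rooms[train], n_rooms - 1)
evectors_test = ivectors[~train] @ projection

eer_raw = room_verification_eer(ivectors[~train], rooms[~train])
eer_evec = room_verification_eer(evectors_test, rooms[~train])
print(f"EER on raw synthetic i-vectors: {eer_raw:.3f}")
print(f"EER on LDA-projected e-vectors: {eer_evec:.3f}")
```

Because LDA is trained with room labels, speaker variation counts as within-class scatter and is suppressed in the projected space, which is why the projected vectors separate rooms more cleanly than the raw ones in this toy setup. The numbers it prints describe only the synthetic data and say nothing about the results reported in the paper.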

Authors (1)

Desmond Caulley

Citation Format

Caulley, D. (2022). An Environmental Feature Representation in I-vector Space for Room Verification and Metadata Estimation. https://arxiv.org/abs/2203.04880

Journal Information
Year Published: 2022
Language: en
Database Source: arXiv
Access: Open Access