Semantic Scholar Open Access 2023 20 citations

HABLA: A Dataset of Latin American Spanish Accents for Voice Anti-spoofing

P. Florez R. Manrique B. Nunes

Abstract

Research on improving automatic speaker verification systems to detect speech spoofing has focused mainly on English, with little attention given to other languages, creating a significant gap in language coverage. This paper introduces HABLA, the first voice anti-spoofing dataset in Spanish, covering Argentinian, Colombian, Peruvian, Venezuelan, and Chilean accents. HABLA comprises over 22,000 authentic speech samples from male and female speakers from five Latin American countries, as well as 58,000 spoof samples generated with six different speech synthesis strategies, including recent voice conversion and text-to-speech algorithms. Finally, initial findings on the efficacy of existing anti-spoofing models are presented, along with concerns regarding their performance in languages other than English.

Topics & Keywords

Computer Science

Authors (3)

P. Florez

R. Manrique

B. Nunes

Citation Format

Florez, P., Manrique, R., & Nunes, B. (2023). HABLA: A Dataset of Latin American Spanish Accents for Voice Anti-spoofing. https://doi.org/10.21437/interspeech.2023-2272

Journal Information

Publication Year: 2023
Language: en
Total Citations: 20
Source Database: Semantic Scholar
DOI: 10.21437/interspeech.2023-2272
Access: Open Access ✓