Semantic Scholar Open Access 2023 20 citations

HABLA: A Dataset of Latin American Spanish Accents for Voice Anti-spoofing

P. Florez, R. Manrique, B. Nunes

Abstract

Research on improving automatic speaker verification systems to detect speech spoofing has focused mainly on English, with little attention given to other languages, creating a significant gap in language coverage. This paper introduces HABLA, the first voice anti-spoofing dataset in Spanish, covering Argentinian, Colombian, Peruvian, Venezuelan, and Chilean accents. HABLA comprises over 22,000 authentic speech samples from male and female speakers from five Latin American countries, as well as 58,000 spoofed samples generated with six different speech synthesis strategies, including recent voice conversion and text-to-speech algorithms. Finally, initial findings on the efficacy of existing anti-spoofing models are presented, along with concerns regarding their performance in languages other than English.

Authors (3)

P. Florez

R. Manrique

B. Nunes

Citation Format

Florez, P., Manrique, R., Nunes, B. (2023). HABLA: A Dataset of Latin American Spanish Accents for Voice Anti-spoofing. https://doi.org/10.21437/interspeech.2023-2272

Journal Information
Year of Publication: 2023
Language: en
Total Citations: 20
Database Source: Semantic Scholar
DOI: 10.21437/interspeech.2023-2272
Access: Open Access