arXiv Open Access 2023

AfriNames: Most ASR models "butcher" African Names

Tobi Olatunji, Tejumade Afonja, Bonaventure F. P. Dossou, Atnafu Lambebo Tonja, Chris Chinenye Emezue, Amina Mardiyyah Rufai, Sahib Singh

Abstract

Useful conversational agents must accurately capture named entities to minimize error for downstream tasks, for example, asking a voice assistant to play a track from a certain artist, initiating navigation to a specific location, or documenting a laboratory result for a patient. However, where named entities such as "Ukachukwu" (Igbo), "Lakicia" (Swahili), or "Ingabire" (Rwandan) are spoken, automatic speech recognition (ASR) models' performance degrades significantly, propagating errors to downstream systems. We model this problem as a distribution shift and demonstrate that such model bias can be mitigated through multilingual pre-training, intelligent data augmentation strategies to increase the representation of African-named entities, and fine-tuning multilingual ASR models on multiple African accents. The resulting fine-tuned models show an 81.5% relative WER improvement compared with the baseline on samples with African-named entities.
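
The abstract reports its headline result as a relative word error rate (WER) improvement. As a minimal sketch, assuming the standard definitions (WER as word-level edit distance divided by reference length, and relative improvement as the WER reduction divided by the baseline WER), the metric can be computed as below. The function names, the example transcripts, and the WER values are hypothetical illustrations and are not taken from the paper.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level Levenshtein distance divided by the reference length."""
    ref = reference.lower().split()
    hyp = hypothesis.lower().split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference prefix
    # processed so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr.append(min(
                prev[j] + 1,         # deletion of a reference word
                curr[j - 1] + 1,     # insertion of a hypothesis word
                prev[j - 1] + cost,  # substitution or exact match
            ))
        prev = curr
    return prev[-1] / len(ref)


def relative_wer_improvement(baseline_wer: float, finetuned_wer: float) -> float:
    """Fraction of the baseline WER removed by the fine-tuned model."""
    if baseline_wer <= 0:
        raise ValueError("baseline WER must be positive")
    return (baseline_wer - finetuned_wer) / baseline_wer


# Hypothetical transcripts: the name is split into three wrong words.
wer = word_error_rate(
    "play a song by ukachukwu",
    "play a song by you catch who",
)
print(f"example WER: {wer:.2f}")

# Hypothetical WER values, chosen only to illustrate the arithmetic.
gain = relative_wer_improvement(baseline_wer=0.50, finetuned_wer=0.10)
print(f"relative improvement: {gain:.1%}")
```

A relative figure depends on the baseline: the same absolute WER reduction yields a larger relative improvement when the baseline WER is lower, so a relative number is easiest to interpret alongside the absolute WER values.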

Authors (7)

Tobi Olatunji

Tejumade Afonja

Bonaventure F. P. Dossou

Atnafu Lambebo Tonja

Chris Chinenye Emezue

Amina Mardiyyah Rufai

Sahib Singh

Citation Format

Olatunji, T., Afonja, T., Dossou, B.F.P., Tonja, A.L., Emezue, C.C., Rufai, A.M. et al. (2023). AfriNames: Most ASR models "butcher" African Names. https://arxiv.org/abs/2306.00253

Journal Information
Publication Year: 2023
Language: en
Source Database: arXiv
Access: Open Access