arXiv Open Access 2025

SSA-COMET: Do LLMs Outperform Learned Metrics in Evaluating MT for Under-Resourced African Languages?

Senyu Li, Jiayi Wang, Felermino D. M. A. Ali, Colin Cherry, Daniel Deutsch, and 5 others

Abstract

Evaluating machine translation (MT) quality for under-resourced African languages remains a significant challenge, as existing metrics often suffer from limited language coverage and poor performance in low-resource settings. While recent efforts such as AfriCOMET have addressed some of these issues, they are still constrained by small evaluation sets, a lack of publicly available training data tailored to African languages, and inconsistent performance in extremely low-resource scenarios. In this work, we introduce SSA-MTE, a large-scale human-annotated MT evaluation (MTE) dataset covering 14 African language pairs from the News domain, with over 73,000 sentence-level annotations from a diverse set of MT systems. Based on this data, we develop SSA-COMET and SSA-COMET-QE, improved reference-based and reference-free evaluation metrics, respectively. We also benchmark prompting-based approaches using state-of-the-art LLMs such as GPT-4o, Claude-3.7, and Gemini 2.5 Pro. Our experimental results show that SSA-COMET models significantly outperform AfriCOMET and are competitive with Gemini 2.5 Pro, the strongest LLM evaluated in our study, particularly on low-resource languages such as Twi, Luo, and Yoruba. All resources are released under open licenses to support future research.

Authors (10)

Senyu Li

Jiayi Wang

Felermino D. M. A. Ali

Colin Cherry

Daniel Deutsch

Eleftheria Briakou

Rui Sousa-Silva

Henrique Lopes Cardoso

Pontus Stenetorp

David Ifeoluwa Adelani

Citation Format

Li, S., Wang, J., Ali, F.D.M.A., Cherry, C., Deutsch, D., Briakou, E. et al. (2025). SSA-COMET: Do LLMs Outperform Learned Metrics in Evaluating MT for Under-Resourced African Languages? https://arxiv.org/abs/2506.04557

Journal Information
Publication Year: 2025
Language: en
Source Database: arXiv
Access: Open Access