arXiv Open Access 2025

Feeding Two Birds or Favoring One? Adequacy-Fluency Tradeoffs in Evaluation and Meta-Evaluation of Machine Translation

Behzad Shayegh, Jan-Thorsten Peter, David Vilar, Tobias Domhan, Juraj Juraska, +2 others

Abstract

We investigate the tradeoff between adequacy and fluency in machine translation. We show the severity of this tradeoff at the evaluation level and analyze where popular metrics fall within it. Essentially, current metrics generally lean toward adequacy, meaning that their scores correlate more strongly with the adequacy of translations than with fluency. More importantly, we find that this tradeoff also persists at the meta-evaluation level, and that the standard WMT meta-evaluation favors adequacy-oriented metrics over fluency-oriented ones. We show that this bias is partially attributed to the composition of the systems included in the meta-evaluation datasets. To control this bias, we propose a method that synthesizes translation systems in meta-evaluation. Our findings highlight the importance of understanding this tradeoff in meta-evaluation and its impact on metric rankings.
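As a rough illustration of what it means for a metric to "lean toward adequacy", the sketch below compares how strongly a metric's scores correlate with human adequacy ratings versus human fluency ratings. It is a minimal sketch and not the authors' procedure: the function names (`pearson`, `metric_leaning`), the choice of Pearson correlation, and all of the numbers are hypothetical and chosen only to show the comparison.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equal-length sequences with at least 2 items")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / sqrt(var_x * var_y)


def metric_leaning(metric_scores, adequacy, fluency):
    """Report which human rating a metric's scores track more closely.

    Hypothetical helper: compares the metric's correlation with adequacy
    ratings against its correlation with fluency ratings.
    """
    r_adequacy = pearson(metric_scores, adequacy)
    r_fluency = pearson(metric_scores, fluency)
    if r_adequacy > r_fluency:
        leaning = "adequacy"
    elif r_fluency > r_adequacy:
        leaning = "fluency"
    else:
        leaning = "neither"
    return {
        "adequacy_corr": r_adequacy,
        "fluency_corr": r_fluency,
        "leans_toward": leaning,
    }


# Toy, made-up ratings for five translations (not data from the paper).
adequacy_ratings = [1, 2, 3, 4, 5]
fluency_ratings = [5, 3, 4, 1, 2]
metric_scores = [1.1, 2.0, 2.9, 4.2, 5.0]

result = metric_leaning(metric_scores, adequacy_ratings, fluency_ratings)
print(result["leans_toward"])
```

In this toy setup the metric tracks the adequacy ratings almost linearly while moving against the fluency ratings, so it is reported as adequacy-leaning. A real analysis would use actual human annotations and whatever correlation statistic the evaluation protocol prescribes.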

Authors (7)

Behzad Shayegh
Jan-Thorsten Peter
David Vilar
Tobias Domhan
Juraj Juraska
Markus Freitag
Lili Mou

Citation Format

Shayegh, B., Peter, J.-T., Vilar, D., Domhan, T., Juraska, J., Freitag, M., & Mou, L. (2025). Feeding Two Birds or Favoring One? Adequacy-Fluency Tradeoffs in Evaluation and Meta-Evaluation of Machine Translation. https://arxiv.org/abs/2509.20287

Journal Information
Publication Year: 2025
Language: en
Database Source: arXiv
Access: Open Access ✓