Semantic Scholar · Open Access · 2024 · 18 citations

Large Language Models "Ad Referendum": How Good Are They at Machine Translation in the Legal Domain?

Vicent Briva-Iglesias J. Camargo Gokhan Dogru

Abstract

This study evaluates the machine translation (MT) quality of two state-of-the-art large language models (LLMs) against a traditional neural machine translation (NMT) system across four language pairs in the legal domain. It combines automatic evaluation metrics (AEMs) and human evaluation (HE) by professional translators to assess translation ranking, fluency and adequacy. The results indicate that while Google Translate generally outperforms LLMs in AEMs, human evaluators rate LLMs, especially GPT-4, comparably or slightly better in terms of producing contextually adequate and fluent translations. This discrepancy suggests LLMs' potential in handling specialized legal terminology and context, highlighting the importance of human evaluation methods in assessing MT quality. The study underscores the evolving capabilities of LLMs in specialized domains and calls for reevaluation of traditional AEMs to better capture the nuances of LLM-generated translations.

Topics & Keywords

Computer Science

Authors (3)

Vicent Briva-Iglesias

J. Camargo

Gokhan Dogru

Citation Format

Briva-Iglesias, V., Camargo, J., & Dogru, G. (2024). Large Language Models "Ad Referendum": How Good Are They at Machine Translation in the Legal Domain? https://doi.org/10.48550/arXiv.2402.07681

Journal Information

Publication Year: 2024
Language: en
Total Citations: 18
Source Database: Semantic Scholar
DOI: 10.48550/arXiv.2402.07681
Access: Open Access ✓