arXiv Open Access 2025

A New NMT Model for Translating Clinical Texts from English to Spanish

Rumeng Li Xun Wang Hong Yu

Abstract

Translating electronic health record (EHR) narratives from English to Spanish is a clinically important yet challenging task because parallel-aligned corpora are scarce and clinical text contains many unknown (out-of-vocabulary, OOV) words. To address these challenges, we propose NOOV (for "No OOV"), a new neural machine translation (NMT) system that requires little in-domain parallel-aligned data for training. NOOV integrates a bilingual lexicon automatically learned from parallel-aligned corpora and a phrase look-up table extracted from a large biomedical knowledge resource. Together these alleviate both the unknown-word problem and the word-repeat problem in NMT and improve phrase generation. Evaluation shows that NOOV generates better translations of EHR text, with improvements in both accuracy and fluency.
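The abstract does not describe how the lexicon and the phrase look-up table are wired into the model, so the following is only a minimal sketch of one generic way such resources can be used: post-hoc replacement of unknown tokens via source alignment. It is not the NOOV implementation. The function names, the `<unk>` marker, the alignment format, and the toy lexicon and phrase entries are all assumptions made for illustration.

```python
from typing import Dict, List, Tuple

UNK = "<unk>"  # assumed placeholder the NMT decoder emits for unknown words


def match_phrases(
    source_tokens: List[str],
    phrase_table: Dict[Tuple[str, ...], str],
    max_len: int = 4,
) -> Dict[int, Tuple[int, str]]:
    """Greedy longest-match lookup; maps start index -> (span length, translation)."""
    matches: Dict[int, Tuple[int, str]] = {}
    i = 0
    while i < len(source_tokens):
        for n in range(min(max_len, len(source_tokens) - i), 0, -1):
            key = tuple(t.lower() for t in source_tokens[i:i + n])
            if key in phrase_table:
                matches[i] = (n, phrase_table[key])
                i += n
                break
        else:
            i += 1  # no phrase starts here
    return matches


def replace_unknowns(
    source_tokens: List[str],
    output_tokens: List[str],
    alignment: Dict[int, int],  # output index -> aligned source index (assumed given)
    lexicon: Dict[str, str],
    phrase_table: Dict[Tuple[str, ...], str],
) -> List[str]:
    """Replace each UNK with a phrase-table or lexicon translation of its source word."""
    matches = match_phrases(source_tokens, phrase_table)
    # Map every source position covered by a matched phrase to the phrase start.
    covered = {j: s for s, (n, _) in matches.items() for j in range(s, s + n)}
    emitted = set()  # phrases already written, so each is produced only once
    result: List[str] = []
    for out_idx, tok in enumerate(output_tokens):
        if tok != UNK:
            result.append(tok)
            continue
        src_idx = alignment.get(out_idx)
        if src_idx is None:
            result.append(tok)  # nothing to align to; leave the marker
        elif src_idx in covered:
            start = covered[src_idx]
            if start not in emitted:
                result.append(matches[start][1])
                emitted.add(start)
        else:
            src = source_tokens[src_idx]
            # Fall back to copying the source word if the lexicon has no entry.
            result.append(lexicon.get(src.lower(), src))
    return result


# Toy example with made-up entries.
source = ["patient", "has", "atrial", "fibrillation", "and", "dyspnea"]
output = ["paciente", "tiene", UNK, UNK, "y", UNK]
fixed = replace_unknowns(
    source,
    output,
    alignment={2: 2, 3: 3, 5: 5},
    lexicon={"dyspnea": "disnea"},
    phrase_table={("atrial", "fibrillation"): "fibrilación auricular"},
)
print(" ".join(fixed))
```

In this sketch a multi-word phrase is emitted once even when several unknown tokens align to it, which is one simple way to avoid repeated output for a single source phrase.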

Authors (3)

Rumeng Li

Xun Wang

Hong Yu

Citation Format

Li, R., Wang, X., Yu, H. (2025). A New NMT Model for Translating Clinical Texts from English to Spanish. https://arxiv.org/abs/2508.18607

Journal Information
Publication Year
2025
Language
en
Source Database
arXiv
Access
Open Access ✓