Semantic Scholar Open Access 2021 235 citations

Contrastive Learning for Many-to-many Multilingual Neural Machine Translation

Xiao Pan, Mingxuan Wang, Liwei Wu, Lei Li

Abstract

Existing multilingual machine translation approaches mainly focus on English-centric directions, while the non-English directions still lag behind. In this work, we aim to build a many-to-many translation system with an emphasis on the quality of non-English language directions. Our intuition is based on the hypothesis that a universal cross-language representation leads to better multilingual translation performance. To this end, we propose mRASP2, a training method to obtain a single unified multilingual translation model. mRASP2 is empowered by two techniques: a) a contrastive learning scheme to close the gap among representations of different languages, and b) data augmentation on both multiple parallel and monolingual data to further align token representations. For English-centric directions, mRASP2 achieves competitive or even better performance than the strong pre-trained model mBART on tens of WMT benchmarks. For non-English directions, mRASP2 achieves an average improvement of 10+ BLEU over the multilingual baseline.
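
To illustrate what "closing the gap among representations" can mean in practice, here is a minimal sketch of a generic InfoNCE-style contrastive loss over sentence vectors. The abstract does not give the exact objective used by mRASP2, so everything below is an assumption for illustration only: the function names `cosine` and `contrastive_loss`, the `temperature` value, and the toy two-dimensional vectors are all hypothetical. The loss is low when each source sentence vector is closest to its own translation and high when it is closer to another sentence in the batch.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def contrastive_loss(src_vecs, tgt_vecs, temperature=0.1):
    """Generic InfoNCE-style loss (illustrative, not the paper's exact objective).

    src_vecs[i] and tgt_vecs[i] are assumed to encode a translation pair;
    every other vector in tgt_vecs serves as a negative for src_vecs[i].
    """
    total = 0.0
    for i, anchor in enumerate(src_vecs):
        logits = [cosine(anchor, t) / temperature for t in tgt_vecs]
        # Numerically stable log-sum-exp over positive and negatives.
        m = max(logits)
        log_denom = m + math.log(sum(math.exp(x - m) for x in logits))
        # Negative log-probability of picking the true translation.
        total += log_denom - logits[i]
    return total / len(src_vecs)


# Toy sentence vectors (hypothetical): two "source" and two "target" sentences.
aligned_src = [[1.0, 0.0], [0.0, 1.0]]
aligned_tgt = [[0.9, 0.1], [0.1, 0.9]]   # each target is near its source
shuffled_tgt = [[0.1, 0.9], [0.9, 0.1]]  # pairs are mismatched

loss_aligned = contrastive_loss(aligned_src, aligned_tgt)
loss_shuffled = contrastive_loss(aligned_src, shuffled_tgt)
print(loss_aligned < loss_shuffled)
```

In a real training setup the vectors would come from the translation model's encoder and the loss would be minimized jointly with the translation objective; the sketch only shows how such a term rewards aligned representations across languages.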

Topics & Keywords

Computer Science

Authors (4)

Xiao Pan

Mingxuan Wang

Liwei Wu

Lei Li

Citation Format

Pan, X., Wang, M., Wu, L., & Li, L. (2021). Contrastive Learning for Many-to-many Multilingual Neural Machine Translation. https://doi.org/10.18653/v1/2021.acl-long.21

Journal Information

Publication Year: 2021
Language: en
Total Citations: 235
Database Source: Semantic Scholar
DOI: 10.18653/v1/2021.acl-long.21
Access: Open Access ✓