Semantic Scholar | Open Access | 2021 | 235 citations

Contrastive Learning for Many-to-many Multilingual Neural Machine Translation

Xiao Pan, Mingxuan Wang, Liwei Wu, Lei Li

Abstract

Existing multilingual machine translation approaches mainly focus on English-centric directions, while the non-English directions still lag behind. In this work, we aim to build a many-to-many translation system with an emphasis on the quality of non-English language directions. Our intuition is based on the hypothesis that a universal cross-language representation leads to better multilingual translation performance. To this end, we propose mRASP2, a training method to obtain a single unified multilingual translation model. mRASP2 is empowered by two techniques: a) a contrastive learning scheme to close the gap among representations of different languages, and b) data augmentation on both multiple parallel and monolingual data to further align token representations. For English-centric directions, mRASP2 achieves competitive or even better performance than a strong pre-trained model mBART on tens of WMT benchmarks. For non-English directions, mRASP2 achieves an improvement of average 10+ BLEU compared with the multilingual baseline.
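
The contrastive idea mentioned in the abstract can be illustrated with a generic InfoNCE-style loss. This is a minimal sketch and not the paper's exact formulation: the function names `cosine` and `contrastive_loss`, the temperature value, and the toy 2-dimensional vectors (standing in for encoder sentence representations) are all assumptions made for illustration. Each source sentence is pulled toward its own translation and pushed away from the other target sentences in the batch.

```python
import math

def cosine(u, v):
    # Cosine similarity between two equal-length vectors.
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

def contrastive_loss(src_reprs, tgt_reprs, temperature=0.1):
    # Hypothetical helper: src_reprs[i] and tgt_reprs[i] form a translation
    # pair (positive); every other tgt_reprs[j] acts as a negative.
    total = 0.0
    for i, anchor in enumerate(src_reprs):
        logits = [cosine(anchor, t) / temperature for t in tgt_reprs]
        # Numerically stable log-sum-exp over all candidates.
        m = max(logits)
        log_denom = m + math.log(sum(math.exp(l - m) for l in logits))
        # Negative log-probability of picking the true translation.
        total += log_denom - logits[i]
    return total / len(src_reprs)

# Toy batch: representations that agree across languages vs. ones that do not.
aligned = contrastive_loss([[1.0, 0.0], [0.0, 1.0]], [[1.0, 0.0], [0.0, 1.0]])
misaligned = contrastive_loss([[1.0, 0.0], [0.0, 1.0]], [[0.0, 1.0], [1.0, 0.0]])
print(aligned < misaligned)
```

In this toy setting the loss is close to zero when paired sentences share a representation and grows when they do not, which is the sense in which such an objective can close the gap among representations of different languages.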

Authors (4)

Xiao Pan

Mingxuan Wang

Liwei Wu

Lei Li

Citation Format

Pan, X., Wang, M., Wu, L., & Li, L. (2021). Contrastive Learning for Many-to-many Multilingual Neural Machine Translation. https://doi.org/10.18653/v1/2021.acl-long.21

Journal Information
Publication Year
2021
Language
en
Total Citations
235
Database Source
Semantic Scholar
DOI
10.18653/v1/2021.acl-long.21
Access
Open Access ✓