arXiv Open Access 2022

MALM: Mixing Augmented Language Modeling for Zero-Shot Machine Translation

Kshitij Gupta

Abstract

Large pre-trained language models have brought remarkable progress in NLP. Pre-training and fine-tuning have delivered state-of-the-art performance across text-processing tasks. Data augmentation techniques have also helped build state-of-the-art models for low-resource and zero-resource tasks. Many earlier works have attempted to learn a single massively multilingual machine translation model for zero-shot translation. Although those models produce correct translations, their main challenge is that they translate into the wrong language in the zero-shot setting. This work and its results indicate that prompt-conditioned large models do not suffer from off-target language errors, i.e., errors that arise from translating into the wrong language. We empirically demonstrate the effectiveness of self-supervised pre-training and data augmentation for zero-shot multilingual machine translation.

Topics & Keywords

cs.CL cs.LG

Authors (1)

Kshitij Gupta

Citation Format

Gupta, K. (2022). MALM: Mixing Augmented Language Modeling for Zero-Shot Machine Translation. https://arxiv.org/abs/2210.00320

Journal Information

Publication Year: 2022
Language: en
Source Database: arXiv
Access: Open Access ✓