arXiv Open Access 2024

QueEn: A Large Language Model for Quechua-English Translation

Junhao Chen, Peng Shu, Yiwei Li, Huaqin Zhao, Hanqi Jiang, +5 more

Abstract

Recent studies show that large language models (LLMs) are powerful tools for working with natural language, bringing advances in many areas of computational linguistics. However, these models face challenges when applied to low-resource languages due to limited training data and difficulty in understanding cultural nuances. In this paper, we propose QueEn, a novel approach for Quechua-English translation that combines Retrieval-Augmented Generation (RAG) with parameter-efficient fine-tuning techniques. Our method leverages external linguistic resources through RAG and uses Low-Rank Adaptation (LoRA) for efficient model adaptation. Experimental results show that our approach substantially outperforms baseline models, with a BLEU score of 17.6 compared to 1.5 for standard GPT models. The integration of RAG with fine-tuning allows our system to address the challenges of low-resource language translation while maintaining computational efficiency. This work contributes to the broader goal of preserving endangered languages through advanced language technologies.
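The abstract names Low-Rank Adaptation (LoRA) as the parameter-efficient fine-tuning component. A minimal numpy sketch of the LoRA idea follows; it is an illustration of the general technique, not the authors' implementation, and the dimensions, rank, and scaling factor below are illustrative assumptions rather than values from the paper.

```python
import numpy as np

# Sketch of Low-Rank Adaptation (LoRA): instead of updating a full
# frozen weight matrix W (d_out x d_in), LoRA trains a low-rank update
# B @ A with rank r << min(d_out, d_in), so the number of trainable
# parameters drops from d_in*d_out to r*(d_in + d_out).
# All sizes below are illustrative, not from the paper.
d_in, d_out, r = 512, 512, 8
rng = np.random.default_rng(0)

W = rng.normal(size=(d_out, d_in))      # frozen pretrained weight
A = rng.normal(size=(r, d_in)) * 0.01   # trainable down-projection
B = np.zeros((d_out, r))                # zero-init, so the adapted model
                                        # starts identical to the base model
alpha = 16                              # LoRA scaling hyperparameter

def adapted_forward(x):
    """Forward pass with the effective weight W + (alpha / r) * B @ A."""
    return W @ x + (alpha / r) * (B @ (A @ x))

x = rng.normal(size=(d_in,))
# With B = 0, the adapted output equals the frozen model's output.
assert np.allclose(adapted_forward(x), W @ x)

full_params = d_in * d_out
lora_params = r * (d_in + d_out)
print(f"trainable params: {lora_params} vs {full_params}")
```

Only A and B would be updated during fine-tuning; here that is 8192 parameters versus 262144 for the full matrix, which is what makes adapting a large model to a low-resource pair like Quechua-English computationally cheap.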


Authors (10)

Junhao Chen, Peng Shu, Yiwei Li, Huaqin Zhao, Hanqi Jiang, Yi Pan, Yifan Zhou, Zhengliang Liu, Lewis C Howe, Tianming Liu

Citation Format

Chen, J., Shu, P., Li, Y., Zhao, H., Jiang, H., Pan, Y. et al. (2024). QueEn: A Large Language Model for Quechua-English Translation. https://arxiv.org/abs/2412.05184

Journal Information
Publication Year
2024
Language
en
Source Database
arXiv
Access
Open Access ✓