Designing AI-powered translation education tools: a framework for parallel sentence generation using SauLTC and LLMs
Abstract
Translation education (TE) is labor-intensive and demands significant effort from educators. Computational tools powered by artificial intelligence (AI) can alleviate this burden by automating repetitive tasks, allowing instructors to focus on higher-level pedagogical aspects of translation, and so have the potential to make translation education more efficient and effective. However, the development of effective AI-based tools for TE is hampered by a lack of high-quality, comprehensive datasets tailored to this need, especially for Arabic. While the Saudi Learner Translation Corpus (SauLTC), a unidirectional English-to-Arabic parallel corpus, is a valuable resource, its current format is inadequate for generating the parallel sentences required for a didactic translation corpus. This article proposes leveraging large language models such as the Generative Pre-trained Transformer (GPT) to transform SauLTC into a parallel sentence corpus. We assessed the quality of the generated parallel sentences using cosine similarity and human evaluation, achieving an 85.2% similarity score using Language-agnostic BERT Sentence Embedding (LaBSE) in conjunction with GPT, which outperformed the other embedding models investigated. The results demonstrate the potential of AI to address critical dataset challenges in pursuit of effective data-driven solutions to support translation education.
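The cosine-similarity check mentioned in the abstract can be illustrated with a minimal sketch. It is not the authors' pipeline: the vectors below are hypothetical toy values standing in for sentence embeddings (which in the paper's setting would come from a model such as LaBSE), and the helper names `cosine_similarity` and `mean_pair_similarity` are assumptions introduced here. The abstract does not say how per-pair scores were aggregated into the reported figure, so the simple mean used here is also an assumption.

```python
import math


def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        # Convention chosen for this sketch: a zero vector matches nothing.
        return 0.0
    return dot / (norm_u * norm_v)


def mean_pair_similarity(source_vecs, target_vecs):
    """Average cosine similarity over aligned (source, target) embedding pairs."""
    if len(source_vecs) != len(target_vecs):
        raise ValueError("source and target must contain the same number of vectors")
    if not source_vecs:
        raise ValueError("at least one sentence pair is required")
    scores = [cosine_similarity(s, t) for s, t in zip(source_vecs, target_vecs)]
    return sum(scores) / len(scores)


# Hypothetical toy embeddings: one row per sentence, aligned by index.
# Real embeddings would have hundreds of dimensions.
english_vecs = [[1.0, 0.0, 0.0], [0.0, 1.0, 1.0]]
arabic_vecs = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]]

score = mean_pair_similarity(english_vecs, arabic_vecs)
print(f"mean cosine similarity: {score:.1%}")
```

A score near 100% means the paired sentences point in nearly the same direction in embedding space; with a multilingual encoder this is commonly read as evidence that the two sentences carry similar meaning, which is why the abstract pairs it with human evaluation rather than relying on it alone.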
Authors (8)
Moneerh Aleedy
Fatma Alshihri
Souham Meshoul
Maha Al-Harthi
Salwa Alramlawi
Badr Aldaihani
Hadil Shaiba
Eric Atwell
Quick Access
- Publication Year: 2025
- Language: en
- Total Citations: 6
- Source Database: CrossRef
- DOI: 10.7717/peerj-cs.2788
- Access: Open Access