arXiv Open Access 2025

Controlling Language Difficulty in Dialogues with Linguistic Features

Shuyao Xu, Wenguang Wang, Handong Gao, Wei Kang, Long Qin, Weizhi Wang

Abstract

Large language models (LLMs) have emerged as powerful tools for supporting second language acquisition, particularly in simulating interactive dialogues for speaking practice. However, adapting the language difficulty of LLM-generated responses to match learners' proficiency levels remains a challenge. This work addresses the issue by proposing a framework for controlling language proficiency in educational dialogue systems. Our approach leverages three categories of linguistic features: readability features (e.g., Flesch-Kincaid Grade Level), syntactic features (e.g., syntactic tree depth), and lexical features (e.g., simple word ratio), to quantify and regulate text complexity. We demonstrate that training LLMs on linguistically annotated dialogue data enables precise modulation of language proficiency, outperforming prompt-based methods in both flexibility and stability. For evaluation, we introduce Dilaprix, a novel metric integrating the aforementioned features, which shows strong correlation with expert judgments of language difficulty. Empirical results reveal that our approach achieves superior controllability of language proficiency while maintaining high dialogue quality.
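As an illustration of the readability category mentioned in the abstract, here is a minimal sketch of the Flesch-Kincaid Grade Level computation (0.39 × words-per-sentence + 11.8 × syllables-per-word − 15.59). The syllable counter is a rough vowel-group heuristic for illustration only, not the paper's implementation:

```python
import re

def count_syllables(word: str) -> int:
    # Rough heuristic: one syllable per run of consecutive vowels
    # (treating 'y' as a vowel), with a minimum of one.
    return max(1, len(re.findall(r"[aeiouy]+", word.lower())))

def fk_grade_level(text: str) -> float:
    """Flesch-Kincaid Grade Level:
    0.39 * (words/sentences) + 11.8 * (syllables/words) - 15.59
    """
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    words = re.findall(r"[A-Za-z']+", text)
    syllables = sum(count_syllables(w) for w in words)
    return (0.39 * (len(words) / len(sentences))
            + 11.8 * (syllables / len(words))
            - 15.59)

simple = "The cat sat on the mat. It was happy."
complex_ = ("Notwithstanding considerable methodological heterogeneity, "
            "the investigation corroborated preliminary hypotheses.")
print(fk_grade_level(simple), fk_grade_level(complex_))
```

A controller like the one the paper describes would combine such scores with syntactic and lexical signals; this sketch only shows why simpler sentences yield lower grade-level estimates.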

Authors (6)

Shuyao Xu
Wenguang Wang
Handong Gao
Wei Kang
Long Qin
Weizhi Wang

Citation Format

Xu, S., Wang, W., Gao, H., Kang, W., Qin, L., & Wang, W. (2025). Controlling Language Difficulty in Dialogues with Linguistic Features. arXiv. https://arxiv.org/abs/2509.14545

Journal Information
Publication Year: 2025
Language: en
Source Database: arXiv
Access: Open Access ✓