DOAJ Open Access 2026

Challenges of using generative artificial intelligence for diabetes patient education: a cross-platform analysis of the quality, readability, and actionability of text generated by large language models

Zhiqiang Wang, Zhiqiang Wang, Xiaoya Li, Xianglan Tao, Jie Li, +3 others

Abstract

Objective: To compare, across large language model (LLM) platforms, the quality, readability, and completeness of action-oriented instructions in diabetes self-management education texts, and to quantify the associations among these domains to inform model selection and risk mitigation.

Methods: Ten LLM platforms were used to generate diabetes education texts (total n = 200), stratified by topic. Outcomes included the Global Quality Score (GQS), the Patient Education Materials Assessment Tool for Printable Materials (PEMAT-P), and EQIP-36 (Ensuring Quality Information for Patients, 36-item version). Text characteristics, including word count, sentence count, and syllable count, were recorded. Readability was assessed using the Automated Readability Index (ARI), Coleman–Liau Index (CLI), Flesch–Kincaid Grade Level (FKGL), Flesch Reading Ease Score (FRES), Gunning Fog Index (GFOG), Linsear Write (LW), and the Simple Measure of Gobbledygook (SMOG). Between-platform differences were evaluated using one-way ANOVA or the Kruskal–Wallis test, as appropriate. Associations between readability indices and GQS, PEMAT-P, and EQIP-36 were examined using correlation heat maps and exploratory stepwise multiple linear regression. Because the readability indices were highly intercorrelated, these regression analyses were considered exploratory and were used to identify candidate readability-related correlates rather than definitive independent predictors.

Results: GQS and PEMAT-P differed significantly across platforms (both p < 0.001), whereas EQIP-36 did not (p = 0.062). Text length and readability also varied by platform (most p < 0.001). After stratification by topic, PEMAT-P understandability, PEMAT-P total score, and GQS no longer differed significantly across topics (p = 0.356, p = 0.247, and p = 0.182, respectively), whereas PEMAT-P actionability (p < 0.001), EQIP-36 (p < 0.001), and several readability metrics remained significantly different. Difficulty indices were strongly intercorrelated, and FRES was inversely associated with multiple difficulty indices. Exploratory regression analyses suggested that greater reading burden tended to co-occur with lower GQS, PEMAT-P, and EQIP-36 scores.

Conclusion: LLM-generated diabetes education texts exhibit marked cross-platform heterogeneity, and exploratory analyses suggest a potential trade-off between readability and both information quality and the completeness of action-oriented instructions. Clinical implementation should therefore combine careful platform selection, structured prompting with templates, human–AI review, and continuous quality monitoring to support safe, readable, and actionable patient education.
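
Illustrative note: the abstract reports that FRES was inversely associated with the grade-level difficulty indices. A minimal sketch of why this is expected, using the standard published formulas for FRES and FKGL applied to the three text characteristics the study recorded (word, sentence, and syllable counts). The function names and the two sets of counts are hypothetical and are not taken from the study; the authors' actual tooling is not stated.

```python
def flesch_reading_ease(words: int, sentences: int, syllables: int) -> float:
    """Flesch Reading Ease Score: higher values mean easier text."""
    return 206.835 - 1.015 * (words / sentences) - 84.6 * (syllables / words)


def flesch_kincaid_grade(words: int, sentences: int, syllables: int) -> float:
    """Flesch-Kincaid Grade Level: higher values mean harder text."""
    return 0.39 * (words / sentences) + 11.8 * (syllables / words) - 15.59


# Hypothetical counts for two 100-word passages (not data from the paper).
simple_text = {"words": 100, "sentences": 10, "syllables": 130}
dense_text = {"words": 100, "sentences": 4, "syllables": 180}

for label, counts in (("simple", simple_text), ("dense", dense_text)):
    fres = flesch_reading_ease(**counts)
    fkgl = flesch_kincaid_grade(**counts)
    print(f"{label}: FRES={fres:.2f}, FKGL={fkgl:.2f}")
```

Both formulas use the same two ratios (words per sentence and syllables per word) with opposite signs, so longer sentences and longer words lower FRES while raising FKGL. This shared structure is also one reason the readability indices are highly intercorrelated.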

Topics & Keywords

Authors (8)

Zhiqiang Wang
Zhiqiang Wang
Xiaoya Li
Xianglan Tao
Jie Li
Li Zhang
Xiaorong He
Jing Yang

Citation Format

Wang, Z., Wang, Z., Li, X., Tao, X., Li, J., Zhang, L. et al. (2026). Challenges of using generative artificial intelligence for diabetes patient education: a cross-platform analysis of the quality, readability, and actionability of text generated by large language models. https://doi.org/10.3389/fpubh.2026.1804524

Journal Information

Publication Year: 2026
Source Database: DOAJ
DOI: 10.3389/fpubh.2026.1804524
Access: Open Access