DOAJ Open Access 2025

Assessing the adherence of large language models to clinical practice guidelines in Chinese medicine: a content analysis

Weilong Zhao, Honghao Lai, Bei Pan, Jiajie Huang, Danni Xia, and 18 others

Abstract

Objective: Whether large language models (LLMs) can effectively facilitate Chinese medicine (CM) knowledge acquisition remains uncertain. This study aims to assess the adherence of LLMs to Clinical Practice Guidelines (CPGs) in CM.

Methods: This cross-sectional study randomly selected ten CPGs in CM and constructed 150 questions across three categories: medication based on differential diagnosis (MDD), specific prescription consultation (SPC), and CM theory analysis (CTA). Eight LLMs (GPT-4o, Claude-3.5 Sonnet, Moonshot-v1, ChatGLM-4, DeepSeek-v3, DeepSeek-r1, Claude-4 Sonnet, and Claude-4 Sonnet Thinking) were evaluated using both English and Chinese queries. The main evaluation metrics were accuracy, readability, and use of safety disclaimers.

Results: Overall, DeepSeek-v3 and DeepSeek-r1 demonstrated superior performance in both English (median 5.00, interquartile range (IQR) 4.00–5.00 vs. median 5.00, IQR 3.70–5.00) and Chinese (both median 5.00, IQR 4.30–5.00), significantly outperforming all other models. All models achieved significantly higher accuracy in Chinese than in English responses (all p < 0.05). Accuracy varied significantly across question categories, with MDD and SPC questions proving more challenging than CTA questions. English responses had lower readability (mean Flesch Reading Ease score 32.7) than Chinese responses. Moonshot-v1 provided the highest rate of safety disclaimers (98.7% English, 100% Chinese).

Conclusion: LLMs showed varying degrees of potential for acquiring CM knowledge. The performance of DeepSeek-v3 and DeepSeek-r1 is satisfactory. Optimizing LLMs to become effective tools for disseminating CM information is an important direction for future development.

Authors (23)

Weilong Zhao
Honghao Lai
Bei Pan
Jiajie Huang
Danni Xia
Chunyang Bai
Jiayi Liu
Jiayi Liu
Jianing Liu
Yinghui Jin
Hongcai Shang
Jianping Liu
Nannan Shi
Jie Liu
Yaolong Chen
Yaolong Chen
Yaolong Chen
Yaolong Chen
Janne Estill
Janne Estill
Long Ge
Long Ge
Long Ge

Citation Format

Zhao, W., Lai, H., Pan, B., Huang, J., Xia, D., Bai, C. et al. (2025). Assessing the adherence of large language models to clinical practice guidelines in Chinese medicine: a content analysis. https://doi.org/10.3389/fphar.2025.1649041

Journal Information

Publication year: 2025
Source database: DOAJ
DOI: 10.3389/fphar.2025.1649041
Access: Open Access