DOAJ Open Access 2026

Comparison of large language models and expert multidisciplinary team decisions in colorectal cancer

Lei Huang, Chen Wu, Tingting Sun, Xiaotong Hou, Aiwen Wu, +5 others

Abstract

Objectives: To evaluate the ability of large language models (LLMs) to simulate multidisciplinary team (MDT) decision-making in colorectal cancer, a malignancy that often requires complex treatment planning.

Methods: We retrospectively analysed 1423 colorectal cancer cases discussed at MDT meetings at Peking University Cancer Hospital between January 2023 and December 2024. Three LLMs (OpenAI o3-mini-2025-01-31, DeepSeek-R1 671b and Qwen qwq-plus-2025-03-05) were tested for their ability to replicate MDT recommendations using a standardised treatment categorisation framework. Each case was processed three times per model; only cases with consistent outputs across all three runs were included. Concordance between AI-generated decisions and expert MDT consensus was assessed using agreement percentages and Cohen’s kappa.

Results: O3 demonstrated the highest intramodel stability, with an agreement rate of 81.0% (Fleiss’ kappa=0.794), yielding 1153 cases with consistent outputs. Concordance with MDT consensus was comparable across the three models, ranging from 62.5% to 65.4%. Multivariable analysis of O3 outputs identified treatment-naïve status, non-metastatic disease and colon tumour location as independent predictors of higher concordance with experts.

Discussion: LLMs showed fair overall agreement with expert MDT decisions, with stronger performance in standardised and less complex clinical scenarios. Areas of higher concordance included treatment-naïve non-metastatic colon cancer, treated non-metastatic rectal cancer and treated non-metastatic colon cancer.

Conclusion: LLMs can partially replicate expert MDT recommendations in colorectal cancer. Their integration into clinical workflows should aim to complement, rather than replace, human expertise.
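The evaluation described in the Methods (keep only cases where all three runs of a model agree, then compare the retained labels with the MDT consensus using percentage agreement and Cohen's kappa) can be sketched roughly as follows. This is an illustrative sketch only, not the authors' code: the function names, the two treatment labels and the five toy cases are hypothetical and are not taken from the study.

```python
from collections import Counter


def consistent_label(runs):
    """Return the shared label if every run agrees, otherwise None."""
    return runs[0] if len(set(runs)) == 1 else None


def cohen_kappa(a, b):
    """Cohen's kappa for two equal-length lists of categorical labels."""
    n = len(a)
    observed = sum(x == y for x, y in zip(a, b)) / n
    count_a, count_b = Counter(a), Counter(b)
    # Chance agreement: product of the two raters' marginal proportions per label.
    expected = sum(count_a[k] * count_b[k] for k in count_a.keys() | count_b.keys()) / (n * n)
    if expected == 1.0:
        return 1.0  # both raters used a single identical label throughout
    return (observed - expected) / (1 - expected)


# Hypothetical toy data: three runs per case for one model, plus the MDT decision.
model_runs = [
    ["surgery", "surgery", "surgery"],
    ["chemo", "chemo", "chemo"],
    ["chemo", "surgery", "chemo"],  # runs disagree, so this case is excluded
    ["surgery", "surgery", "surgery"],
    ["chemo", "chemo", "chemo"],
]
mdt = ["surgery", "chemo", "chemo", "chemo", "chemo"]

# Keep only cases with identical output across all three runs.
kept = [
    (label, expert)
    for label, expert in ((consistent_label(r), m) for r, m in zip(model_runs, mdt))
    if label is not None
]
model_labels = [label for label, _ in kept]
mdt_labels = [expert for _, expert in kept]

agreement = sum(label == expert for label, expert in kept) / len(kept)
kappa = cohen_kappa(model_labels, mdt_labels)
print(f"cases kept: {len(kept)}, agreement: {agreement:.1%}, kappa: {kappa:.3f}")
```

Reporting kappa alongside raw agreement matters because kappa discounts the agreement expected by chance when a few treatment categories dominate.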

Authors (10)

Lei Huang

Chen Wu

Tingting Sun

Xiaotong Hou

Aiwen Wu

Dawei Li

Boyang Qu

Longhao Cao

Yongjiu Chen

Junpeng Pei

Citation Format

Huang, L., Wu, C., Sun, T., Hou, X., Wu, A., Li, D. et al. (2026). Comparison of large language models and expert multidisciplinary team decisions in colorectal cancer. https://doi.org/10.1136/bmjhci-2025-101780

Journal Information
Publication year: 2026
Source database: DOAJ
DOI: 10.1136/bmjhci-2025-101780
Access: Open Access