DOAJ Open Access 2026

Expert-AI Concordance in Varicocele Management: How Reliable Is ChatGPT-4.0?

Fahri Yavuz İlki, Emre Bülbül, Yusuf Kadir Topçu, Selahattin Bedir

Abstract

Objective: Artificial intelligence (AI)-based large language models (LLMs), such as ChatGPT-4.0, are increasingly being considered for clinical decision support. However, their reliability in providing clinical recommendations for varicocele-related infertility has not been thoroughly evaluated. This study aimed to assess that reliability for ChatGPT-4.0.

Materials and Methods: A standardized clinical scenario was created involving a 32-year-old man with varicocele and oligoasthenoteratozoospermia, including findings from the physical examination, hormonal profile, and semen analysis based on the World Health Organization 6th edition criteria. Sixteen diagnostic and therapeutic questions were developed and submitted to ChatGPT-4.0. The AI-generated responses were reviewed by 24 experienced urologists specializing in varicocele management, who rated each recommendation on a 5-point Likert scale.

Results: The urologists' ratings showed 80.2% agreement, 10.7% disagreement, and 9.1% neutrality with the ChatGPT-4.0 recommendations. For 14 of the 16 questions, the majority of urologists either agreed or strongly agreed with ChatGPT-4.0. Consensus was highest for the recommendations on varicocelectomy indication, antioxidant use, female partner age over 35 years, follow-up after varicocelectomy, testosterone deficiency, and normospermic varicocele. Lower agreement rates were noted for microsurgical varicocelectomy (54.1%) and preoperative sperm cryopreservation (16.7%).

Conclusion: ChatGPT-4.0 gave reliable clinical recommendations in most scenarios related to varicocele treatment, showing strong agreement with expert clinicians. However, specific “gray zone” scenarios requiring individualized decision-making highlight its limitations and underscore the importance of experienced clinical judgment. ChatGPT-4.0 can serve as a reliable informational tool regarding varicocele treatment but should be used with caution in complex clinical decisions requiring personalized evaluation.
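The agreement, neutrality, and disagreement percentages reported in the Results can be illustrated with a minimal sketch of how 5-point Likert ratings might be collapsed into three categories. The numeric coding (1 = strongly disagree through 5 = strongly agree), the function name `summarize_likert`, and the sample ratings are all assumptions made for illustration; the abstract does not describe the authors' actual computation, and the sample values are not study data.

```python
from collections import Counter


def summarize_likert(ratings):
    """Collapse 1-5 Likert ratings into agree / neutral / disagree percentages.

    Assumed coding (not stated in the abstract):
    1 = strongly disagree, 2 = disagree, 3 = neutral,
    4 = agree, 5 = strongly agree.
    """
    if not ratings:
        raise ValueError("ratings must not be empty")

    buckets = Counter()
    for r in ratings:
        if r not in (1, 2, 3, 4, 5):
            raise ValueError(f"invalid Likert rating: {r!r}")
        if r >= 4:
            buckets["agree"] += 1      # agree or strongly agree
        elif r == 3:
            buckets["neutral"] += 1
        else:
            buckets["disagree"] += 1   # disagree or strongly disagree

    total = len(ratings)
    return {
        key: round(100 * buckets[key] / total, 1)
        for key in ("agree", "neutral", "disagree")
    }


# Hypothetical ratings from 24 raters for a single question (not study data).
example = [5] * 10 + [4] * 8 + [3] * 3 + [2] * 2 + [1] * 1
print(summarize_likert(example))
# {'agree': 75.0, 'neutral': 12.5, 'disagree': 12.5}
```

Applying the same function to the pooled ratings for all questions would give overall figures, while applying it per question would give question-level agreement rates of the kind quoted for individual recommendations.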

Authors (4)

Fahri Yavuz İlki

Emre Bülbül

Yusuf Kadir Topçu

Selahattin Bedir

Citation Format

İlki, F.Y., Bülbül, E., Topçu, Y.K., Bedir, S. (2026). Expert-AI Concordance in Varicocele Management: How Reliable Is ChatGPT-4.0? https://doi.org/10.4274/jus.galenos.2025.2025-7-19

Journal Information
Publication Year
2026
Source Database
DOAJ
DOI
10.4274/jus.galenos.2025.2025-7-19
Access
Open Access ✓