Semantic Scholar Open Access 2023 444 citations

Evaluating the Performance of ChatGPT in Ophthalmology

F. Antaki, Samir Touma, D. Milad, J. El-Khoury, R. Duval

Abstract

We tested the accuracy of ChatGPT, a large language model (LLM), in the ophthalmology question-answering space using two popular multiple-choice question banks used for the high-stakes Ophthalmic Knowledge Assessment Program (OKAP) exam. The testing sets were of easy-to-moderate difficulty and were diversified, including recall, interpretation, practical and clinical decision-making problems. ChatGPT achieved 55.8% and 42.7% accuracy in the two 260-question simulated exams. Its performance varied across subspecialties, with the best results in general medicine and the worst in neuro-ophthalmology and in ophthalmic pathology and intraocular tumors. These results are encouraging but suggest that specialising LLMs through domain-specific pre-training may be necessary to improve their performance in ophthalmic subspecialties.

Authors (5)

F. Antaki

Samir Touma

D. Milad

J. El-Khoury

R. Duval

Citation Format

Antaki, F., Touma, S., Milad, D., El-Khoury, J., Duval, R. (2023). Evaluating the Performance of ChatGPT in Ophthalmology. https://doi.org/10.1101/2023.01.22.23284882

Quick Access

View at Source: doi.org/10.1101/2023.01.22.23284882
Journal Information
Publication Year
2023
Language
en
Total Citations
444×
Source Database
Semantic Scholar
DOI
10.1101/2023.01.22.23284882
Access
Open Access ✓