{"results":[{"id":"ss_2d9597a69e58b27ffb8531abb885da62432666d3","title":"Artificial intelligence and deep learning in ophthalmology","authors":[{"name":"D. Ting"},{"name":"L. Pasquale"},{"name":"L. Peng"},{"name":"J. P. Campbell"},{"name":"Aaron Y. Lee"},{"name":"R. Raman"},{"name":"G. Tan"},{"name":"L. Schmetterer"},{"name":"P. Keane"},{"name":"T. Wong"}],"abstract":"Artificial intelligence (AI) based on deep learning (DL) has sparked tremendous global interest in recent years. DL has been widely adopted in image recognition, speech recognition and natural language processing, but is only beginning to impact on healthcare. In ophthalmology, DL has been applied to fundus photographs, optical coherence tomography and visual fields, achieving robust classification performance in the detection of diabetic retinopathy and retinopathy of prematurity, the glaucoma-like disc, macular oedema and age-related macular degeneration. DL in ocular imaging may be used in conjunction with telemedicine as a possible solution to screen, diagnose and monitor major eye diseases for patients in primary care and community settings. Nonetheless, there are also potential challenges with DL application in ophthalmology, including clinical and technical challenges, explainability of the algorithm results, medicolegal issues, and physician and patient acceptance of the AI ‘black-box’ algorithms. DL could potentially revolutionise how ophthalmology is practised in the future. 
This review provides a summary of the state-of-the-art DL systems described for ophthalmic applications, potential challenges in clinical deployment and the path forward.","source":"Semantic Scholar","year":2018,"language":"en","subjects":["Medicine"],"doi":"10.1136/bjophthalmol-2018-313173","url":"https://www.semanticscholar.org/paper/2d9597a69e58b27ffb8531abb885da62432666d3","pdf_url":"https://bjo.bmj.com/content/bjophthalmol/103/2/167.full.pdf","is_open_access":true,"citations":1209,"published_at":"","score":92},{"id":"ss_35cdc00a4e2bc1f6c253a34a5e3a6f697050d1e9","title":"Evaluating the Performance of ChatGPT in Ophthalmology","authors":[{"name":"F. Antaki"},{"name":"Samir Touma"},{"name":"D. Milad"},{"name":"J. El-Khoury"},{"name":"R. Duval"}],"abstract":"We tested the accuracy of ChatGPT, a large language model (LLM), in the ophthalmology question-answering space using two popular multiple choice question banks used for the high-stakes Ophthalmic Knowledge Assessment Program (OKAP) exam. The testing sets were of easy-to-moderate difficulty and were diversified, including recall, interpretation, practical and clinical decision-making problems. ChatGPT achieved 55.8% and 42.7% accuracy in the two 260-question simulated exams. Its performance varied across subspecialties, with the best results in general medicine and the worst in neuro-ophthalmology and ophthalmic pathology and intraocular tumors. 
These results are encouraging but suggest that specialising LLMs through domain-specific pre-training may be necessary to improve their performance in ophthalmic subspecialties.","source":"Semantic Scholar","year":2023,"language":"en","subjects":["Medicine"],"doi":"10.1101/2023.01.22.23284882","url":"https://www.semanticscholar.org/paper/35cdc00a4e2bc1f6c253a34a5e3a6f697050d1e9","pdf_url":"http://www.ophthalmologyscience.org/article/S2666914523000568/pdf","is_open_access":true,"citations":444,"published_at":"","score":80.32},{"id":"ss_be19cc6c4d147ced1f869ebc0b4a644029d87457","title":"Digital technology, tele-medicine and artificial intelligence in ophthalmology: A global perspective","authors":[{"name":"Ji-Peng Olivia Li"},{"name":"Hanruo Liu"},{"name":"D. S. Ting"},{"name":"S. Jeon"},{"name":"R. Chan"},{"name":"Judy E. Kim"},{"name":"Dawn Sim"},{"name":"Peter Thomas"},{"name":"Haotian Lin"},{"name":"Youxin Chen"},{"name":"Taiji Sakomoto"},{"name":"A. Loewenstein"},{"name":"Dennis S. C. Lam"},{"name":"L. Pasquale"},{"name":"Tien Yin Wong"},{"name":"L. Lam"},{"name":"Daniel S. W. Ting"}],"abstract":"The simultaneous maturation of multiple digital and telecommunications technologies in 2020 has created an unprecedented opportunity for ophthalmology to adapt to new models of care using tele-health supported by digital innovations. These digital innovations include artificial intelligence (AI), 5th generation (5G) telecommunication networks and the Internet of Things (IoT), creating an inter-dependent ecosystem offering opportunities to develop new models of eye care addressing the challenges of COVID-19 and beyond. Ophthalmology has thrived in some of these areas partly due to its many image-based investigations. Tele-health and AI provide synchronous solutions to challenges facing ophthalmologists and healthcare providers worldwide. 
This article reviews how countries across the world have utilised these digital innovations to tackle diabetic retinopathy, retinopathy of prematurity, age-related macular degeneration, glaucoma, refractive error correction, cataract and other anterior segment disorders. The review summarises the digital strategies that countries are developing and discusses technologies that may increasingly enter the clinical workflow and processes of ophthalmologists. Furthermore as countries around the world have initiated a series of escalating containment and mitigation measures during the COVID-19 pandemic, the delivery of eye care services globally has been significantly impacted. As ophthalmic services adapt and form a “new normal”, the rapid adoption of some of telehealth and digital innovation during the pandemic is also discussed. Finally, challenges for validation and clinical implementation are considered, as well as recommendations on future directions.","source":"Semantic Scholar","year":2020,"language":"en","subjects":["Business","Medicine"],"doi":"10.1016/j.preteyeres.2020.100900","url":"https://www.semanticscholar.org/paper/be19cc6c4d147ced1f869ebc0b4a644029d87457","pdf_url":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7474840","is_open_access":true,"citations":501,"published_at":"","score":79.03},{"id":"ss_367d927b98fe9ea2aec89486dc89cacfc72bfb70","title":"Targeting angiogenesis in oncology, ophthalmology and beyond","authors":[{"name":"Yihai Cao"},{"name":"R. Langer"},{"name":"N. 
Ferrara"}],"abstract":"","source":"Semantic Scholar","year":2023,"language":"en","subjects":["Medicine"],"doi":"10.1038/s41573-023-00671-z","url":"https://www.semanticscholar.org/paper/367d927b98fe9ea2aec89486dc89cacfc72bfb70","is_open_access":true,"citations":225,"published_at":"","score":73.75},{"id":"ss_25696a83ed76a5f93e3301034269cb1f0411a387","title":"Artificial intelligence in ophthalmology: The path to the real-world clinic","authors":[{"name":"Zhongwen Li"},{"name":"Lei Wang"},{"name":"Xuefang Wu"},{"name":"Jiewei Jiang"},{"name":"Wei Qiang"},{"name":"He Xie"},{"name":"Hongjian Zhou"},{"name":"Shanjun Wu"},{"name":"Yi Shao"},{"name":"Wei Chen"}],"abstract":"Summary Artificial intelligence (AI) has great potential to transform healthcare by enhancing the workflow and productivity of clinicians, enabling existing staff to serve more patients, improving patient outcomes, and reducing health disparities. In the field of ophthalmology, AI systems have shown performance comparable with or even better than experienced ophthalmologists in tasks such as diabetic retinopathy detection and grading. However, despite these promising results, very few AI systems have been deployed in real-world clinical settings, calling into question the true value of these systems. 
This review provides an overview of the current main AI applications in ophthalmology, describes the challenges that need to be overcome prior to clinical implementation of the AI systems, and discusses the strategies that may pave the way to the clinical translation of these systems.","source":"Semantic Scholar","year":2023,"language":"en","subjects":["Medicine"],"doi":"10.1016/j.xcrm.2023.101095","url":"https://www.semanticscholar.org/paper/25696a83ed76a5f93e3301034269cb1f0411a387","pdf_url":"http://www.cell.com/article/S2666379123002148/pdf","is_open_access":true,"citations":146,"published_at":"","score":71.38},{"id":"ss_9ca1eddc6f6ff8881da6492ea7e9b383e04c2276","title":"ChatGPT and Ophthalmology: Exploring Its Potential with Discharge Summaries and Operative Notes","authors":[{"name":"Swati Singh"},{"name":"A. Djalilian"},{"name":"M. Ali"}],"abstract":"ABSTRACT Purpose This study aimed to report the abilities of the large language model ChatGPT (OpenAI, San Francisco, USA) in constructing ophthalmic discharge summaries and operative notes. Methods A set of prompts was constructed through statements incorporating common ophthalmic surgeries across the subspecialties of the cornea, retina, glaucoma, paediatric ophthalmology, neuro-ophthalmology, and ophthalmic plastics surgery. The responses of ChatGPT were carefully assessed and analyzed by three surgeons for evidence-based content, specificity of the response, presence of generic text, disclaimers, factual inaccuracies, and its abilities to admit mistakes and challenge incorrect premises. Results A total of 24 prompts were presented to ChatGPT. Twelve prompts assessed its ability to construct discharge summaries, and an equal number explored the potential for preparing operative notes. The response was found to be tailored based on the quality of inputs given and was provided in a matter of seconds. The ophthalmic discharge summaries had valid content but contained significant generic text. 
ChatGPT could incorporate specific medications, follow-up instructions, consultation time, and location within the discharge summaries when prompted appropriately. While the operative notes were detailed, they required significant tuning. ChatGPT routinely admits its mistakes and corrects itself immediately when confronted with factual inaccuracies. The mistakes are avoided in subsequent reports when given similar prompts. Conclusion The performance of ChatGPT in the context of ophthalmic discharge summaries and operative notes was encouraging. These are constructed rapidly in a matter of seconds. Focused training of ChatGPT on these issues with inclusion of a human verification step has an enormous potential to impact healthcare positively.","source":"Semantic Scholar","year":2023,"language":"en","subjects":["Medicine"],"doi":"10.1080/08820538.2023.2209166","url":"https://www.semanticscholar.org/paper/9ca1eddc6f6ff8881da6492ea7e9b383e04c2276","is_open_access":true,"citations":131,"published_at":"","score":70.93},{"id":"ss_7ef0cc95ff71c1098414cd61e88ac373ea2db4c4","title":"Performance of Generative Large Language Models on Ophthalmology Board Style Questions.","authors":[{"name":"Louis Z Cai"},{"name":"Abdulla R. Shaheen"},{"name":"Andrew C. Jin"},{"name":"Riya Fukui"},{"name":"Jonathan S. Yi"},{"name":"Nicolas A. Yannuzzi"},{"name":"C. Alabiad"}],"abstract":"PURPOSE To investigate the ability of generative artificial intelligence models to answer ophthalmology board style questions DESIGN: Experimental study. METHODS This study evaluated three large language models (LLMs) with chat interfaces, Bing Chat (Microsoft) and ChatGPT 3.5 and 4.0 (OpenAI), using 250 questions from the Basic Science and Clinical Science (BCSC) Self-Assessment Program (SAP). While ChatGPT is trained on information last updated in 2021, Bing Chat incorporates more recently indexed internet search to generate its answers. Performance was compared to human respondents. 
Questions were categorized by complexity and patient care phase, and instances of information fabrication or non-logical reasoning were documented. MAIN OUTCOME MEASURES Primary outcome: response accuracy. SECONDARY OUTCOMES performance in question subcategories and hallucination frequency. RESULTS Human respondents had an average accuracy of 72.2%. ChatGPT-3.5 scored the lowest (58.8%), while ChatGPT-4.0 (71.6%) and Bing Chat (71.2%) performed comparably. ChatGPT-4.0 excelled in workup-type questions (OR = 3.89, 95% CI 1.19-14.73, p = 0.03) compared with diagnostic questions, but struggled with image interpretation (OR = 0.14, 95% CI 0.05-0.33, p \u003c 0.01) when compared with single step reasoning questions. Against single step questions, Bing Chat also faced difficulties with image interpretation (OR = 0.18, 95% CI 0.08-0.44, p \u003c 0.01) and multi-step reasoning (OR = 0.30, 95% CI 0.11-0.84, p = 0.02). ChatGPT-3.5 had the highest rate of hallucinations or non-logical reasoning (42.4%), followed by ChatGPT-4.0 (18.0%) and Bing Chat (25.6%). CONCLUSIONS LLMs (particularly ChatGPT-4.0 and Bing Chat) can perform similarly with human respondents answering questions from the BCSC SAP. The frequency of hallucinations and non-logical reasoning suggest room for improvement in the performance of conversational agents in the medical domain.","source":"Semantic Scholar","year":2023,"language":"en","subjects":["Medicine"],"doi":"10.1016/j.ajo.2023.05.024","url":"https://www.semanticscholar.org/paper/7ef0cc95ff71c1098414cd61e88ac373ea2db4c4","is_open_access":true,"citations":118,"published_at":"","score":70.53999999999999},{"id":"ss_e191380320f6514e1a1e065d967455670c187c06","title":"Artificial Intelligence in Ophthalmology: A Comparative Analysis of GPT-3.5, GPT-4, and Human Expertise in Answering StatPearls Questions","authors":[{"name":"M. Moshirfar"},{"name":"Amal W Altaf"},{"name":"Isabella M. Stoakes"},{"name":"Jared J. Tuttle"},{"name":"P. 
Hoopes"}],"abstract":"Importance Chat Generative Pre-Trained Transformer (ChatGPT) has shown promising performance in various fields, including medicine, business, and law, but its accuracy in specialty-specific medical questions, particularly in ophthalmology, is still uncertain. Purpose This study evaluates the performance of two ChatGPT models (GPT-3.5 and GPT-4) and human professionals in answering ophthalmology questions from the StatPearls question bank, assessing their outcomes, and providing insights into the integration of artificial intelligence (AI) technology in ophthalmology. Methods ChatGPT's performance was evaluated using 467 ophthalmology questions from the StatPearls question bank. These questions were stratified into 11 subcategories, four difficulty levels, and three generalized anatomical categories. The answer accuracy of GPT-3.5, GPT-4, and human participants was assessed. Statistical analysis was conducted via the Kolmogorov-Smirnov test for normality, one-way analysis of variance (ANOVA) for the statistical significance of GPT-3 versus GPT-4 versus human performance, and repeated unpaired two-sample t-tests to compare the means of two groups. Results GPT-4 outperformed both GPT-3.5 and human professionals on ophthalmology StatPearls questions, except in the \"Lens and Cataract\" category. The performance differences were statistically significant overall, with GPT-4 achieving higher accuracy (73.2%) compared to GPT-3.5 (55.5%, p-value \u003c 0.001) and humans (58.3%, p-value \u003c 0.001). There were variations in performance across difficulty levels (rated one to four), but GPT-4 consistently performed better than both GPT-3.5 and humans on level-two, -three, and -four questions. On questions of level-four difficulty, human performance significantly exceeded that of GPT-3.5 (p = 0.008). 
Conclusion The study's findings demonstrate GPT-4's significant performance improvements over GPT-3.5 and human professionals on StatPearls ophthalmology questions. Our results highlight the potential of advanced conversational AI systems to be utilized as important tools in the education and practice of medicine.","source":"Semantic Scholar","year":2023,"language":"en","subjects":["Medicine"],"doi":"10.7759/cureus.40822","url":"https://www.semanticscholar.org/paper/e191380320f6514e1a1e065d967455670c187c06","pdf_url":"https://assets.cureus.com/uploads/original_article/pdf/164004/20230622-12768-13bvy5b.pdf","is_open_access":true,"citations":116,"published_at":"","score":70.47999999999999},{"id":"ss_9ba72fc34851c330f94b156def9e109f2c30d008","title":"Large language models and their impact in ophthalmology.","authors":[{"name":"B. Betzler"},{"name":"Haichao Chen"},{"name":"Ching-Yu Cheng"},{"name":"Cecilia S. Lee"},{"name":"Guochen Ning"},{"name":"Su Jeong Song"},{"name":"Aaron Y. Lee"},{"name":"Ryo Kawasaki"},{"name":"Peter van Wijngaarden"},{"name":"Andrzej Grzybowski"},{"name":"Mingguang He"},{"name":"Dawei Li"},{"name":"An Ran Ran"},{"name":"D. Ting"},{"name":"K. Teo"},{"name":"Paisan Ruamviboonsuk"},{"name":"S. Sivaprasad"},{"name":"V. Chaudhary"},{"name":"R. Tadayoni"},{"name":"Xiaofei Wang"},{"name":"C. Y. Cheung"},{"name":"Yingfeng Zheng"},{"name":"Ya Xing Wang"},{"name":"Y. Tham"},{"name":"T. Y. Wong"}],"abstract":"The advent of generative artificial intelligence and large language models has ushered in transformative applications within medicine. Specifically in ophthalmology, large language models offer unique opportunities to revolutionise digital eye care, address clinical workflow inefficiencies, and enhance patient experiences across diverse global eye care landscapes. Yet alongside these prospects lie tangible and ethical challenges, encompassing data privacy, security, and the intricacies of embedding large language models into clinical routines. 
This Viewpoint highlights the promising applications of large language models in ophthalmology, while weighing up the practical and ethical barriers towards their real-world implementation. This Viewpoint seeks to stimulate broader discourse on the potential of large language models in ophthalmology and to galvanise both clinicians and researchers into tackling the prevailing challenges and optimising the benefits of large language models while curtailing the associated risks.","source":"Semantic Scholar","year":2023,"language":"en","subjects":["Medicine"],"doi":"10.1016/s2589-7500(23)00201-7","url":"https://www.semanticscholar.org/paper/9ba72fc34851c330f94b156def9e109f2c30d008","pdf_url":"http://www.thelancet.com/article/S2589750023002017/pdf","is_open_access":true,"citations":112,"published_at":"","score":70.36},{"id":"arxiv_2603.23953","title":"VOLMO: Versatile and Open Large Models for Ophthalmology","authors":[{"name":"Zhenyue Qin"},{"name":"Younjoon Chung"},{"name":"Elijah Lee"},{"name":"Wanyue Feng"},{"name":"Xuguang Ai"},{"name":"Serina Applebaum"},{"name":"Minjie Zou"},{"name":"Yang Liu"},{"name":"Pan Xiao"},{"name":"Mac Singer"},{"name":"Amisha Dave"},{"name":"Aidan Gilson"},{"name":"Tiarnan D. L. Keenan"},{"name":"Emily Y. Chew"},{"name":"Zhiyong Lu"},{"name":"Yih-Chung Tham"},{"name":"Ron Adelman"},{"name":"Luciano V. Del Priore"},{"name":"Qingyu Chen"}],"abstract":"Vision impairment affects millions globally, and early detection is critical to preventing irreversible vision loss. Ophthalmology workflows require clinicians to integrate medical images, structured clinical data, and free-text notes to determine disease severity and management, which is time-consuming and burdensome. Recent multimodal large language models (MLLMs) show promise, but existing general and medical MLLMs perform poorly in ophthalmology, and few ophthalmology-specific MLLMs are openly available. 
We present VOLMO (Versatile and Open Large Models for Ophthalmology), a model-agnostic, data-open framework for developing ophthalmology-specific MLLMs. VOLMO includes three stages: ophthalmology knowledge pretraining on 86,965 image-text pairs from 26,569 articles across 82 journals; domain task fine-tuning on 26,929 annotated instances spanning 12 eye conditions for disease screening and severity classification; and multi-step clinical reasoning on 913 patient case reports for assessment, planning, and follow-up care. Using this framework, we trained a compact 2B-parameter MLLM and compared it with strong baselines, including InternVL-2B, LLaVA-Med-7B, MedGemma-4B, MedGemma-27B, and RETFound. We evaluated these models on image description generation, disease screening and staging classification, and assessment-and-management generation, with additional manual review by two healthcare professionals and external validation on three independent cohorts for age-related macular degeneration and diabetic retinopathy. Across settings, VOLMO-2B consistently outperformed baselines, achieving stronger image description performance, an average F1 of 87.4% across 12 eye conditions, and higher scores in external validation.","source":"arXiv","year":2026,"language":"en","subjects":["cs.CV","cs.ET"],"url":"https://arxiv.org/abs/2603.23953","pdf_url":"https://arxiv.org/pdf/2603.23953","is_open_access":true,"published_at":"2026-03-25T05:25:10Z","score":70},{"id":"ss_663fbe7bf9e99a1489772981d72e8097d70b5853","title":"Ophthalmology Workforce Projections in the United States, 2020-2035.","authors":[{"name":"S. Berkowitz"},{"name":"Avni P. Finn"},{"name":"Ravi Parikh"},{"name":"Ajay E. Kuriyan"},{"name":"Shriji N. 
Patel"}],"abstract":"OBJECTIVE To analyze ophthalmology workforce supply and demand projections from 2020 to 2035 DESIGN: Observational cohort study using data from the National Center for Health Workforce Analysis (NCHWA) METHODS: Data accessed from the Department of Health and Human Services, Health Resources and Services Administration (HRSA) website were compiled to analyze the workforce supply and demand projections for ophthalmologists from 2020 to 2035. RESULTS From 2020 to 2035, the total ophthalmology supply is projected to decrease by 2,650 full-time equivalent ophthalmologists (FTE) (12% decline), and total demand is projected to increase by 5,150 FTE (24% increase), representing a supply and demand mismatch of 30% workforce inadequacy. The level of inadequacy was markedly different based on rurality by year 2035 with 77% compared to 29% workforce adequacy in metro and nonmetro geographies, respectively. By year 2035, ophthalmology is projected to have the second lowest rate of workforce adequacy (70%) out of 38 medical and surgical specialties studied. CONCLUSIONS The HRSA's Health Workforce Simulation Model forecasts a sizeable shortage of ophthalmology supply relative to demand by year 2035, with substantial geographic disparities. Ophthalmology is one of the medical specialties with the lowest rate of projected workforce adequacy by 2035. 
Further dedicated workforce supply and demand research for ophthalmology and allied professionals is needed to validate these projections and may have significant future implications for patients and providers.","source":"Semantic Scholar","year":2023,"language":"en","subjects":["Medicine"],"doi":"10.1016/j.ophtha.2023.09.018","url":"https://www.semanticscholar.org/paper/663fbe7bf9e99a1489772981d72e8097d70b5853","is_open_access":true,"citations":88,"published_at":"","score":69.64},{"id":"ss_2054fbdd087092b54a4b7b1f3cb07d467ab1b0d5","title":"Optical Coherence Tomography (OCT): A Brief Look at the Uses and Technological Evolution of Ophthalmology","authors":[{"name":"Marco Zeppieri"},{"name":"Stefania Marsili"},{"name":"E. Enaholo"},{"name":"A. Shuaibu"},{"name":"Ngozi Uwagboe"},{"name":"C. Salati"},{"name":"Leopoldo Spadea"},{"name":"M. Musa"}],"abstract":"Medical imaging is the mainstay of clinical diagnosis and management. Optical coherence tomography (OCT) is a non-invasive imaging technology that has revolutionized the field of ophthalmology. Since its introduction, OCT has undergone significant improvements in image quality, speed, and resolution, making it an essential diagnostic tool for various ocular pathologies. OCT has not only improved the diagnosis and management of ocular diseases but has also found applications in other fields of medicine. In this manuscript, we provide a brief overview of the history of OCT, its current uses and diagnostic capabilities to assess the posterior segment of the eye, and the evolution of this technology from time-domain (TD) to spectral-domain (SD) and swept-source (SS). 
This brief review will also discuss the limitations, advantages, disadvantages, and future perspectives of this technology in the field of ophthalmology.","source":"Semantic Scholar","year":2023,"language":"en","subjects":["Medicine"],"doi":"10.3390/medicina59122114","url":"https://www.semanticscholar.org/paper/2054fbdd087092b54a4b7b1f3cb07d467ab1b0d5","pdf_url":"https://www.mdpi.com/1648-9144/59/12/2114/pdf?version=1701591297","is_open_access":true,"citations":88,"published_at":"","score":69.64},{"id":"ss_2cb4f403104831abb7bbea43067ed7420e4b4b4c","title":"Capabilities of GPT-4 in ophthalmology: an analysis of model entropy and progress towards human-level medical question answering","authors":[{"name":"F. Antaki"},{"name":"D. Milad"},{"name":"Mark A. Chia"},{"name":"Charles-Édouard Giguère"},{"name":"Samir Touma"},{"name":"J. El-Khoury"},{"name":"P. Keane"},{"name":"R. Duval"}],"abstract":"Background Evidence on the performance of Generative Pre-trained Transformer 4 (GPT-4), a large language model (LLM), in the ophthalmology question-answering domain is needed. Methods We tested GPT-4 on two 260-question multiple choice question sets from the Basic and Clinical Science Course (BCSC) Self-Assessment Program and the OphthoQuestions question banks. We compared the accuracy of GPT-4 models with varying temperatures (creativity setting) and evaluated their responses in a subset of questions. We also compared the best-performing GPT-4 model to GPT-3.5 and to historical human performance. Results GPT-4–0.3 (GPT-4 with a temperature of 0.3) achieved the highest accuracy among GPT-4 models, with 75.8% on the BCSC set and 70.0% on the OphthoQuestions set. The combined accuracy was 72.9%, which represents an 18.3% raw improvement in accuracy compared with GPT-3.5 (p\u003c0.001). Human graders preferred responses from models with a temperature higher than 0 (more creative). 
Exam section, question difficulty and cognitive level were all predictive of GPT-4-0.3 answer accuracy. GPT-4-0.3’s performance was numerically superior to human performance on the BCSC (75.8% vs 73.3%) and OphthoQuestions (70.0% vs 63.0%), but the difference was not statistically significant (p=0.55 and p=0.09). Conclusion GPT-4, an LLM trained on non-ophthalmology-specific data, performs significantly better than its predecessor on simulated ophthalmology board-style exams. Remarkably, its performance tended to be superior to historical human performance, but that difference was not statistically significant in our study.","source":"Semantic Scholar","year":2023,"language":"en","subjects":["Medicine"],"doi":"10.1136/bjo-2023-324438","url":"https://www.semanticscholar.org/paper/2cb4f403104831abb7bbea43067ed7420e4b4b4c","is_open_access":true,"citations":80,"published_at":"","score":69.4},{"id":"ss_62e0e0d0235e0fab45a3a308770e72d0ec029baf","title":"Gemini AI vs. ChatGPT: A comprehensive examination alongside ophthalmology residents in medical knowledge","authors":[{"name":"D. Bahir"},{"name":"O. Zur"},{"name":"L. Attal"},{"name":"Z. Nujeidat"},{"name":"Ariela Knaanie"},{"name":"Joseph Pikkel"},{"name":"Michael Mimouni"},{"name":"G. Plopsky"}],"abstract":"","source":"Semantic Scholar","year":2024,"language":"en","subjects":["Medicine"],"doi":"10.1007/s00417-024-06625-4","url":"https://www.semanticscholar.org/paper/62e0e0d0235e0fab45a3a308770e72d0ec029baf","is_open_access":true,"citations":45,"published_at":"","score":69.35},{"id":"ss_8c0e6cada1b0093eb9282b8b70a6deb7e3f42e1a","title":"Assessing the medical reasoning skills of GPT-4 in complex ophthalmology cases","authors":[{"name":"D. Milad"},{"name":"F. Antaki"},{"name":"Jason Milad"},{"name":"Andrew Farah"},{"name":"T. Khairy"},{"name":"D. Mikhail"},{"name":"Charles-Édouard Giguère"},{"name":"Samir Touma"},{"name":"Allison Bernstein"},{"name":"Andrei-Alexandru Szigiato"},{"name":"Taylor Nayman"},{"name":"G. 
Mullie"},{"name":"R. Duval"}],"abstract":"Background/aims This study assesses the proficiency of Generative Pre-trained Transformer (GPT)-4 in answering questions about complex clinical ophthalmology cases. Methods We tested GPT-4 on 422 Journal of the American Medical Association Ophthalmology Clinical Challenges, and prompted the model to determine the diagnosis (open-ended question) and identify the next-step (multiple-choice question). We generated responses using two zero-shot prompting strategies, including zero-shot plan-and-solve+ (PS+), to improve the reasoning of the model. We compared the best-performing model to human graders in a benchmarking effort. Results Using PS+ prompting, GPT-4 achieved mean accuracies of 48.0% (95% CI (43.1% to 52.9%)) and 63.0% (95% CI (58.2% to 67.6%)) in diagnosis and next step, respectively. Next-step accuracy did not significantly differ by subspecialty (p=0.44). However, diagnostic accuracy in pathology and tumours was significantly higher than in uveitis (p=0.027). When the diagnosis was accurate, 75.2% (95% CI (68.6% to 80.9%)) of the next steps were correct. Conversely, when the diagnosis was incorrect, 50.2% (95% CI (43.8% to 56.6%)) of the next steps were accurate. The next step was three times more likely to be accurate when the initial diagnosis was correct (p\u003c0.001). No significant differences were observed in diagnostic accuracy and decision-making between board-certified ophthalmologists and GPT-4. Among trainees, senior residents outperformed GPT-4 in diagnostic accuracy (p≤0.001 and 0.049) and in accuracy of next step (p=0.002 and 0.020). Conclusion Improved prompting enhances GPT-4’s performance in complex clinical situations, although it does not surpass ophthalmology trainees in our context. 
Specialised large language models hold promise for future assistance in medical decision-making and diagnosis.","source":"Semantic Scholar","year":2024,"language":"en","subjects":["Medicine"],"doi":"10.1136/bjo-2023-325053","url":"https://www.semanticscholar.org/paper/8c0e6cada1b0093eb9282b8b70a6deb7e3f42e1a","pdf_url":"https://discovery.ucl.ac.uk/10189698/1/Antaki_ChatGPT%20JAMA%20Opht%20revision-final%20clean.pdf","is_open_access":true,"citations":45,"published_at":"","score":69.35},{"id":"ss_c20188e14c7534a0240dd3d9dc453bc35012ffe2","title":"Development and Evaluation of a Retrieval-Augmented Large Language Model Framework for Ophthalmology","authors":[{"name":"Ming-Jie Luo"},{"name":"Jianyu Pang"},{"name":"Shaowei Bi"},{"name":"Yunxi Lai"},{"name":"Jiaman Zhao"},{"name":"Yuanrui Shang"},{"name":"Tingxin Cui"},{"name":"Yahan Yang"},{"name":"Zhenzhe Lin"},{"name":"Lanqin Zhao"},{"name":"Xiaohang Wu"},{"name":"Duoru Lin"},{"name":"Jingjing Chen"},{"name":"Haotian Lin"}],"abstract":"This quality improvement study discusses the challenges of knowledge inaccuracies and data privacy issues when using large language models in ophthalmology and how to overcome them.","source":"Semantic Scholar","year":2024,"language":"en","subjects":["Medicine"],"doi":"10.1001/jamaophthalmol.2024.2513","url":"https://www.semanticscholar.org/paper/c20188e14c7534a0240dd3d9dc453bc35012ffe2","pdf_url":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11258636","is_open_access":true,"citations":41,"published_at":"","score":69.22999999999999},{"id":"ss_cba7ce74fd7ec9c2121c1abb830ef6bfa39fe247","title":"Generative artificial intelligence in ophthalmology.","authors":[{"name":"E. Waisberg"},{"name":"J. Ong"},{"name":"S. Kamran"},{"name":"M. Masalkhi"},{"name":"Phani Paladugu"},{"name":"N. Zaman"},{"name":"Andrew G Lee"},{"name":"Alireza Tavakkoli"}],"abstract":"Generative AI has revolutionized medicine over the past several years. 
A generative adversarial network (GAN) is a deep learning framework that has become a powerful technique in medicine, particularly in ophthalmology and image analysis. In this paper we review the current ophthalmic literature involving GANs, and highlight key contributions in the field. We briefly touch on ChatGPT, another application of generative AI, and its potential in ophthalmology. We also explore the potential uses for GANs in ocular imaging, with a specific emphasis on 3 primary domains: image enhancement, disease identification, and generation of synthetic data. PubMed, Ovid MEDLINE, and Google Scholar were searched from inception to October 30, 2022 to identify applications of GAN in ophthalmology. A total of 40 papers were included in this review. We cover various applications of GANs in ophthalmic-related imaging including optical coherence tomography, orbital magnetic resonance imaging, fundus photography, and ultrasound; however, we also highlight several challenges that resulted in the generation of inaccurate and atypical results during certain iterations. Finally, we examine future directions and considerations for generative AI in ophthalmology.","source":"Semantic Scholar","year":2024,"language":"en","subjects":["Medicine"],"doi":"10.1016/j.survophthal.2024.04.009","url":"https://www.semanticscholar.org/paper/cba7ce74fd7ec9c2121c1abb830ef6bfa39fe247","is_open_access":true,"citations":38,"published_at":"","score":69.14},{"id":"ss_fde04bbf5106a26d2517854b2d359f7f98d4be54","title":"Foundation models in ophthalmology","authors":[{"name":"Mark A. Chia"},{"name":"F. Antaki"},{"name":"Yukun Zhou"},{"name":"A. Turner"},{"name":"Aaron Y Lee"},{"name":"P. Keane"}],"abstract":"Foundation models represent a paradigm shift in artificial intelligence (AI), evolving from narrow models designed for specific tasks to versatile, generalisable models adaptable to a myriad of diverse applications. 
Ophthalmology as a specialty has the potential to act as an exemplar for other medical specialties, offering a blueprint for integrating foundation models broadly into clinical practice. This review hopes to serve as a roadmap for eyecare professionals seeking to better understand foundation models, while equipping readers with the tools to explore the use of foundation models in their own research and practice. We begin by outlining the key concepts and technological advances which have enabled the development of these models, providing an overview of novel training approaches and modern AI architectures. Next, we summarise existing literature on the topic of foundation models in ophthalmology, encompassing progress in vision foundation models, large language models and large multimodal models. Finally, we outline major challenges relating to privacy, bias and clinical validation, and propose key steps forward to maximise the benefit of this powerful technology.","source":"Semantic Scholar","year":2024,"language":"en","subjects":["Medicine"],"doi":"10.1136/bjo-2024-325459","url":"https://www.semanticscholar.org/paper/fde04bbf5106a26d2517854b2d359f7f98d4be54","pdf_url":"https://bjo.bmj.com/content/bjophthalmol/early/2024/06/04/bjo-2024-325459.full.pdf","is_open_access":true,"citations":35,"published_at":"","score":69.05},{"id":"ss_727b1e595d53bacdafc780ab22a5776fa341860a","title":"Diagnostic capabilities of ChatGPT in ophthalmology","authors":[{"name":"A. Shemer"},{"name":"Michael N. Cohen"},{"name":"Aya Altarescu"},{"name":"M. Atar-Vardi"},{"name":"Idan Hecht"},{"name":"B. Dubinsky-Pertzov"},{"name":"Nadav Shoshany"},{"name":"Sigal Zmujack"},{"name":"L. Or"},{"name":"A. Einan-Lifshitz"},{"name":"E. 
Pras"}],"abstract":"","source":"Semantic Scholar","year":2024,"language":"en","subjects":["Medicine"],"doi":"10.1007/s00417-023-06363-z","url":"https://www.semanticscholar.org/paper/727b1e595d53bacdafc780ab22a5776fa341860a","is_open_access":true,"citations":34,"published_at":"","score":69.02000000000001},{"id":"doaj_10.1186/s12874-025-02478-5","title":"Construction of the cancer patients’ database based on the US National Health and Nutrition Examination Survey (NHANES) datasets for cancer epidemiology research","authors":[{"name":"Jinyoung Moon"},{"name":"Yongseok Mun"}],"abstract":"Abstract Background The US National Health and Nutrition Examination Survey (NHANES) dataset does not include a specific question or laboratory test to confirm a history of cancer diagnosis. However, if straightforward variables for cancer history are introduced, US NHANES could be effectively utilized in future cancer epidemiology studies. To address this gap, the authors developed a cancer patient database from the US NHANES datasets by employing multiple R programming codes. Methods To illustrate the practical application of this methodology to a real-world problem, the authors extracted the R codes applied in an academic paper published in another journal on January 30th, 2024 ( https://doi.org/10.1016/j.heliyon.2024.e24337 ). This paper will focus on the construction of the database and analysis using R codes. Entire. Results In the first example, the urine concentration of monocarboxynonyl phthalate, monocarboxyoctyl phthalate, mono-2-ethyl-5-carboxypentyl phthalate, and mono-2-hydroxy-iso-butyl phthalate (all ng/mL) were used as the independent variable, instead of the serum concentration of perfluorooctanoic acid (PFOA), perfluorooctane sulfonic acid (PFOS), perfluorohexane sulfonic acid (PFHxS), and perfluorononanoic acid (PFNA), respectively. 
In the second example, the serum concentration of 2,3,3’,4,4’-Pentachlorobiphenyl (PCB105), 2,3,4,4’,5-Pentachlorobiphenyl (PCB114), 2,3’,4,4’,5-Pentachlorobiphenyl (PCB118), and 2,2’,3,4,4’,5’- and 2,3,3’,4,4’,6-Hexachlorobiphenyl (PCB138) were used as the independent variable, instead of the serum concentration of PFOA, PFOS, PFHxS, and PFNA, respectively. Discussion This research offers a comprehensive set of R codes aimed at creating a single, user-friendly variable that encapsulates the history of each type of cancer while also considering the age at which the diagnosis was made. The US NHANES provides a wealth of critical data on environmental toxicant exposures. By employing these R codes, researchers can potentially discover numerous new associations between environmental toxicant exposures and cancer diagnoses. Ultimately, these codes could significantly advance the field of cancer epidemiology in relation to environmental toxicant exposure.","source":"DOAJ","year":2025,"language":"","subjects":["Medicine (General)"],"doi":"10.1186/s12874-025-02478-5","url":"https://doi.org/10.1186/s12874-025-02478-5","is_open_access":true,"published_at":"","score":69}],"total":549286,"page":1,"page_size":20,"sources":["DOAJ","arXiv","Semantic Scholar","CrossRef"],"query":"Ophthalmology"}