arXiv
Open Access
2025
Performance of Large Language Models in Answering Critical Care Medicine Questions
Mahmoud Alwakeel
Aditya Nagori
An-Kwok Ian Wong
Neal Chaisson
Vijay Krishnamoorthy
+1 more
Abstract
Large Language Models have been tested on medical student-level questions, but their performance in specialized fields like Critical Care Medicine (CCM) is less explored. This study evaluated Meta-Llama 3.1 models (8B and 70B parameters) on 871 CCM questions. Llama3.1:70B outperformed 8B by 30%, with 60% average accuracy. Performance varied across domains, highest in Research (68.4%) and lowest in Renal (47.9%), highlighting the need for broader future work to improve models across various subspecialty domains.
Authors (6)
Mahmoud Alwakeel
Aditya Nagori
An-Kwok Ian Wong
Neal Chaisson
Vijay Krishnamoorthy
Rishikesan Kamaleswaran
Journal Information
- Publication Year: 2025
- Language: en
- Source Database: arXiv
- Access: Open Access