Fine-Tuning Methods for Large Language Models in Clinical Medicine by Supervised Fine-Tuning and Direct Preference Optimization: Comparative Evaluation
Abstract
Background: Large language model (LLM) fine-tuning is the process of adjusting the weights of an out-of-the-box model using a dataset of interest. Fine-tuning can be a powerful technique for improving model performance in fields such as medicine, where LLMs may perform poorly out of the box. The 2 common fine-tuning techniques are supervised fine-tuning (SFT) and direct preference optimization (DPO); however, little guidance is available on when to apply either method within clinical medicine or health care operations.
Objective: This study aims to investigate the benefits of fine-tuning with SFT and DPO across a range of core natural language tasks in medicine to better inform clinical informaticists about when either technique should be deployed.
Methods: We used Llama3 8B (Meta) and Mistral 7B v2 (Mistral AI) to compare the performance of SFT alone and SFT followed by DPO across 4 common natural language tasks in medicine: text classification, clinical reasoning, text summarization, and clinical triage.
Results: Clinical reasoning accuracy increased from 7% and 22% with base Llama3 and Mistral2, respectively, to 28% and 33% with SFT, and then to 36% and 40% with DPO.
Conclusions: SFT alone is sufficient for simple tasks such as rule-based text classification, while DPO after SFT improves performance on the more complex tasks of triage, clinical reasoning, and summarization. We postulate that SFT alone suffices for simple tasks because SFT strengthens simple word-association reasoning, whereas DPO enables deeper comprehension because it is trained on both positive and negative examples, allowing the model to recognize more complex patterns. Ultimately, our results help inform clinical informaticists when to deploy either fine-tuning method and encourage commercial LLM providers to offer DPO fine-tuning for proprietary LLMs commonly used in medicine.
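To make the two stages concrete, the sketch below shows how an SFT pass followed by a DPO pass could be wired up with the Hugging Face TRL library (SFTTrainer and DPOTrainer). It is a minimal illustration under assumed model names, placeholder data, and hyperparameters; it is not the authors' training code, and exact argument names differ slightly across TRL versions.

```python
# Minimal sketch of the two-stage pipeline described in the abstract: SFT, then DPO.
# Model names, placeholder data, and hyperparameters are assumptions for illustration.
from datasets import Dataset
from transformers import AutoModelForCausalLM, AutoTokenizer
from trl import DPOConfig, DPOTrainer, SFTConfig, SFTTrainer

model_name = "meta-llama/Meta-Llama-3-8B"  # or "mistralai/Mistral-7B-v0.2"
model = AutoModelForCausalLM.from_pretrained(model_name)
tokenizer = AutoTokenizer.from_pretrained(model_name)

# Stage 1: supervised fine-tuning on prompt + reference-answer text.
sft_data = Dataset.from_dict({
    "text": ["Question: <clinical vignette>\nAnswer: <reference answer>"],
})
sft_trainer = SFTTrainer(
    model=model,
    args=SFTConfig(output_dir="sft-out", num_train_epochs=1),
    train_dataset=sft_data,
    processing_class=tokenizer,
)
sft_trainer.train()

# Stage 2: DPO on preference triples, so the model is optimized against both a
# preferred ("chosen") and a dispreferred ("rejected") completion per prompt.
dpo_data = Dataset.from_dict({
    "prompt": ["Question: <clinical vignette>"],
    "chosen": ["<correct, well-reasoned answer>"],
    "rejected": ["<plausible but incorrect answer>"],
})
dpo_trainer = DPOTrainer(
    model=sft_trainer.model,  # start DPO from the SFT checkpoint
    args=DPOConfig(output_dir="dpo-out", beta=0.1),
    train_dataset=dpo_data,
    processing_class=tokenizer,
)
dpo_trainer.train()
```

The key design point, which mirrors the abstract's conclusion, is that SFT sees only reference answers, whereas DPO's preference triples expose the model to both a chosen and a rejected completion for each prompt.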
Topics & Keywords
Authors (7)
Thomas Savage
Stephen P Ma
Abdessalem Boukil
Ekanath Rangan
Vishwesh Patel
Ivan Lopez
Jonathan Chen
Quick Access
- Publication Year
- 2025
- Source Database
- DOAJ
- DOI
- 10.2196/76048
- Access
- Open Access ✓