arXiv Open Access 2025

MedOmni-45°: A Safety-Performance Benchmark for Reasoning-Oriented LLMs in Medicine

Kaiyuan Ji, Yijin Guo, Zicheng Zhang, Xiangyang Zhu, Yuan Tian, +2 others
Abstract

With the increasing use of large language models (LLMs) in medical decision support, it is essential to evaluate not only their final answers but also the reliability of their reasoning. Two key risks are Chain-of-Thought (CoT) faithfulness (whether reasoning aligns with responses and medical facts) and sycophancy, where models follow misleading cues at the expense of correctness. Existing benchmarks often collapse such vulnerabilities into a single accuracy score. To address this, we introduce MedOmni-45°, a benchmark and workflow designed to quantify safety-performance trade-offs under manipulative hint conditions. It contains 1,804 reasoning-focused medical questions across six specialties and three task types, including 500 from MedMCQA. Each question is paired with seven manipulative hint types and a no-hint baseline, producing about 27K inputs. We evaluate seven LLMs spanning open- vs. closed-source, general-purpose vs. medical, and base vs. reasoning-enhanced models, totaling over 189K inferences. Three metrics (Accuracy, CoT-Faithfulness, and Anti-Sycophancy) are combined into a composite score visualized with a 45° plot. Results show a consistent safety-performance trade-off, with no model surpassing the diagonal. The open-source QwQ-32B comes closest to the diagonal (43.81°), balancing safety and accuracy but not leading in both. MedOmni-45° thus provides a focused benchmark for exposing reasoning vulnerabilities in medical LLMs and guiding safer model development.

Authors (7)

Kaiyuan Ji

Yijin Guo

Zicheng Zhang

Xiangyang Zhu

Yuan Tian

Ning Liu

Guangtao Zhai

Citation Format

Ji, K., Guo, Y., Zhang, Z., Zhu, X., Tian, Y., Liu, N., & Zhai, G. (2025). MedOmni-45°: A Safety-Performance Benchmark for Reasoning-Oriented LLMs in Medicine. arXiv. https://arxiv.org/abs/2508.16213

Journal Information
Publication Year: 2025
Language: en
Source Database: arXiv
Access: Open Access