arXiv Open Access 2025

Vision Language Models in Medicine

Beria Chingnabe Kalpelbe, Angel Gabriel Adaambiik, Wei Peng

Abstract

With the advent of Vision-Language Models (VLMs), medical artificial intelligence (AI) has experienced significant technological progress and paradigm shifts. This survey provides an extensive review of recent advancements in Medical Vision-Language Models (Med-VLMs), which integrate visual and textual data to enhance healthcare outcomes. We discuss the foundational technology behind Med-VLMs, illustrating how general models are adapted for complex medical tasks, and examine their applications in healthcare. The transformative impact of Med-VLMs on clinical practice, education, and patient care is highlighted, alongside challenges such as data scarcity, narrow task generalization, interpretability issues, and ethical concerns like fairness, accountability, and privacy. These limitations are exacerbated by uneven dataset distribution, computational demands, and regulatory hurdles. Rigorous evaluation methods and robust regulatory frameworks are essential for safe integration into healthcare workflows. Future directions include leveraging large-scale, diverse datasets, improving cross-modal generalization, and enhancing interpretability. Innovations like federated learning, lightweight architectures, and Electronic Health Record (EHR) integration are explored as pathways to democratize access and improve clinical relevance. This review aims to provide a comprehensive understanding of Med-VLMs' strengths and limitations, fostering their ethical and balanced adoption in healthcare.

Topics & Keywords

cs.CV, cs.AI, cs.CL, cs.CY, eess.IV

Citation

Kalpelbe, B. C., Adaambiik, A. G., & Peng, W. (2025). Vision Language Models in Medicine. https://arxiv.org/abs/2503.01863

Journal Information

Publication Year: 2025
Language: en
Source Database: arXiv
Access: Open Access