DOAJ Open Access 2026

Comparative Evaluation of Vision–Language Models for Detecting and Localizing Dental Lesions from Intraoral Images

Maria Jahan, Al Ibne Siam, Lamim Zakir Pronay, Saif Ahmed, Nabeel Mohammed, +2 others

Abstract

This study assessed the efficiency of vision–language models in detecting and classifying carious and non-carious lesions from intraoral photographs. A dataset of 172 annotated images was classified for microcavitation, cavitated lesions, staining, calculus, and non-carious lesions. Florence-2, PaLI-Gemma, and YOLOv8 models were trained on the dataset and their performance was compared. The dataset was divided using an 80:10:10 split, and model performance was evaluated using mean average precision (mAP), mAP50-95, and class-specific precision and recall. YOLOv8 outperformed the vision–language models, achieving an mAP of 37% with a precision of 42.3% (100% for cavitation detection) and a recall of 31.3%. PaLI-Gemma produced a recall of 13% and 21%. Florence-2 yielded an mAP of 10%, with precision and recall of 51% and 35%, respectively. YOLOv8 achieved the strongest overall performance, while Florence-2 and PaLI-Gemma underperformed despite their potential for multimodal contextual understanding, highlighting the need for larger, more diverse datasets and hybrid architectures to achieve improved performance.
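As a rough illustration of the evaluation setup described in the abstract, the sketch below shows one way an 80:10:10 split and box-level precision and recall at a single IoU threshold could be computed. It is not the authors' pipeline: the function names (`split_indices`, `iou`, `precision_recall`), the fixed seed, the greedy matching rule, and the 0.5 threshold are all assumptions made for this example. mAP extends the same idea by averaging precision over recall levels and classes, and mAP50-95 additionally averages over IoU thresholds from 0.50 to 0.95.

```python
import random


def split_indices(n, fractions=(0.8, 0.1, 0.1), seed=0):
    """Shuffle range(n) and cut it into train/val/test index lists.

    The rounding rule (floor for train and val, remainder to test) is an
    assumption; the abstract only states the 80:10:10 ratio.
    """
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    n_train = int(n * fractions[0])
    n_val = int(n * fractions[1])
    return idx[:n_train], idx[n_train:n_train + n_val], idx[n_train + n_val:]


def iou(a, b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def precision_recall(preds, truths, thr=0.5):
    """Greedily match each predicted box to at most one ground-truth box.

    A prediction counts as a true positive when its best remaining
    ground-truth box overlaps it with IoU >= thr.
    """
    unmatched = list(truths)
    tp = 0
    for p in preds:
        best = max(unmatched, key=lambda t: iou(p, t), default=None)
        if best is not None and iou(p, best) >= thr:
            unmatched.remove(best)
            tp += 1
    fp = len(preds) - tp
    fn = len(unmatched)
    precision = tp / (tp + fp) if preds else 0.0
    recall = tp / (tp + fn) if truths else 0.0
    return precision, recall
```

Calling `split_indices(172)` partitions the 172 image indices into three disjoint lists; the exact per-split counts depend on the rounding rule assumed here and may differ from those used in the paper.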

Authors (7)

Maria Jahan

Al Ibne Siam

Lamim Zakir Pronay

Saif Ahmed

Nabeel Mohammed

James Dudley

Taseef Hasan Farook

Citation Format

Jahan, M., Siam, A.I., Pronay, L.Z., Ahmed, S., Mohammed, N., Dudley, J. et al. (2026). Comparative Evaluation of Vision–Language Models for Detecting and Localizing Dental Lesions from Intraoral Images. https://doi.org/10.3390/jimaging12010022

Journal Information
Publication Year: 2026
Source Database: DOAJ
DOI: 10.3390/jimaging12010022
Access: Open Access