arXiv Open Access 2025

Visual Language Models show widespread visual deficits on neuropsychological tests

Gene Tangtartharakul Katherine R. Storrs

Abstract

Visual Language Models (VLMs) show remarkable performance in visual reasoning tasks, successfully tackling college-level challenges that require high-level understanding of images. However, some recent reports of VLMs struggling to reason about elemental visual concepts like orientation, position, continuity, and occlusion suggest a potential gulf between human and VLM vision. Here we use the toolkit of neuropsychology to systematically assess the capabilities of three state-of-the-art VLMs across visual domains. Using 51 tests drawn from six clinical and experimental batteries, we characterise the visual abilities of leading VLMs relative to normative performance in healthy adults. While the models excel in straightforward object recognition tasks, we find widespread deficits in low- and mid-level visual abilities that would be considered clinically significant in humans. These selective deficits, profiled through validated test batteries, suggest that an artificial system can achieve complex object recognition without developing foundational visual concepts that in humans require no explicit training.


Authors (2)

Gene Tangtartharakul

Katherine R. Storrs

Citation Format

Tangtartharakul, G., Storrs, K.R. (2025). Visual Language Models show widespread visual deficits on neuropsychological tests. https://arxiv.org/abs/2504.10786

Journal Information

Publication Year: 2025
Language: en
Source Database: arXiv
Access: Open Access ✓