arXiv Open Access 2026

LVLMs and Humans Ground Differently in Referential Communication

Peter Zeng, Weiling Li, Amie Paige, Zhengxiang Wang, Panagiotis Kaliosis, and 4 others

Abstract

For generative AI agents to partner effectively with human users, the ability to accurately predict human intent is critical. But this ability to collaborate remains limited by a critical deficit: an inability to model common ground. Here, we present a referential communication experiment with a factorial design involving director-matcher pairs (human-human, human-AI, AI-human, and AI-AI) that interact with multiple turns in repeated rounds to match pictures of objects not associated with any obvious lexicalized labels. We release the online pipeline for data collection, the tools and analyses for accuracy, efficiency, and lexical overlap, and a corpus of 356 dialogues (89 pairs over 4 rounds each) that unmasks LVLMs' limitations in interactively resolving referring expressions, a crucial skill that underlies human language use.

Topics & Keywords

cs.CL cs.AI cs.HC

Authors (9)

Peter Zeng

Weiling Li

Amie Paige

Zhengxiang Wang

Panagiotis Kaliosis

Dimitris Samaras

Gregory Zelinsky

Susan Brennan

Owen Rambow

Citation Format

Zeng, P., Li, W., Paige, A., Wang, Z., Kaliosis, P., Samaras, D. et al. (2026). LVLMs and Humans Ground Differently in Referential Communication. https://arxiv.org/abs/2601.19792

Journal Information

Publication Year: 2026
Language: en
Source Database: arXiv
Access: Open Access ✓