arXiv Open Access 2025

ONLY: One-Layer Intervention Sufficiently Mitigates Hallucinations in Large Vision-Language Models

Zifu Wan, Ce Zhang, Silong Yong, Martin Q. Ma, Simon Stepputtis, +4 others

Abstract

Recent Large Vision-Language Models (LVLMs) have introduced a new paradigm for understanding and reasoning about image input through textual responses. Although they have achieved remarkable performance across a range of multi-modal tasks, they face the persistent challenge of hallucination, which introduces practical weaknesses and raises concerns about their reliable deployment in real-world applications. Existing work has explored contrastive decoding approaches to mitigate this issue, where the output of the original LVLM is compared and contrasted with that of a perturbed version. However, these methods require two or more queries that slow down LVLM response generation, making them less suitable for real-time applications. To overcome this limitation, we propose ONLY, a training-free decoding approach that requires only a single query and a one-layer intervention during decoding, enabling efficient real-time deployment. Specifically, we enhance textual outputs by selectively amplifying crucial textual information using a text-to-visual entropy ratio for each token. Extensive experimental results demonstrate that our proposed ONLY consistently outperforms state-of-the-art methods across various benchmarks while requiring minimal implementation effort and computational cost. Code is available at https://github.com/zifuwan/ONLY.
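As a rough illustration of the single-query idea the abstract describes, the sketch below shows how a per-token text-to-visual entropy ratio could be used to amplify the textual component of the output logits. This is not the authors' implementation: the names `text_logits`/`visual_logits`, the scaling factor `alpha`, and the linear interpolation form are all assumptions made for illustration.

```python
import numpy as np

def entropy(logits):
    # Shannon entropy of the softmax distribution induced by the logits
    z = logits - logits.max()          # subtract max for numerical stability
    p = np.exp(z) / np.exp(z).sum()
    return -(p * np.log(p + 1e-12)).sum()

def amplified_logits(text_logits, visual_logits, alpha=0.5):
    """Hypothetical sketch: shift the output distribution toward the
    textual branch in proportion to the text-to-visual entropy ratio
    computed for the current token."""
    ratio = entropy(text_logits) / (entropy(visual_logits) + 1e-12)
    return visual_logits + alpha * ratio * (text_logits - visual_logits)
```

Because both entropies come from a single forward pass's intermediate quantities, a scheme of this shape avoids the second perturbed-model query that contrastive decoding methods require.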


Authors (9)

Zifu Wan
Ce Zhang
Silong Yong
Martin Q. Ma
Simon Stepputtis
Louis-Philippe Morency
Deva Ramanan
Katia Sycara
Yaqi Xie

Citation Format

Wan, Z., Zhang, C., Yong, S., Ma, M.Q., Stepputtis, S., Morency, L.-P., et al. (2025). ONLY: One-Layer Intervention Sufficiently Mitigates Hallucinations in Large Vision-Language Models. https://arxiv.org/abs/2507.00898

Journal Information
Publication Year
2025
Language
en
Source Database
arXiv
Access
Open Access ✓