Semantic Scholar Open Access 2024 158 citations

HALC: Object Hallucination Reduction via Adaptive Focal-Contrast Decoding

Zhaorun Chen, Zhuokai Zhao, Hongyin Luo, Huaxiu Yao +2 others

Abstract

While large vision-language models (LVLMs) have demonstrated impressive capabilities in interpreting multi-modal contexts, they invariably suffer from object hallucinations (OH). We introduce HALC, a novel decoding algorithm designed to mitigate OH in LVLMs. HALC leverages distinct fine-grained optimal visual information in vision-language tasks and operates on both local and global contexts simultaneously. Specifically, HALC integrates a robust auto-focal grounding mechanism (locally) to correct hallucinated tokens on the fly, and a specialized beam search algorithm (globally) to significantly reduce OH while preserving text generation quality. Additionally, HALC can be integrated into any LVLM as a plug-and-play module without extra training. Extensive experimental studies demonstrate the effectiveness of HALC in reducing OH, outperforming state-of-the-art methods across four benchmarks.

Authors (6)

Zhaorun Chen

Zhuokai Zhao

Hongyin Luo

Huaxiu Yao

Bo Li

Jiawei Zhou

Citation Format

Chen, Z., Zhao, Z., Luo, H., Yao, H., Li, B., & Zhou, J. (2024). HALC: Object Hallucination Reduction via Adaptive Focal-Contrast Decoding. https://doi.org/10.48550/arXiv.2403.00425

Quick Access

PDF not directly available

View at source: doi.org/10.48550/arXiv.2403.00425
Journal Information
Publication Year: 2024
Language: en
Total Citations: 158
Source Database: Semantic Scholar
DOI: 10.48550/arXiv.2403.00425
Access: Open Access ✓