arXiv Open Access 2015

Where To Look: Focus Regions for Visual Question Answering

Kevin J. Shih Saurabh Singh Derek Hoiem

Abstract

We present a method that learns to answer visual questions by selecting image regions relevant to the text-based query. Our method exhibits significant improvements in answering questions such as "what color," where it is necessary to evaluate a specific location, and "what room," where it selectively identifies informative image regions. Our model is tested on the VQA dataset, which is, to our knowledge, the largest human-annotated visual question answering dataset.
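
The abstract does not describe how regions are selected. As a rough illustration only, one common way to select image regions relevant to a text query is soft attention: score each region against the question, normalize the scores, and pool the region features by weight. The sketch below assumes dot-product scoring, and the function names (`softmax`, `attend`) and toy vectors are hypothetical; none of this is taken from the paper.

```python
import math


def softmax(scores):
    """Turn raw scores into weights that are positive and sum to 1."""
    peak = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attend(question_vec, region_vecs):
    """Weight each region by its relevance to the question (assumed scheme).

    Relevance is an assumed dot product between the question embedding
    and each region embedding; the result is a weighted average of regions.
    """
    scores = [
        sum(q * r for q, r in zip(question_vec, region))
        for region in region_vecs
    ]
    weights = softmax(scores)
    dim = len(region_vecs[0])
    pooled = [
        sum(w * region[i] for w, region in zip(weights, region_vecs))
        for i in range(dim)
    ]
    return weights, pooled


# Toy embeddings (made up): the first region aligns best with the question.
question = [1.0, 0.0]
regions = [[2.0, 0.0], [0.0, 2.0], [-1.0, 1.0]]
weights, pooled = attend(question, regions)
```

In this toy case the first region receives the largest weight because its embedding points in the same direction as the question embedding. A real system would learn both embeddings from data.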

Topics & Keywords

cs.CV

Authors (3)

Kevin J. Shih

Saurabh Singh

Derek Hoiem

Citation Format

Shih, K. J., Singh, S., & Hoiem, D. (2015). Where To Look: Focus Regions for Visual Question Answering. https://arxiv.org/abs/1511.07394

Journal Information

Year Published: 2015
Language: en
Database Source: arXiv
Access: Open Access ✓