arXiv Open Access 2025

Exploration with Foundation Models: Capabilities, Limitations, and Hybrid Approaches

Remo Sasso, Michelangelo Conserva, Dominik Jeurissen, Paulo Rauber

Abstract

Exploration in reinforcement learning (RL) remains challenging, particularly in sparse-reward settings. While foundation models possess strong semantic priors, their capabilities as zero-shot exploration agents in classic RL benchmarks are not well understood. We benchmark LLMs and VLMs on multi-armed bandits, Gridworlds, and sparse-reward Atari to test zero-shot exploration. Our investigation reveals a key limitation, the "knowing-doing gap": while VLMs can infer high-level objectives from visual input, they consistently fail at precise low-level control. To analyze a potential bridge for this gap, we investigate a simple on-policy hybrid framework in a controlled, best-case scenario. Our results in this idealized setting show that VLM guidance can significantly improve early-stage sample efficiency, providing a clear analysis of the potential and constraints of using foundation models to guide exploration rather than for end-to-end control.

Topics & Keywords

cs.LG cs.AI

Authors (4)

Remo Sasso

Michelangelo Conserva

Dominik Jeurissen

Paulo Rauber

Citation Format

Sasso, R., Conserva, M., Jeurissen, D., & Rauber, P. (2025). Exploration with Foundation Models: Capabilities, Limitations, and Hybrid Approaches. https://arxiv.org/abs/2509.19924

Journal Information

Publication Year: 2025
Language: en
Database Source: arXiv
Access: Open Access ✓