arXiv Open Access 2025

Exploration with Foundation Models: Capabilities, Limitations, and Hybrid Approaches

Remo Sasso Michelangelo Conserva Dominik Jeurissen Paulo Rauber

Abstract

Exploration in reinforcement learning (RL) remains challenging, particularly in sparse-reward settings. While foundation models possess strong semantic priors, their capabilities as zero-shot exploration agents in classic RL benchmarks are not well understood. We benchmark LLMs and VLMs on multi-armed bandits, Gridworlds, and sparse-reward Atari to test zero-shot exploration. Our investigation reveals a key limitation: while VLMs can infer high-level objectives from visual input, they consistently fail at precise low-level control, a phenomenon known as the "knowing-doing gap". To analyze a potential bridge for this gap, we investigate a simple on-policy hybrid framework in a controlled, best-case scenario. Our results in this idealized setting show that VLM guidance can significantly improve early-stage sample efficiency, providing a clear analysis of the potential and constraints of using foundation models to guide exploration rather than for end-to-end control.
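The abstract does not specify how the hybrid framework combines VLM guidance with the learner's policy, so the following is only a minimal illustrative sketch of one plausible scheme: an epsilon-greedy action selector whose exploration step is biased toward a (hypothetical) VLM-suggested action prior rather than sampling uniformly. The function name, the `beta` mixing parameter, and the `vlm_prior` input are all assumptions for illustration, not the paper's actual method.

```python
import random

def hybrid_action(q_values, vlm_prior, epsilon=0.3, beta=0.5):
    """Select an action index from n discrete actions.

    With probability 1 - epsilon, exploit the current Q-value estimates.
    Otherwise explore: with probability beta, sample from a hypothetical
    VLM-provided prior over actions (guided exploration); else sample
    uniformly at random (standard epsilon-greedy exploration).
    """
    n = len(q_values)
    if random.random() > epsilon:
        # Exploit: greedy action under current value estimates.
        return max(range(n), key=lambda a: q_values[a])
    if random.random() < beta:
        # Guided exploration: follow the VLM's semantic prior.
        return random.choices(range(n), weights=vlm_prior, k=1)[0]
    # Unguided exploration: uniform random action.
    return random.randrange(n)
```

For example, `hybrid_action([0.0, 0.0, 0.0, 0.0], [0.7, 0.1, 0.1, 0.1], epsilon=1.0, beta=1.0)` always explores and preferentially picks action 0, the VLM-favored action; as training progresses, lowering `epsilon` shifts control back to the learned Q-values, consistent with guidance mattering mainly for early-stage sample efficiency.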

Topics & Keywords

Authors (4)

Remo Sasso

Michelangelo Conserva

Dominik Jeurissen

Paulo Rauber

Citation Format

Sasso, R., Conserva, M., Jeurissen, D., & Rauber, P. (2025). Exploration with Foundation Models: Capabilities, Limitations, and Hybrid Approaches. https://arxiv.org/abs/2509.19924

Journal Information
Publication Year
2025
Language
en
Source Database
arXiv
Access
Open Access ✓