arXiv Open Access 2022

PG3: Policy-Guided Planning for Generalized Policy Generation

Ryan Yang, Tom Silver, Aidan Curtis, Tomas Lozano-Perez, Leslie Pack Kaelbling

Abstract

A longstanding objective in classical planning is to synthesize policies that generalize across multiple problems from the same domain. In this work, we study generalized policy search-based methods with a focus on the score function used to guide the search over policies. We demonstrate limitations of two score functions and propose a new approach that overcomes these limitations. The main idea behind our approach, Policy-Guided Planning for Generalized Policy Generation (PG3), is that a candidate policy should be used to guide planning on training problems as a mechanism for evaluating that candidate. Theoretical results in a simplified setting give conditions under which PG3 is optimal or admissible. We then study a specific instantiation of policy search where planning problems are PDDL-based and policies are lifted decision lists. Empirical results in six domains confirm that PG3 learns generalized policies more efficiently and effectively than several baselines. Code: https://github.com/ryangpeixu/pg3
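
The central idea in the abstract, scoring a candidate policy by using it to guide planning on training problems, can be illustrated with a small sketch. This is not the authors' implementation (see the linked repository for that). It assumes, purely for illustration, that "guiding" means running a uniform-cost search in which an action costs 0 when it matches the candidate policy's choice and 1 otherwise, and that the score is the total number of disagreeing steps across training problems (lower is better). The function names, the toy one-dimensional domain, and the two candidate policies are all hypothetical.

```python
import heapq
from itertools import count


def min_disagreement(policy, start, is_goal, successors):
    """Cost of the plan that deviates least from `policy` (assumed cost model)."""
    tie = count()  # tie-breaker so the heap never compares states directly
    frontier = [(0, next(tie), start)]
    best = {start: 0}
    while frontier:
        cost, _, state = heapq.heappop(frontier)
        if cost > best.get(state, float("inf")):
            continue  # stale heap entry
        if is_goal(state):
            return cost
        chosen = policy(state)
        for action, nxt in successors(state):
            # Following the candidate policy is free; deviating costs 1.
            new_cost = cost + (0 if action == chosen else 1)
            if new_cost < best.get(nxt, float("inf")):
                best[nxt] = new_cost
                heapq.heappush(frontier, (new_cost, next(tie), nxt))
    return float("inf")  # no plan exists


def policy_guided_score(policy, problems, successors):
    """Sum of minimal disagreements over all training problems."""
    return sum(
        min_disagreement(policy, start, is_goal, successors)
        for start, is_goal in problems
    )


# Hypothetical toy domain: positions 0..10 on a line.
def successors(state):
    moves = []
    if state > 0:
        moves.append(("left", state - 1))
    if state < 10:
        moves.append(("right", state + 1))
    return moves


# Training problems as (start state, goal test) pairs.
problems = [(2, lambda s: s == 7), (0, lambda s: s == 5)]


def always_right(state):
    return "right"


def always_left(state):
    return "left"


right_score = policy_guided_score(always_right, problems, successors)
left_score = policy_guided_score(always_left, problems, successors)
print(right_score, left_score)
```

In this toy setting the policy that always moves right solves both problems without any deviation, while the policy that always moves left forces the planner to deviate at every useful step, so the score separates the two candidates even though the second one never reaches a goal on its own.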

Citation Format

Yang, R., Silver, T., Curtis, A., Lozano-Perez, T., Kaelbling, L.P. (2022). PG3: Policy-Guided Planning for Generalized Policy Generation. https://arxiv.org/abs/2204.10420

Journal Information
Year of Publication: 2022
Language: en
Source Database: arXiv
Access: Open Access