arXiv Open Access 2017

Policy Gradient Methods for Reinforcement Learning with Function Approximation and Action-Dependent Baselines

Philip S. Thomas, Emma Brunskill

Abstract

We show how an action-dependent baseline can be used in the policy gradient theorem with function approximation, which Sutton et al. (2000) originally presented with action-independent baselines.
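
As background for why the distinction matters, the following minimal sketch checks a standard fact numerically: for a softmax policy, the term sum_a grad pi(a|s) * b vanishes when the baseline b does not depend on the action, but generally does not vanish when it does. This is not the paper's derivation. The three-action softmax policy, the logits, the baseline values, and the helper names (`softmax`, `grad_pi`, `baseline_term`) are all assumptions chosen for illustration.

```python
import math


def softmax(logits):
    """Return softmax probabilities for a list of logits."""
    m = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def grad_pi(pi, a):
    """Gradient of pi[a] with respect to each softmax logit k."""
    return [pi[a] * ((1.0 if a == k else 0.0) - pi[k]) for k in range(len(pi))]


def baseline_term(pi, baseline):
    """Compute sum_a grad pi(a) * baseline[a], one component per logit."""
    n = len(pi)
    term = [0.0] * n
    for a in range(n):
        g = grad_pi(pi, a)
        for k in range(n):
            term[k] += g[k] * baseline[a]
    return term


# Hypothetical single-state policy with three actions.
pi = softmax([0.2, -0.5, 1.0])

# Action-independent baseline: the same value for every action.
constant_term = baseline_term(pi, [3.0, 3.0, 3.0])

# Action-dependent baseline: a different value per action.
dependent_term = baseline_term(pi, [3.0, -1.0, 0.5])

print("action-independent baseline term:", constant_term)
print("action-dependent baseline term:  ", dependent_term)
```

In this sketch the first term is zero up to floating-point error, so subtracting such a baseline leaves the gradient unchanged. The second term is nonzero, which is why an action-dependent baseline needs separate treatment.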

Topics & Keywords

cs.AI cs.LG

Authors (2)

Philip S. Thomas

Emma Brunskill

Citation Format

Thomas, P. S., & Brunskill, E. (2017). Policy Gradient Methods for Reinforcement Learning with Function Approximation and Action-Dependent Baselines. https://arxiv.org/abs/1706.06643

Journal Information

Year Published: 2017
Language: en
Database Source: arXiv
Access: Open Access ✓