arXiv Open Access 2017

Policy Gradient Methods for Reinforcement Learning with Function Approximation and Action-Dependent Baselines

Philip S. Thomas Emma Brunskill

Abstract

We show how an action-dependent baseline can be used with the policy gradient theorem under function approximation, a result that Sutton et al. (2000) originally presented with action-independent baselines.

Authors (2)

Philip S. Thomas

Emma Brunskill

Citation Format

Thomas, P.S., Brunskill, E. (2017). Policy Gradient Methods for Reinforcement Learning with Function Approximation and Action-Dependent Baselines. https://arxiv.org/abs/1706.06643

Journal Information
Publication Year: 2017
Language: en
Source Database: arXiv
Access: Open Access