arXiv Open Access 2025

Learning and Improving Backgammon Strategy

Gregory R. Galperin

Abstract

A novel approach to learning is presented, combining features of on-line and off-line methods to achieve considerable performance in the task of learning a backgammon value function in a process that exploits the processing power of parallel supercomputers. The off-line methods comprise a set of techniques for parallelizing neural network training and TD($\lambda$) reinforcement learning; here Monte-Carlo "Rollouts" are introduced as a massively parallel on-line policy improvement technique which applies resources to the decision points encountered during the search of the game tree to further augment the learned value function estimate. A level of play roughly as good as, or possibly better than, the current champion human and computer backgammon players has been achieved in a short period of learning.
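The rollout idea in the abstract can be illustrated with a minimal sketch. It is not the author's implementation: the toy game, the names `step`, `base_policy`, `rollout_action_values`, and `improved_policy`, and the rollout count are all assumptions made for illustration, and the fixed `base_policy` merely stands in for a policy derived from a learned value function. At a decision point, each candidate action is tried, the game is played to the end many times under the base policy, and the action with the best average outcome is chosen.

```python
import random
from statistics import mean

# Hypothetical toy game (not backgammon): a state is (score, turns_left).
# Action "roll" adds a fair die roll to the score; action "safe" adds 1.
ACTIONS = ("safe", "roll")


def step(state, action, rng):
    """Apply one action and return the successor state."""
    score, turns_left = state
    gain = rng.randint(1, 6) if action == "roll" else 1
    return (score + gain, turns_left - 1)


def is_terminal(state):
    """The game ends when no turns remain."""
    return state[1] == 0


def base_policy(state):
    """Assumed stand-in for the greedy policy of a learned value function."""
    return "safe"


def rollout(state, rng):
    """Play to the end with the base policy and return the final score."""
    while not is_terminal(state):
        state = step(state, base_policy(state), rng)
    return state[0]


def rollout_action_values(state, n_rollouts, rng):
    """Monte-Carlo estimate of each action's value at one decision point."""
    values = {}
    for action in ACTIONS:
        outcomes = [rollout(step(state, action, rng), rng) for _ in range(n_rollouts)]
        values[action] = mean(outcomes)
    return values


def improved_policy(state, n_rollouts=200, rng=None):
    """Choose the action whose rollouts score best on average."""
    rng = rng or random.Random(0)
    values = rollout_action_values(state, n_rollouts, rng)
    return max(values, key=values.get)
```

Calling `improved_policy((0, 3))` evaluates both actions from a state with three turns left. Because the individual rollouts do not depend on one another, they could in principle be distributed across many processors, which is the property the abstract relies on.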

Topics & Keywords

cs.LG cs.AI cs.NE

Authors (1)

Gregory R. Galperin

Citation Format

Galperin, G.R. (2025). Learning and Improving Backgammon Strategy. https://arxiv.org/abs/2504.02221

Journal Information

Publication Year: 2025
Language: en
Source Database: arXiv
Access: Open Access ✓