arXiv Open Access 2024

Identifying the Best Arm in the Presence of Global Environment Shifts

Phurinut Srisawad, Juergen Branke, Long Tran-Thanh

Abstract

This paper formulates a new Best-Arm Identification problem in the non-stationary stochastic bandit setting, where the means of all arms shift in the same way because of a global environmental influence. The aim is to identify the unique best arm across environmental changes under a fixed total budget. Although this setting can be regarded as a special case of Adversarial Bandits or Corrupted Bandits, we demonstrate that existing solutions tailored to those settings do not fully exploit the nature of this global influence and therefore do not work well in practice, despite their theoretical guarantees. To overcome this issue, we develop a novel selection policy that is consistent and robust to global environmental shifts. We then propose an allocation policy, LinLUCB, which exploits information about global shifts across all arms in each environment. Empirical tests show that our policies significantly outperform existing methods.
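
To make the setting concrete, the following is a minimal sketch of why a shared shift matters. It is not the paper's selection policy or LinLUCB. It assumes, purely for illustration, that the shift is additive (reward = base mean + environment shift + noise) and that every arm is pulled at least once in every environment; the names `simulate`, `pooled_means`, `centred_means`, and `best_arm` are hypothetical.

```python
import random


def simulate(base_means, shifts, allocation, noise_sd=0.0, seed=0):
    """Draw rewards under an ASSUMED additive model:
    reward = base_means[arm] + shifts[env] + Gaussian noise.
    allocation[env][arm] is the number of pulls of `arm` in `env`."""
    rng = random.Random(seed)
    samples = {}
    for env, shift in enumerate(shifts):
        for arm, mu in enumerate(base_means):
            pulls = allocation[env][arm]
            samples[(env, arm)] = [
                mu + shift + rng.gauss(0.0, noise_sd) for _ in range(pulls)
            ]
    return samples


def pooled_means(samples, n_arms):
    """Naive estimate: average every reward of an arm, ignoring environments."""
    estimates = []
    for arm in range(n_arms):
        rewards = [r for (env, a), rs in samples.items() if a == arm for r in rs]
        estimates.append(sum(rewards) / len(rewards))
    return estimates


def centred_means(samples, n_arms, n_envs):
    """Shift-aware estimate: within each environment, subtract the average of
    the per-arm means (which absorbs the common shift), then average the
    centred values over environments with equal weight."""
    totals = [0.0] * n_arms
    for env in range(n_envs):
        arm_means = [
            sum(samples[(env, a)]) / len(samples[(env, a)]) for a in range(n_arms)
        ]
        env_level = sum(arm_means) / n_arms
        for a in range(n_arms):
            totals[a] += (arm_means[a] - env_level) / n_envs
    return totals


def best_arm(estimates):
    """Index of the largest estimate."""
    return max(range(len(estimates)), key=lambda a: estimates[a])


# Arm 1 is truly best, but arm 0 is sampled mostly in the high-shift environment.
base_means = [0.5, 0.6]
shifts = [0.0, 1.0]
allocation = [[1, 9], [9, 1]]
samples = simulate(base_means, shifts, allocation)  # noise-free for clarity

print(best_arm(pooled_means(samples, 2)))      # 0: misled by the shift
print(best_arm(centred_means(samples, 2, 2)))  # 1: the shift cancels out
```

With an uneven allocation, the pooled average confounds an arm's quality with the environments in which it happened to be sampled, whereas comparing arms within the same environment removes the common shift. This is only one simple way to use the shared structure.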

Authors (3)

Phurinut Srisawad

Juergen Branke

Long Tran-Thanh

Citation Format

Srisawad, P., Branke, J., Tran-Thanh, L. (2024). Identifying the Best Arm in the Presence of Global Environment Shifts. https://arxiv.org/abs/2408.12581

Journal Information
Publication Year: 2024
Language: en
Source Database: arXiv
Access: Open Access ✓