arXiv Open Access 2021

Risk-Averse Biased Human Policies in Assistive Multi-Armed Bandit Settings

Michael Koller Timothy Patten Markus Vincze

Abstract

Assistive multi-armed bandit problems can be used to model teamwork between a human and an autonomous system such as a domestic service robot. To account for human biases such as the risk aversion described in Cumulative Prospect Theory, the setting is extended to use observable rewards. When robots leverage knowledge of the risk-averse human model, they can eliminate the bias and make more rational choices. We present an algorithm that increases the utility of such human-robot teams. A brief evaluation indicates that arbitrary reward functions can be handled.
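To make the risk-aversion bias concrete, the following minimal sketch compares how a risk-neutral agent and a human modeled with Cumulative Prospect Theory might rank two bandit arms. It is not the algorithm from the paper. The parameter values are the commonly cited Tversky and Kahneman (1992) estimates, and the arm definitions and all function names are assumptions chosen purely for illustration.

```python
# Illustrative sketch only; NOT the paper's algorithm.
# Parameters are the Tversky & Kahneman (1992) estimates (an assumption here).
ALPHA = 0.88   # curvature of the value function for gains and losses
LAMBDA = 2.25  # loss-aversion coefficient
GAMMA = 0.61   # probability-weighting exponent for gains
DELTA = 0.69   # probability-weighting exponent for losses


def value(x):
    """Subjective value of outcome x relative to a reference point of 0."""
    return x ** ALPHA if x >= 0 else -LAMBDA * (-x) ** ALPHA


def weight(p, c):
    """Inverse-S probability weighting with exponent c."""
    return p ** c / (p ** c + (1 - p) ** c) ** (1 / c)


def cpt_value(win, loss, p_win):
    """CPT value of an arm paying `win` (>= 0) with probability p_win,
    and `loss` (<= 0) otherwise."""
    return (weight(p_win, GAMMA) * value(win)
            + weight(1 - p_win, DELTA) * value(loss))


def expected_value(win, loss, p_win):
    """Risk-neutral expected reward of the same arm."""
    return p_win * win + (1 - p_win) * loss


# Hypothetical arms: (win, loss, probability of winning).
arms = {"safe": (1.0, 0.0, 1.0), "risky": (10.0, -5.0, 0.5)}

rational_choice = max(arms, key=lambda a: expected_value(*arms[a]))
biased_choice = max(arms, key=lambda a: cpt_value(*arms[a]))
print("risk-neutral pick:", rational_choice)
print("CPT-biased pick:  ", biased_choice)
```

Under these assumed parameters the two rankings disagree: the risky arm has the higher expected reward, while the CPT-biased evaluation prefers the safe arm. A robot that knows the human's bias model could, in principle, account for this gap when assisting.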

Topics & Keywords

cs.RO

Authors (3)

Michael Koller

Timothy Patten

Markus Vincze

Citation Format

Koller, M., Patten, T., & Vincze, M. (2021). Risk-Averse Biased Human Policies in Assistive Multi-Armed Bandit Settings. https://arxiv.org/abs/2104.05334

Journal Information

Publication Year: 2021
Language: en
Source Database: arXiv
Access: Open Access ✓