arXiv Open Access 2025

Multi-Task Reward Learning from Human Ratings

Mingkang Wu, Devin White, Evelyn Rose, Vernon Lawhern, Nicholas R Waytowich, +1 more

Abstract

Reinforcement learning from human feedback (RLHF) has become a key factor in aligning model behavior with users' goals. However, while humans integrate multiple strategies when making decisions, current RLHF approaches often simplify this process by modeling human reasoning through isolated tasks such as classification or regression. In this paper, we propose a novel reinforcement learning (RL) method that mimics human decision-making by jointly considering multiple tasks. Specifically, we leverage human ratings in reward-free environments to infer a reward function, introducing learnable weights that balance the contributions of both classification and regression models. This design captures the inherent uncertainty in human decision-making and allows the model to adaptively emphasize different strategies. We conduct several experiments using synthetic human ratings to validate the effectiveness of the proposed approach. Results show that our method consistently outperforms existing rating-based RL methods, and in some cases, even surpasses traditional RL approaches.
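The abstract's core idea, jointly training a classification and a regression view of human ratings with learnable weights balancing the two losses, can be sketched as follows. This is a hypothetical illustration, not the authors' implementation: the shared linear reward model, the synthetic ratings, and the uncertainty-style weighting `exp(-s) * L + s` for each task are all assumptions made for the sake of a runnable example.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic data: state features X, a scalar "human rating" y in (0, 1),
# and a discrete rating class c in {0, 1} (low / high rating).
n, d = 200, 5
X = rng.normal(size=(n, d))
true_w = rng.normal(size=d)
y = 1 / (1 + np.exp(-X @ true_w))   # synthetic continuous ratings
c = (y > 0.5).astype(float)         # binarized rating class

# Shared linear reward model r(x) = x @ w. The regression head fits y;
# the classification head predicts c via sigmoid(r).
w = np.zeros(d)
s_cls, s_reg = 0.0, 0.0             # learnable per-task log-weights (assumption)
lr = 0.1

for _ in range(500):
    r = X @ w
    p = 1 / (1 + np.exp(-r))
    # Per-task losses: cross-entropy (classification) and MSE (regression).
    L_cls = -np.mean(c * np.log(p + 1e-9) + (1 - c) * np.log(1 - p + 1e-9))
    L_reg = np.mean((p - y) ** 2)
    # Gradients w.r.t. w, chained through p = sigmoid(r).
    dp = p * (1 - p)
    g_cls = X.T @ (p - c) / n                  # standard logistic-loss gradient
    g_reg = X.T @ (2 * (p - y) * dp) / n
    # Combined update: each task's gradient is scaled by its learned weight.
    w -= lr * (np.exp(-s_cls) * g_cls + np.exp(-s_reg) * g_reg)
    # Update the task weights themselves: d/ds [exp(-s) * L + s] = 1 - exp(-s) * L.
    s_cls -= lr * (1 - np.exp(-s_cls) * L_cls)
    s_reg -= lr * (1 - np.exp(-s_reg) * L_reg)
```

The `exp(-s) * L + s` form keeps both weights positive while penalizing the degenerate solution of down-weighting every task to zero; the model is thus free to adaptively emphasize whichever strategy fits the ratings better, in the spirit of the abstract.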

Topics & Keywords

Authors (6)

Mingkang Wu

Devin White

Evelyn Rose

Vernon Lawhern

Nicholas R Waytowich

Yongcan Cao

Citation Format

Wu, M., White, D., Rose, E., Lawhern, V., Waytowich, N. R., & Cao, Y. (2025). Multi-Task Reward Learning from Human Ratings. arXiv. https://arxiv.org/abs/2506.09183

Quick Access

View at Source
Journal Information
Publication Year
2025
Language
en
Source Database
arXiv
Access
Open Access ✓