Generative Adversarial Reward Learning for Generalized Behavior Tendency Inference
Abstract
Recent advances in reinforcement learning have inspired increasing interest in learning user models adaptively through dynamic interactions, e.g., in reinforcement learning based recommender systems. In most reinforcement learning applications, the reward function provides the critical guideline for optimization. However, current reinforcement learning-based methods rely on manually-defined reward functions, which cannot adapt to dynamic, noisy environments. Moreover, they generally use task-specific reward functions that sacrifice generalization ability. We propose a generative inverse reinforcement learning approach for user behavioral preference modeling to address the above issues. Instead of using predefined reward functions, our model can automatically learn the rewards from users' actions based on a discriminative actor-critic network and a Wasserstein GAN. Our model provides a general approach to characterizing and explaining underlying behavioral tendencies. Our experiments show that our method outperforms state-of-the-art methods in several scenarios, namely traffic signal control, online recommender systems, and scanpath prediction.
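The core idea described above, learning a reward signal adversarially rather than hand-defining it, can be illustrated with a minimal sketch. The snippet below is a toy illustration, not the paper's implementation: it trains a linear WGAN-style critic to separate hypothetical "expert" state-action pairs from a policy's pairs, and the critic's score then serves as the learned reward. All data, dimensions, and hyperparameters are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical toy data: expert state-action pairs are drawn from a
# different distribution than the (initially random) policy's pairs.
expert = rng.normal(loc=1.0, scale=0.5, size=(256, 4))
policy = rng.normal(loc=-1.0, scale=0.5, size=(256, 4))

# Linear WGAN-style critic f(x) = w.x; in adversarial inverse RL,
# the critic's output doubles as the learned reward for the actor.
w = np.zeros(4)
lr = 0.05

for step in range(200):
    # Wasserstein objective: maximize E[f(expert)] - E[f(policy)],
    # so the gradient w.r.t. w is the difference of the sample means.
    grad_w = expert.mean(axis=0) - policy.mean(axis=0)
    w += lr * grad_w
    # Weight clipping: the original WGAN trick for the Lipschitz constraint.
    w = np.clip(w, -1.0, 1.0)

# The learned reward ranks expert-like behavior above the policy's.
reward_expert = (expert @ w).mean()
reward_policy = (policy @ w).mean()
print(reward_expert > reward_policy)  # → True
```

In the full method, this critic would be a neural network trained jointly with an actor-critic policy, with the actor maximizing the learned reward while the critic keeps distinguishing expert trajectories from generated ones.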
Topics & Keywords
Authors (6)
Xiaocong Chen
Lina Yao
Xianzhi Wang
Aixin Sun
Wenjie Zhang
Quan Z. Sheng
Quick Access
- Publication Year
- 2021
- Language
- en
- Total Citations
- 12×
- Source Database
- Semantic Scholar
- DOI
- 10.1109/TKDE.2022.3186920
- Access
- Open Access ✓