arXiv Open Access 2026

PERM: Psychology-grounded Empathetic Reward Modeling for Large Language Models

Chengbing Wang, Wuqiang Zheng, Yang Zhang, Fengbin Zhu, Junyi Cheng, +3 others

Abstract

Large Language Models (LLMs) are increasingly deployed in human-centric applications, yet they often fail to provide substantive emotional support. While Reinforcement Learning (RL) has been used to enhance the empathy of LLMs, existing reward models typically evaluate empathy from a single perspective, overlooking the inherently bidirectional nature of empathic interaction between the supporter and the seeker, as defined by Empathy Cycle theory. To address this limitation, we propose Psychology-grounded Empathetic Reward Modeling (PERM). PERM operationalizes empathy evaluation through a bidirectional decomposition: 1) the supporter perspective, assessing internal resonation and communicative expression; and 2) the seeker perspective, evaluating emotional reception. Additionally, it incorporates a bystander perspective to monitor overall interaction quality. Extensive experiments on a widely used emotional intelligence benchmark and an industrial daily-conversation dataset demonstrate that PERM outperforms state-of-the-art baselines by over 10%. Furthermore, a blinded user study reveals a 70% preference for our approach, highlighting its efficacy in generating more empathetic responses. Our code, dataset, and models are available at https://github.com/ZhengWwwq/PERM.
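The abstract decomposes empathy evaluation into four perspective scores (supporter resonation, supporter expression, seeker reception, and bystander interaction quality) that together form the reward signal. A minimal hypothetical sketch of how such scores might be aggregated into a scalar reward is shown below; the `EmpathyScores` container, the `perm_reward` function, and the weighted-sum aggregation are illustrative assumptions, not the paper's actual formulation.

```python
from dataclasses import dataclass

@dataclass
class EmpathyScores:
    """Hypothetical per-perspective scores, each in [0, 1]."""
    resonation: float   # supporter: internal resonation with the seeker
    expression: float   # supporter: communicative expression of empathy
    reception: float    # seeker: how well the emotion was received
    interaction: float  # bystander: overall interaction quality

def perm_reward(s: EmpathyScores,
                weights=(0.25, 0.25, 0.25, 0.25)) -> float:
    """Combine perspective scores into one scalar RL reward.

    The uniform weighted sum here is an assumption for illustration;
    the paper does not specify its aggregation in the abstract.
    """
    parts = (s.resonation, s.expression, s.reception, s.interaction)
    return sum(w * p for w, p in zip(weights, parts))
```

For example, `perm_reward(EmpathyScores(0.8, 0.6, 0.7, 0.9))` returns `0.75`, the unweighted mean of the four perspective scores.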


Authors (8)

Chengbing Wang

Wuqiang Zheng

Yang Zhang

Fengbin Zhu

Junyi Cheng

Yi Xie

Wenjie Wang

Fuli Feng

Citation Format

Wang, C., Zheng, W., Zhang, Y., Zhu, F., Cheng, J., Xie, Y. et al. (2026). PERM: Psychology-grounded Empathetic Reward Modeling for Large Language Models. https://arxiv.org/abs/2601.10532

Journal Information
Publication Year
2026
Language
en
Source Database
arXiv
Access
Open Access ✓