HARMONI: Multimodal Personalization of Multi-User Human-Robot Interactions with LLMs
Abstrak
Existing human-robot interaction systems often lack mechanisms for sustained personalization and dynamic adaptation in multi-user environments, limiting their effectiveness in real-world deployments. We present HARMONI, a multimodal personalization framework that leverages large language models to enable socially assistive robots to manage long-term multi-user interactions. The framework integrates four key modules: (i) a perception module that identifies active speakers and extracts multimodal input; (ii) a world modeling module that maintains representations of the environment and short-term conversational context; (iii) a user modeling module that updates long-term speaker-specific profiles; and (iv) a generation module that produces contextually grounded and ethically informed responses. Through extensive evaluation and ablation studies on four datasets, as well as a real-world scenario-driven user-study in a nursing home environment, we demonstrate that HARMONI supports robust speaker identification, online memory updating, and ethically aligned personalization, outperforming baseline LLM-driven approaches in user modeling accuracy, personalization quality, and user satisfaction.
Penulis (8)
Jeanne Malécot
Hamed Rahimi
Jeanne Cattoni
Marie Samson
Mouad Abrini
Mahdi Khoramshahi
Maribel Pino
Mohamed Chetouani
Akses Cepat
- Tahun Terbit
- 2026
- Bahasa
- en
- Sumber Database
- arXiv
- Akses
- Open Access ✓