arXiv Open Access 2025

Social Genome: Grounded Social Reasoning Abilities of Multimodal Models

Leena Mathur, Marian Qian, Paul Pu Liang, Louis-Philippe Morency

Abstract

Social reasoning abilities are crucial for AI systems to effectively interpret and respond to multimodal human communication and interaction within social contexts. We introduce SOCIAL GENOME, the first benchmark for fine-grained, grounded social reasoning abilities of multimodal models. SOCIAL GENOME contains 272 videos of interactions and 1,486 human-annotated reasoning traces related to inferences about these interactions. These traces contain 5,777 reasoning steps that reference evidence from visual cues, verbal cues, vocal cues, and external knowledge (contextual knowledge external to videos). SOCIAL GENOME is also the first modeling challenge to study external knowledge in social reasoning. SOCIAL GENOME computes metrics to holistically evaluate semantic and structural qualities of model-generated social reasoning traces. We demonstrate the utility of SOCIAL GENOME through experiments with state-of-the-art models, identifying performance gaps and opportunities for future research to improve the grounded social reasoning abilities of multimodal models.

Authors (4)

Leena Mathur
Marian Qian
Paul Pu Liang
Louis-Philippe Morency

Citation Format

Mathur, L., Qian, M., Liang, P. P., & Morency, L.-P. (2025). Social Genome: Grounded Social Reasoning Abilities of Multimodal Models. https://arxiv.org/abs/2502.15109

Journal Information
Publication Year: 2025
Language: en
Source Database: arXiv
Access: Open Access