arXiv Open Access 2025

MMA-ASIA: A Multilingual and Multimodal Alignment Framework for Culturally-Grounded Evaluation

Weihua Zheng, Zhengyuan Liu, Tanmoy Chakraborty, Weiwen Xu, Xiaoxue Gao, +30 others

Abstract

Large language models (LLMs) are now used worldwide, yet their multimodal understanding and reasoning often degrade outside Western, high-resource settings. We propose MMA-ASIA, a comprehensive framework to evaluate LLMs' cultural awareness with a focus on Asian contexts. MMA-ASIA centers on a human-curated, multilingual, and multimodally aligned multiple-choice benchmark covering 8 Asian countries and 10 languages, comprising 27,000 questions; over 79 percent require multi-step reasoning grounded in cultural context, moving beyond simple memorization. To our knowledge, this is the first dataset aligned at the input level across three modalities: text, image (visual question answering), and speech. This enables direct tests of cross-modal transfer. Building on this benchmark, we propose a five-dimensional evaluation protocol that measures: (i) cultural-awareness disparities across countries, (ii) cross-lingual consistency, (iii) cross-modal consistency, (iv) cultural knowledge generalization, and (v) grounding validity. To ensure rigorous assessment, a Cultural Awareness Grounding Validation Module detects "shortcut learning" by checking whether the requisite cultural knowledge supports correct answers. Finally, through comparative model analysis, attention tracing, and an innovative Vision-ablated Prefix Replay (VPR) method, we probe why models diverge across languages and modalities, offering actionable insights for building culturally reliable multimodal LLMs.
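
As a rough illustration of what cross-lingual consistency (dimension ii) could look like on an input-aligned multiple-choice benchmark, the sketch below counts how often a model picks the same option for the same item in every language. The function name, the tuple layout, and the "all languages agree" definition are assumptions made for illustration; the abstract does not specify the metric the paper actually uses.

```python
from collections import defaultdict


def cross_lingual_consistency(predictions):
    """Fraction of aligned items whose predicted option matches in every language.

    predictions: iterable of (item_id, language, predicted_option) tuples.
    This layout is hypothetical, not the benchmark's actual data format.
    """
    by_item = defaultdict(dict)
    for item_id, language, option in predictions:
        by_item[item_id][language] = option

    # Only items answered in at least two languages can be compared.
    comparable = [answers for answers in by_item.values() if len(answers) >= 2]
    if not comparable:
        return 0.0

    consistent = sum(
        1 for answers in comparable if len(set(answers.values())) == 1
    )
    return consistent / len(comparable)


# Toy predictions: q1 agrees across languages, q2 does not, q3 has one language only.
toy = [
    ("q1", "en", "A"), ("q1", "ja", "A"), ("q1", "ko", "A"),
    ("q2", "en", "B"), ("q2", "ja", "C"),
    ("q3", "en", "D"),
]
score = cross_lingual_consistency(toy)
```

The same grouping idea would apply to cross-modal consistency (dimension iii) by keying on modality instead of language, again only as an assumed reading of the protocol.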

Authors (35)

Weihua Zheng

Zhengyuan Liu

Tanmoy Chakraborty

Weiwen Xu

Xiaoxue Gao

Bryan Chen Zhengyu Tan

Bowei Zou

Chang Liu

Yujia Hu

Xing Xie

Xiaoyuan Yi

Jing Yao

Chaojun Wang

Long Li

Rui Liu

Huiyao Liu

Koji Inoue

Ryuichi Sumida

Tatsuya Kawahara

Fan Xu

Lingyu Ye

Wei Tian

Dongjun Kim

Jimin Jung

Jaehyung Seo

Nadya Yuki Wangsajaya

Pham Minh Duc

Ojasva Saxena

Palash Nandi

Xiyan Tao

Wiwik Karlina

Tuan Luong

Keertana Arun Vasan

Roy Ka-Wei Lee

Nancy F. Chen

Citation Format

Zheng, W., Liu, Z., Chakraborty, T., Xu, W., Gao, X., Tan, B.C.Z. et al. (2025). MMA-ASIA: A Multilingual and Multimodal Alignment Framework for Culturally-Grounded Evaluation. https://arxiv.org/abs/2510.08608

Journal Information
Publication Year
2025
Language
en
Source Database
arXiv
Access
Open Access ✓