arXiv Open Access 2025

LLM-C3MOD: A Human-LLM Collaborative System for Cross-Cultural Hate Speech Moderation

Junyeong Park Seogyeong Jeong Seyoung Song Yohan Lee Alice Oh

Abstract

Content moderation is a global challenge, yet major tech platforms prioritize high-resource languages, leaving low-resource languages with scarce native moderators. Since effective moderation depends on understanding contextual cues, this imbalance increases the risk of improper moderation due to non-native moderators' limited cultural understanding. Through a user study, we identify that non-native moderators struggle with interpreting culturally specific knowledge, sentiment, and internet culture in hate speech moderation. To assist them, we present LLM-C3MOD, a human-LLM collaborative pipeline with three steps: (1) RAG-enhanced cultural context annotations; (2) initial LLM-based moderation; and (3) targeted human moderation for cases lacking LLM consensus. Evaluated on a Korean hate speech dataset with Indonesian and German participants, our system achieves 78% accuracy (surpassing GPT-4o's 71% baseline), while reducing human workload by 83.6%. Notably, human moderators excel at nuanced content where LLMs struggle. Our findings suggest that non-native moderators, when properly supported by LLMs, can effectively contribute to cross-cultural hate speech moderation.
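
The three-step flow described in the abstract can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the function name `moderate`, the callable signatures, and the rule that "consensus" means unanimous agreement among several LLM moderators are all assumptions made here, since the abstract does not specify how consensus is measured.

```python
from typing import Callable, Sequence, Tuple

# Hypothetical signatures, assumed for illustration only.
Annotator = Callable[[str], str]        # comment -> cultural context note
Moderator = Callable[[str, str], str]   # (comment, context) -> label


def moderate(
    comment: str,
    annotate: Annotator,
    llm_moderators: Sequence[Moderator],
    human_moderator: Moderator,
) -> Tuple[str, str]:
    """Return (label, decided_by) for a single comment."""
    # Step 1: attach a cultural context annotation (RAG-based in the paper;
    # here it is whatever the supplied `annotate` callable returns).
    context = annotate(comment)

    # Step 2: collect an initial label from each LLM moderator.
    votes = [llm(comment, context) for llm in llm_moderators]

    # Assumption: unanimous votes count as consensus and are accepted as-is.
    if votes and len(set(votes)) == 1:
        return votes[0], "llm"

    # Step 3: anything without consensus goes to a human moderator,
    # who sees the same context annotation.
    return human_moderator(comment, context), "human"
```

Under this sketch, the share of comments that return `"human"` is the remaining human workload, so a reported 83.6% reduction would correspond to roughly 16.4% of comments reaching step 3.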

Topics & Keywords

cs.CL cs.AI

Authors (5)

Junyeong Park

Seogyeong Jeong

Seyoung Song

Yohan Lee

Alice Oh

Citation Format

Park, J., Jeong, S., Song, S., Lee, Y., & Oh, A. (2025). LLM-C3MOD: A Human-LLM Collaborative System for Cross-Cultural Hate Speech Moderation. https://arxiv.org/abs/2503.07237

Journal Information

Publication Year: 2025
Language: en
Source Database: arXiv
Access: Open Access