arXiv Open Access 2025

Disparities in LLM Reasoning Accuracy and Explanations: A Case Study on African American English

Runtao Zhou, Guangya Wan, Saadia Gabriel, Sheng Li, Alexander J Gates, +2 more

Abstract

Large Language Models (LLMs) have demonstrated remarkable capabilities in reasoning tasks, leading to their widespread deployment. However, recent studies have highlighted concerning biases in these models, particularly in their handling of dialectal variations like African American English (AAE). In this work, we systematically investigate dialectal disparities in LLM reasoning tasks. We develop an experimental framework comparing LLM performance given Standard American English (SAE) and AAE prompts, combining LLM-based dialect conversion with established linguistic analyses. We find that LLMs consistently produce less accurate responses and simpler reasoning chains and explanations for AAE inputs compared to equivalent SAE questions, with disparities most pronounced in social science and humanities domains. These findings highlight systematic differences in how LLMs process and reason about different language varieties, raising important questions about the development and deployment of these systems in our multilingual and multidialectal world. Our code repository is publicly available at https://github.com/Runtaozhou/dialect_bias_eval.


Authors (7)

Runtao Zhou
Guangya Wan
Saadia Gabriel
Sheng Li
Alexander J Gates
Maarten Sap
Thomas Hartvigsen

Citation Format

Zhou, R., Wan, G., Gabriel, S., Li, S., Gates, A.J., Sap, M. et al. (2025). Disparities in LLM Reasoning Accuracy and Explanations: A Case Study on African American English. https://arxiv.org/abs/2503.04099

Journal Information

Year Published: 2025
Language: en
Source Database: arXiv
Access: Open Access ✓