arXiv Open Access 2025

Measuring Stereotype and Deviation Biases in Large Language Models

Daniel Wang, Eli Brignac, Minjia Mao, Xiao Fang

Abstract

Large language models (LLMs) are widely applied across diverse domains, raising concerns about their limitations and potential risks. In this study, we investigate two types of bias that LLMs may display: stereotype bias and deviation bias. Stereotype bias arises when an LLM consistently associates specific traits with a particular demographic group. Deviation bias is the disparity between the demographic distributions extracted from LLM-generated content and the corresponding real-world demographic distributions. By asking four advanced LLMs to generate profiles of individuals, we examine the associations between each demographic group and attributes such as political affiliation, religion, and sexual orientation. Our experimental results show that all examined LLMs exhibit significant stereotype bias and significant deviation bias towards multiple groups. Our findings uncover the biases that occur when LLMs infer user attributes and shed light on the potential harms of LLM-generated outputs.
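The definition of deviation bias above can be illustrated with a minimal sketch. The abstract does not state which disparity measure the authors use, so the total variation distance below is an assumption chosen for simplicity, and the group labels and reference proportions are made-up toy values, not data or results from the paper.

```python
from collections import Counter


def distribution(labels):
    """Turn a list of group labels into a dict of relative frequencies."""
    counts = Counter(labels)
    total = sum(counts.values())
    return {group: n / total for group, n in counts.items()}


def total_variation(p, q):
    """Total variation distance between two discrete distributions.

    Assumed here as a stand-in disparity measure; the paper may use another.
    """
    groups = set(p) | set(q)
    return 0.5 * sum(abs(p.get(g, 0.0) - q.get(g, 0.0)) for g in groups)


# Hypothetical labels extracted from LLM-generated profiles (toy data).
generated = ["A"] * 7 + ["B"] * 3

# Hypothetical real-world reference proportions (toy data).
reference = {"A": 0.5, "B": 0.5}

gen_dist = distribution(generated)
deviation = total_variation(gen_dist, reference)
print(f"generated: {gen_dist}, deviation: {deviation:.2f}")
```

A value of 0 would mean the generated profiles match the reference proportions exactly; larger values mean a larger gap under this assumed measure.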

Topics & Keywords

cs.CL

Citation Format

Wang, D., Brignac, E., Mao, M., & Fang, X. (2025). Measuring Stereotype and Deviation Biases in Large Language Models. https://arxiv.org/abs/2508.06649

Journal Information

Publication Year: 2025
Language: en
Source Database: arXiv
Access: Open Access