Semantic Scholar Open Access 2023 16 citations

Women Are Beautiful, Men Are Leaders: Gender Stereotypes in Machine Translation and Language Modeling

Matúš Pikuliak Andrea Hrckova Stefan Oresko Marián Simko

Abstract

We present GEST -- a new manually created dataset designed to measure gender-stereotypical reasoning in language models and machine translation systems. GEST contains samples for 16 gender stereotypes about men and women (e.g., Women are beautiful, Men are leaders) that are compatible with the English language and 9 Slavic languages. The definition of said stereotypes was informed by gender experts. We used GEST to evaluate English and Slavic masked LMs, English generative LMs, and machine translation systems. We discovered significant and consistent amounts of gender-stereotypical reasoning in almost all the evaluated models and languages. Our experiments confirm the previously postulated hypothesis that the larger the model, the more stereotypical it usually is.

Authors (4)

Matúš Pikuliak

Andrea Hrckova

Stefan Oresko

Marián Simko

Citation Format

Pikuliak, M., Hrckova, A., Oresko, S., Simko, M. (2023). Women Are Beautiful, Men Are Leaders: Gender Stereotypes in Machine Translation and Language Modeling. https://doi.org/10.48550/arXiv.2311.18711

Journal Information
Year Published: 2023
Language: en
Total Citations: 16
Source Database: Semantic Scholar
DOI: 10.48550/arXiv.2311.18711
Access: Open Access