DOAJ Open Access 2024

Contrastive adversarial gender debiasing

Nicolás Torres

Abstract

This research analyzes gender bias in contemporary AI language models, examining successive iterations of the GPT series alongside Gemini and Llama. The experiments span sentence completions, generative narratives, bilingual analysis, and visual perception assessments. The primary objectives are to trace how gender bias evolves across model iterations, to explore biases tied to professions and contexts, and to evaluate multilingual disparities. The analyses reveal a marked change across GPT iterations: GPT4 shows significantly reduced or negligible bias, indicating substantial progress in bias mitigation. The models exhibit biases across professions and contexts, associating some of them with specific genders. Multilingual evaluations show subtle disparities in gender bias tendencies between English and Spanish narratives. To mitigate these biases, we propose Contrastive Adversarial Gender Debiasing (CAGD), a method that combines contrastive learning with adversarial training. CAGD enables language models to learn gender-neutral representations while promoting robustness against gender bias, and it consistently outperforms both the original models and adversarially debiased models across various tasks and metrics. These findings underscore the complexity of gender bias in AI language models and the need for continual bias mitigation strategies, such as the proposed CAGD approach, and for ethical consideration in AI development and deployment.

Authors (1)

Nicolás Torres

Citation Format

Torres, N. (2024). Contrastive adversarial gender debiasing. https://doi.org/10.1016/j.nlp.2024.100092

Journal Information
Publication Year
2024
Database Source
DOAJ
DOI
10.1016/j.nlp.2024.100092
Access
Open Access ✓