arXiv Open Access 2023

Risk Scores, Label Bias, and Everything but the Kitchen Sink

Michael Zanger-Tishler, Julian Nyarko, Sharad Goel

Abstract

In designing risk assessment algorithms, many scholars promote a "kitchen sink" approach, reasoning that more information yields more accurate predictions. We show, however, that this rationale often fails when algorithms are trained to predict a proxy of the true outcome, as is typically the case. With such "label bias", one should exclude a feature if its correlation with the proxy and its correlation with the true outcome have opposite signs, conditional on the other model features. This criterion is often satisfied when a feature is weakly correlated with the true outcome, and, additionally, that feature and the true outcome are both direct causes of the remaining features. For example, due to patterns of police deployment, criminal behavior and geography may be weakly correlated and direct causes of one's criminal record, suggesting one should exclude geography in criminal risk assessments trained to predict arrest as a proxy for behavior.
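The exclusion criterion in the abstract can be illustrated with a small simulation. The sketch below is not the authors' code or data; it assumes a hypothetical linear-Gaussian setup in which the true outcome `y` (behavior) and a candidate feature `z` (geography) are independent, and both are direct causes of the remaining feature `x` (criminal record) and of the proxy label `a` (arrest). All variable names, coefficients, and the sample size are assumptions chosen for illustration. In a linear model, the sign of a feature's regression coefficient, with the other features included, matches the sign of its conditional correlation with the target, so the coefficients stand in for the conditional correlations the abstract refers to.

```python
import numpy as np


def simulate(n=200_000, seed=0):
    """Draw synthetic data from an assumed (hypothetical) causal structure."""
    rng = np.random.default_rng(seed)
    y = rng.normal(size=n)          # true outcome (behavior); unobserved in practice
    z = rng.normal(size=n)          # candidate feature (geography); independent of y
    x = y + z + rng.normal(size=n)  # remaining feature (record); caused by y and z
    a = y + z + rng.normal(size=n)  # proxy label (arrest); driven by y and z
    return y, z, x, a


def ols(target, *features):
    """Least-squares coefficients; index 0 is the intercept."""
    design = np.column_stack([np.ones_like(target), *features])
    coef, *_ = np.linalg.lstsq(design, target, rcond=None)
    return coef


def predict(coef, *features):
    """Apply fitted coefficients to the given features."""
    design = np.column_stack([np.ones(len(features[0])), *features])
    return design @ coef


y, z, x, a = simulate()

# Sign of z's association with the proxy and with the true outcome,
# each conditional on the remaining feature x.
coef_z_proxy = ols(a, x, z)[2]
coef_z_true = ols(y, x, z)[2]
opposite_signs = coef_z_proxy * coef_z_true < 0

# Both risk scores are trained on the proxy label, as in practice,
# then compared (in-sample) against the true outcome.
score_sink = predict(ols(a, x, z), x, z)  # "kitchen sink": includes z
score_drop = predict(ols(a, x), x)        # excludes z
corr_sink = np.corrcoef(score_sink, y)[0, 1]
corr_drop = np.corrcoef(score_drop, y)[0, 1]

print(f"coefficient on z, proxy target: {coef_z_proxy:+.3f}")
print(f"coefficient on z, true target:  {coef_z_true:+.3f}")
print(f"opposite signs: {opposite_signs}")
print(f"corr(score, y) with z: {corr_sink:.3f}; without z: {corr_drop:.3f}")
```

Under these assumed equations, conditioning on `x` makes `z` negatively associated with `y` (a high record in a heavily policed area is weaker evidence of behavior), while `z` remains positively associated with the proxy `a`. The signs disagree, so the criterion says to exclude `z`, and the score trained without `z` tracks the true outcome more closely than the kitchen-sink score. The effect sizes depend entirely on the chosen coefficients and say nothing about real-world magnitudes.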

Citation Format

Zanger-Tishler, M., Nyarko, J., Goel, S. (2023). Risk Scores, Label Bias, and Everything but the Kitchen Sink. https://arxiv.org/abs/2305.12638

Journal Information
Publication year: 2023
Language: en
Source database: arXiv
Access: Open Access