arXiv Open Access 2023

A calibrated BISG for inferring race from surname and geolocation

Philip Greengard, Andrew Gelman

Abstract

Bayesian Improved Surname Geocoding (BISG) is a ubiquitous tool for predicting race and ethnicity using an individual's geolocation and surname. Here we demonstrate that statistical dependence of surname and geolocation within racial/ethnic categories in the United States results in biases for minority subpopulations, and we introduce a raking-based improvement. Our method augments the data used by BISG (distributions of race by geolocation and race by surname) with the distribution of surname by geolocation obtained from state voter files. We validate our algorithm on state voter registration lists that contain self-identified race/ethnicity.
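
As a rough illustration of the two ingredients the abstract names, the Python sketch below shows (a) the standard BISG posterior, which assumes surname and geolocation are conditionally independent given race, and (b) a generic raking (iterative proportional fitting) routine that adjusts a race-by-surname-by-geolocation table until it matches three two-way margins. The function names, array shapes, and axis ordering are assumptions made for this example and are not taken from the paper; the authors' actual algorithm may differ in its details.

```python
import numpy as np


def bisg_posterior(p_race_given_surname, p_race_given_geo, p_race):
    """P(race | surname, geo) assuming surname and geo are independent given race.

    All arguments are 1-D arrays over the same race categories (hypothetical layout).
    """
    unnorm = p_race_given_surname * p_race_given_geo / p_race
    return unnorm / unnorm.sum()


def rake_three_way(seed, race_geo, race_surname, surname_geo, n_iter=500, tol=1e-12):
    """Rake a (race, surname, geo) table to three two-way margins.

    seed         : initial table, shape (R, S, G), e.g. the BISG joint estimate
    race_geo     : target margin, shape (R, G)
    race_surname : target margin, shape (R, S)
    surname_geo  : target margin, shape (S, G)
    """
    table = seed.astype(float).copy()
    for _ in range(n_iter):
        prev = table.copy()

        # Rescale so that summing over surname reproduces race x geo.
        m = table.sum(axis=1, keepdims=True)
        table *= np.divide(race_geo[:, None, :], m, out=np.zeros_like(m), where=m > 0)

        # Rescale so that summing over geo reproduces race x surname.
        m = table.sum(axis=2, keepdims=True)
        table *= np.divide(race_surname[:, :, None], m, out=np.zeros_like(m), where=m > 0)

        # Rescale so that summing over race reproduces surname x geo.
        m = table.sum(axis=0, keepdims=True)
        table *= np.divide(surname_geo[None, :, :], m, out=np.zeros_like(m), where=m > 0)

        if np.abs(table - prev).max() < tol:
            break
    return table


def race_given_surname_geo(table, surname_idx, geo_idx):
    """Read P(race | surname, geo) off a raked joint table."""
    cell = table[:, surname_idx, geo_idx]
    return cell / cell.sum()
```

In this sketch the surname-by-geolocation margin plays the role of the extra information the abstract attributes to voter files: without that third margin, the raked table has no way to correct for dependence between surname and location within a racial/ethnic category.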

Authors (2)

Philip Greengard

Andrew Gelman

Citation Format

Greengard, P., & Gelman, A. (2023). A calibrated BISG for inferring race from surname and geolocation. https://arxiv.org/abs/2304.09126

Journal Information

Publication Year: 2023
Language: en
Source Database: arXiv
Access: Open Access ✓