arXiv Open Access 2025

What Really is a Member? Discrediting Membership Inference via Poisoning

Neal Mangaokar, Ashish Hooda, Zhuohang Li, Bradley A. Malin, Kassem Fawaz, +3 others


Abstract

Membership inference tests aim to determine whether a particular data point was included in a language model's training set. However, recent work has shown that such tests often fail under the strict definition of membership based on exact matching, and has suggested relaxing this definition to also count semantic neighbors as members. In this work, we show that membership inference tests remain unreliable under this relaxation: it is possible to poison the training dataset in a way that causes the test to produce incorrect predictions for a target point. We theoretically reveal a trade-off between a test's accuracy and its robustness to poisoning. We also present a concrete instantiation of this poisoning attack and empirically validate its effectiveness. Our results show that it can degrade the performance of existing tests to well below random guessing.
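
The two membership definitions contrasted in the abstract can be illustrated with a minimal sketch. This is not the paper's formalization: the token-overlap Jaccard score is only a crude stand-in for semantic similarity, and the function names, example sentences, and the 0.6 threshold are all hypothetical choices made for illustration.

```python
from __future__ import annotations


def tokens(text: str) -> set[str]:
    """Lowercased whitespace tokens; an assumed, deliberately crude tokenizer."""
    return set(text.lower().split())


def jaccard(a: str, b: str) -> float:
    """Token-overlap similarity in [0, 1]; a stand-in for semantic similarity."""
    ta, tb = tokens(a), tokens(b)
    if not ta and not tb:
        return 1.0
    return len(ta & tb) / len(ta | tb)


def is_member_exact(target: str, training_set: list[str]) -> bool:
    """Strict definition: the target must appear verbatim in the training set."""
    return target in training_set


def is_member_relaxed(target: str, training_set: list[str],
                      threshold: float = 0.6) -> bool:
    """Relaxed definition: any sufficiently similar training point counts.

    The threshold is a hypothetical value, not one taken from the paper.
    """
    return any(jaccard(target, x) >= threshold for x in training_set)


# Hypothetical toy data.
training_set = ["the cat sat on the mat", "stocks fell sharply on monday"]
target = "the cat sat on a mat"

exact_label = is_member_exact(target, training_set)
relaxed_label = is_member_relaxed(target, training_set)
print(exact_label, relaxed_label)
```

Under these assumptions the target is a non-member by exact matching but a member under the relaxed definition, because it is a near-duplicate of a training point. The ground-truth label that a membership inference test is scored against therefore depends on which definition is adopted.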

Topics & Keywords

cs.LG cs.CR

Authors (8)

Neal Mangaokar

Ashish Hooda

Zhuohang Li

Bradley A. Malin

Kassem Fawaz

Somesh Jha

Atul Prakash

Amrita Roy Chowdhury

Citation Format


Mangaokar, N., Hooda, A., Li, Z., Malin, B.A., Fawaz, K., Jha, S. et al. (2025). What Really is a Member? Discrediting Membership Inference via Poisoning. https://arxiv.org/abs/2506.06003


Journal Information

Publication Year: 2025
Language: en
Source Database: arXiv
Access: Open Access ✓