arXiv Open Access 2023

Predicting generalization performance with correctness discriminators

Yuekun Yao, Alexander Koller

Abstract

The ability to predict an NLP model's accuracy on unseen, potentially out-of-distribution data is a prerequisite for trustworthiness. We present a novel model that establishes upper and lower bounds on the accuracy, without requiring gold labels for the unseen data. We achieve this by training a discriminator which predicts whether the output of a given sequence-to-sequence model is correct or not. We show across a variety of tagging, parsing, and semantic parsing tasks that the gold accuracy is reliably between the predicted upper and lower bounds, and that these bounds are remarkably close together.
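The abstract does not say how the bounds are derived from the discriminator's predictions. As a minimal sketch of one possibility, assume several discriminators (a hypothetical ensemble, not confirmed by the text above) each judge every model output as correct or incorrect. A lower bound could then count only outputs that all discriminators accept, and an upper bound could count outputs that at least one accepts. The function name `accuracy_bounds` and the toy votes are illustrative assumptions, not the authors' implementation or data.

```python
from typing import Sequence, Tuple


def accuracy_bounds(votes: Sequence[Sequence[bool]]) -> Tuple[float, float]:
    """Estimate (lower, upper) accuracy bounds without gold labels.

    votes[i][j] is True when hypothetical discriminator j judges the
    model's output for unlabeled example i to be correct.
    Assumption: unanimity gives the lower bound, any-accept the upper bound.
    """
    if not votes:
        raise ValueError("need at least one example")
    n = len(votes)
    # Lower bound: fraction of outputs every discriminator accepts.
    lower = sum(all(v) for v in votes) / n
    # Upper bound: fraction of outputs at least one discriminator accepts.
    upper = sum(any(v) for v in votes) / n
    return lower, upper


# Toy votes from three hypothetical discriminators on four unlabeled outputs.
votes = [
    [True, True, True],
    [True, False, True],
    [False, False, False],
    [True, True, True],
]
lower, upper = accuracy_bounds(votes)
print(f"predicted accuracy in [{lower:.2f}, {upper:.2f}]")
```

Under this assumption the lower bound can never exceed the upper bound, and the two coincide exactly when the discriminators agree on every output.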

Authors (2)

Yuekun Yao

Alexander Koller

Citation Format

Yao, Y., & Koller, A. (2023). Predicting generalization performance with correctness discriminators. arXiv:2311.09422. https://arxiv.org/abs/2311.09422

Journal Information

Publication Year: 2023
Language: en
Database Source: arXiv
Access: Open Access