arXiv Open Access 2020

Selecting Informative Contexts Improves Language Model Finetuning

Richard Antonello, Nicole Beckage, Javier Turek, Alexander Huth

Abstract

Language model fine-tuning is essential for modern natural language processing, but is computationally expensive and time-consuming. Further, the effectiveness of fine-tuning is limited by the inclusion of training examples that negatively affect performance. Here we present a general fine-tuning method that we call information gain filtration for improving the overall training efficiency and final performance of language model fine-tuning. We define the information gain of an example as the improvement on a test metric after training on that example. A secondary learner is then trained to approximate this quantity. During fine-tuning, this learner selects informative examples and skips uninformative ones. We show that our method has consistent improvement across datasets, fine-tuning tasks, and language model architectures. For example, we achieve a median perplexity of 54.0 on a books dataset compared to 57.3 for standard fine-tuning. We present statistical evidence that offers insight into the improvements of our method over standard fine-tuning. The generality of our method leads us to propose a new paradigm for language model fine-tuning -- we encourage researchers to release pretrained secondary learners on common corpora to promote efficient and effective fine-tuning, thereby improving the performance and reducing the overall energy footprint of language model fine-tuning.
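The pipeline the abstract describes — measure each example's information gain as the improvement on a held-out metric after training on it, fit a secondary learner to predict that gain cheaply, then skip low-gain examples during fine-tuning — can be sketched with a toy one-parameter model standing in for a language model. Everything here (the squared-error loss, the `x * y` feature, all function names) is an illustrative assumption, not the paper's actual code:

```python
import random

# Toy stand-in for a language model: a one-parameter linear model y_hat = w * x.
# Held-out squared error plays the role of the paper's test metric (perplexity).

def eval_loss(w, eval_set):
    # held-out loss: mean squared error over the evaluation set
    return sum((w * x - y) ** 2 for x, y in eval_set) / len(eval_set)

def sgd_step(w, example, lr=0.01):
    # one gradient step of squared loss on a single (x, y) example
    x, y = example
    return w - lr * 2 * (w * x - y) * x

def information_gain(w, example, eval_set, lr=0.01):
    # IG of an example: drop in held-out loss after training on that example
    return eval_loss(w, eval_set) - eval_loss(sgd_step(w, example, lr), eval_set)

def fit_secondary(probe, w, eval_set):
    # secondary learner: least-squares line predicting IG from a cheap
    # per-example feature (here x * y), trained on exact IGs of a probe set
    feats = [x * y for x, y in probe]
    gains = [information_gain(w, ex, eval_set) for ex in probe]
    n = len(probe)
    mf, mg = sum(feats) / n, sum(gains) / n
    a = sum((f - mf) * (g - mg) for f, g in zip(feats, gains)) / \
        sum((f - mf) ** 2 for f in feats)
    return a, mg - a * mf  # slope, intercept

random.seed(0)
eval_set = [(x, 2.0 * x) for x in (1.0, 2.0, 3.0)]             # true relation y = 2x
clean = [(x, 2.0 * x) for x in (0.5, 1.5, 2.5)]                # informative examples
noisy = [(x, random.uniform(-5, 5)) for x in (0.5, 1.5, 2.5)]  # uninformative examples

# warm-up: exact IG on a small probe set, then fit the cheap predictor
a, b = fit_secondary(clean + noisy, 0.0, eval_set)

# fine-tune, skipping examples whose *predicted* IG is not positive
w = 0.0
for x, y in clean + noisy:
    if a * (x * y) + b > 0:
        w = sgd_step(w, (x, y))
```

The real method replaces the toy model with a pretrained language model and perplexity, and the hand-picked feature with learned features of the context, but the warm-up-then-filter structure is the same: exact information gain is only computed for the probe set used to train the secondary learner.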


Authors (4)

Richard Antonello
Nicole Beckage
Javier Turek
Alexander Huth

Citation Format

Antonello, R., Beckage, N., Turek, J., & Huth, A. (2020). Selecting Informative Contexts Improves Language Model Finetuning. arXiv:2005.00175. https://arxiv.org/abs/2005.00175

Journal Information
Year Published: 2020
Language: en
Source Database: arXiv
Access: Open Access ✓