
Filtered Corpus Training (FiCT) Shows that Language Models can Generalize from Indirect Evidence

Abhinav Patil, Jaap Jumelet, Yu Ying Chiu, Andy Lapastora, Peter Shen, Lexie Wang, Clevis Willrich, Shane Steinert-Threlkeld

Abstract

This paper introduces Filtered Corpus Training, a method that trains language models (LMs) on corpora with certain linguistic constructions filtered out from the training data, and uses it to measure the ability of LMs to perform linguistic generalization on the basis of indirect evidence. We apply the method to both LSTM and Transformer LMs (of roughly comparable size), developing filtered corpora that target a wide range of linguistic phenomena. Our results show that while transformers are better qua LMs (as measured by perplexity), both models perform equally and surprisingly well on linguistic generalization measures, suggesting that they are capable of generalizing from indirect evidence.
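
At its core, the method is a corpus-level intervention: detect every sentence containing a targeted construction and exclude it before training, so that any knowledge the trained LM shows of that construction must have been acquired from indirect evidence. Below is a minimal sketch of such a filtering step, assuming a toy regex detector; the pattern, function names, and example data are illustrative stand-ins, not the authors' implementation.

```python
import re
from typing import Iterable, Iterator

# Hypothetical stand-in for a linguistic filter: a crude regex meant
# to catch strings like "that John praised" (an object relative
# clause). The paper's actual filters target a wide range of
# linguistic phenomena; this pattern is only for illustration.
TARGET_CONSTRUCTION = re.compile(r"\bthat\s+\w+\s+\w+ed\b", re.IGNORECASE)

def filter_corpus(sentences: Iterable[str]) -> Iterator[str]:
    """Yield only sentences that lack the targeted construction.

    An LM trained on this filtered stream never sees the construction
    directly, so success on it at test time reflects generalization
    from indirect evidence.
    """
    for sentence in sentences:
        if not TARGET_CONSTRUCTION.search(sentence):
            yield sentence

corpus = [
    "The cat sat on the mat.",
    "The book that John praised was long.",
]
print(list(filter_corpus(corpus)))
# -> ['The cat sat on the mat.']
```

Whether the filter drops whole sentences (as above) or uses finer-grained criteria, the evaluation question stays the same: does a model trained on the filtered corpus still handle the held-out construction correctly?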

Authors (8)

Abhinav Patil
Jaap Jumelet
Yu Ying Chiu
Andy Lapastora
Peter Shen
Lexie Wang
Clevis Willrich
Shane Steinert-Threlkeld

Citation Format

Patil, A., Jumelet, J., Chiu, Y. Y., Lapastora, A., Shen, P., Wang, L., et al. (2024). Filtered Corpus Training (FiCT) Shows that Language Models can Generalize from Indirect Evidence. arXiv. https://arxiv.org/abs/2405.15750

Journal Information

Publication Year: 2024
Language: English (en)
Source Database: arXiv
Access: Open Access ✓