Transformer Grammars: Augmenting Transformer Language Models with Syntactic Inductive Biases at Scale
Abstract
We introduce Transformer Grammars (TGs), a novel class of Transformer language models that combine (i) the expressive power, scalability, and strong performance of Transformers and (ii) recursive syntactic compositions, which here are implemented through a special attention mask and a deterministic transformation of the linearized tree. We find that TGs outperform various strong baselines on sentence-level language modeling perplexity, as well as on multiple syntax-sensitive language modeling evaluation metrics. Additionally, we find that the recursive syntactic composition bottleneck, which represents each sentence as a single vector, harms perplexity on document-level language modeling, providing evidence that a different kind of memory mechanism -- one that is independent of composed syntactic representations -- plays an important role in current successful models of long text.
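The abstract's "special attention mask" over a linearized tree can be illustrated with a minimal sketch. The code below is a simplification under stated assumptions, not the paper's implementation: the action format ("(NP", "NP)") is hypothetical, and the published TG scheme (which duplicates each closing nonterminal and alternates between COMPOSE and STACK masks) is collapsed here into a single mask in which each closing nonterminal attends only to the constituent span it closes, while all other positions keep ordinary causal attention.

```python
# Hypothetical sketch of a composition-style attention mask over a
# linearized parse tree. Simplified relative to the TG paper: closing
# nonterminals are not duplicated, and no separate STACK mask is built.
import numpy as np

def composition_mask(actions):
    """Return a boolean attention mask for a sequence of tree actions.

    Each closing nonterminal (e.g. "NP)") may attend only to the tokens of
    the constituent it closes, mimicking recursive composition; every other
    token falls back to standard causal attention.
    """
    n = len(actions)
    mask = np.tril(np.ones((n, n), dtype=bool))  # causal baseline
    stack = []  # indices of currently open nonterminals
    for i, a in enumerate(actions):
        if a.startswith("("):            # opening nonterminal, e.g. "(NP"
            stack.append(i)
        elif a.endswith(")"):            # closing nonterminal, e.g. "NP)"
            start = stack.pop()
            row = np.zeros(n, dtype=bool)
            row[start:i + 1] = True      # attend only to the closed span
            mask[i] = row
    return mask

actions = ["(S", "(NP", "the", "dog", "NP)", "(VP", "barks", "VP)", "S)"]
print(composition_mask(actions).astype(int))
```

Running the example prints a 9x9 mask in which the rows for "NP)", "VP)", and "S)" are restricted to their own constituents, which is the composition constraint the abstract refers to; the remaining rows are lower-triangular as in a plain Transformer.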
Authors (6)
Laurent Sartran
Samuel Barrett
Adhiguna Kuncoro
Miloš Stanojević
Phil Blunsom
Chris Dyer
Quick Access
- Publication Year: 2022
- Language: en
- Database Source: arXiv
- Access: Open Access ✓