Semantic Scholar · Open Access · 2023 · 133 citations

Zoology: Measuring and Improving Recall in Efficient Language Models

Simran Arora, Sabri Eyuboglu, Aman Timalsina, Isys Johnson, Michael Poli, +3 others

Abstract

Attention-free language models that combine gating and convolutions are growing in popularity due to their efficiency and increasingly competitive performance. To better understand these architectures, we pretrain a suite of 17 attention and "gated-convolution" language models, finding that SoTA gated-convolution architectures still underperform attention by up to 2.1 perplexity points on the Pile. In fine-grained analysis, we find 82% of the gap is explained by each model's ability to recall information that is previously mentioned in-context, e.g. "Hakuna Matata means no worries Hakuna Matata it means no" → "??". On this task, termed "associative recall", we find that attention outperforms gated-convolutions by a large margin: a 70M parameter attention model outperforms a 1.4 billion parameter gated-convolution model on associative recall. This is surprising because prior work shows gated convolutions can perfectly solve synthetic tests for AR capability. To close the gap between synthetics and real language, we develop a new formalization of the task called multi-query associative recall (MQAR) that better reflects actual language. We perform an empirical and theoretical study of MQAR that elucidates differences in the parameter-efficiency of attention and gated-convolution recall. Informed by our analysis, we evaluate simple convolution-attention hybrids and show that hybrids with input-dependent sparse attention patterns can close 97.4% of the gap to attention, while maintaining sub-quadratic scaling. Our code is accessible at: https://github.com/HazyResearch/zoology.
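The "Hakuna Matata" example in the abstract can be made concrete with a small Python sketch. It is not taken from the authors' repository; it only shows what a perfect associative-recall lookup would return for that prompt. The function name `recall_next` and the whitespace tokenization are assumptions made for this illustration.

```python
def recall_next(tokens):
    """Return the token that followed the most recent earlier occurrence
    of the final token, or None if the final token has not appeared before."""
    query = tokens[-1]
    # Scan backwards over earlier positions, skipping the final token itself.
    for i in range(len(tokens) - 2, -1, -1):
        if tokens[i] == query:
            return tokens[i + 1]
    return None


# Hypothetical tokenization: split the abstract's example on whitespace.
prompt = "Hakuna Matata means no worries Hakuna Matata it means no".split()
prediction = recall_next(prompt)
print(prediction)  # prints "worries"
```

The lookup finds the earlier "no" and returns the token that followed it. A language model has to learn this behaviour from data, which is the capability the abstract reports as accounting for most of the perplexity gap.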

Authors (8)

Simran Arora
Sabri Eyuboglu
Aman Timalsina
Isys Johnson
Michael Poli
James Zou
Atri Rudra
Christopher Ré

Citation Format

Arora, S., Eyuboglu, S., Timalsina, A., Johnson, I., Poli, M., Zou, J. et al. (2023). Zoology: Measuring and Improving Recall in Efficient Language Models. https://doi.org/10.48550/arXiv.2312.04927

Journal Information

Publication year: 2023
Language: en
Total citations: 133
Source database: Semantic Scholar
DOI: 10.48550/arXiv.2312.04927
Access: Open Access