Explaining neural scaling laws
Abstract
Significance: The population loss of trained deep neural networks has been empirically observed to improve as a power law in a variety of large models and datasets. We investigate the origins behind such "scaling laws" and provide a taxonomy for different scaling regimes. Our findings are based on derivations in linear random feature models, which, in addition to being a simple yet fruitful model, also describe the wide-network limit of deep neural networks. We further formulate and verify aspects of scaling based on smoothness in interpolating a data manifold. We support our theory with empirical results in realistic settings. Our work provides insights into scaling laws and bridges the large gap between theory and experiment in modern deep learning.
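To make the power-law scaling form mentioned in the abstract concrete, the sketch below fits a relation of the form L(D) ≈ a · D^(−α) (loss versus dataset size) by linear regression in log-log space. The dataset sizes, noise level, and exponent values are synthetic placeholders for illustration only, not results or code from the paper.

```python
import numpy as np

# Hypothetical demo: a power law L(D) = a * D**(-alpha) is linear in log-log
# space, log L = log a - alpha * log D, so a least-squares line fit to
# (log D, log L) recovers the scaling exponent alpha.

rng = np.random.default_rng(0)
dataset_sizes = np.logspace(3, 7, 20)            # assumed range: 1e3 .. 1e7 examples
true_alpha, true_a = 0.35, 12.0                  # assumed exponent and prefactor
losses = true_a * dataset_sizes ** (-true_alpha)
losses *= np.exp(0.02 * rng.standard_normal(dataset_sizes.size))  # small multiplicative noise

slope, intercept = np.polyfit(np.log(dataset_sizes), np.log(losses), deg=1)
print(f"fitted alpha = {-slope:.3f}, fitted prefactor a = {np.exp(intercept):.3f}")
```

Fitting in log-log space is a common way such exponents are estimated in scaling-law studies; the specific values printed here come from the synthetic data above.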
Authors (5)
Yasaman Bahri
Ethan Dyer
J. Kaplan
Jaehoon Lee
Utkarsh Sharma
Quick Access
- Publication Year: 2021
- Language: en
- Total Citations: 424
- Database Source: Semantic Scholar
- DOI: 10.1073/pnas.2311878121
- Access: Open Access ✓