Semantic Scholar · Open Access · 2021 · 424 citations

Explaining neural scaling laws

Yasaman Bahri, Ethan Dyer, J. Kaplan, Jaehoon Lee, Utkarsh Sharma

Abstract

Significance: The population loss of trained deep neural networks has been empirically observed to improve as a power law in a variety of large models and datasets. We investigate the origins behind such “scaling laws” and provide a taxonomy for different scaling regimes. Our findings are based on derivations in linear random feature models—which, in addition to being a simple, fruitful model, also describe the wide network limit of deep neural networks. We further formulate and verify aspects of scaling based on smoothness in interpolating a data manifold. We support our theory with empirical results in realistic settings. Our work provides insights into scaling laws and bridges the large gap between theory and experiment in modern deep learning.
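As an illustration only (not code from the paper), the power-law behavior described in the abstract is commonly summarized as L(N) ≈ a·N^(−α) + c, where N is the dataset (or model) size and c an irreducible loss. The short sketch below fits that form to made-up loss values with SciPy; all numbers are hypothetical and serve only to show how such a scaling exponent would be estimated.

import numpy as np
from scipy.optimize import curve_fit

# Hypothetical loss measurements at several dataset sizes N (illustrative only).
N = np.array([1e3, 3e3, 1e4, 3e4, 1e5])
loss = np.array([0.90, 0.62, 0.41, 0.28, 0.19])

def power_law(n, a, alpha, c):
    # L(N) ≈ a * N^(-alpha) + c: power-law decay toward an irreducible loss c.
    return a * n ** (-alpha) + c

params, _ = curve_fit(power_law, N, loss, p0=[10.0, 0.3, 0.05])
a, alpha, c = params
print(f"fitted exponent alpha ≈ {alpha:.2f}, irreducible loss c ≈ {c:.3f}")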

Authors (5)

Yasaman Bahri
Ethan Dyer
J. Kaplan
Jaehoon Lee
Utkarsh Sharma

Citation Format

Bahri, Y., Dyer, E., Kaplan, J., Lee, J., Sharma, U. (2021). Explaining neural scaling laws. https://doi.org/10.1073/pnas.2311878121

Quick Access

View at source: doi.org/10.1073/pnas.2311878121
Journal Information
Publication Year: 2021
Language: English (en)
Total Citations: 424
Database Source: Semantic Scholar
DOI: 10.1073/pnas.2311878121
Access: Open Access ✓