Semantic Scholar Open Access 2017 947 citations

Deep Learning Scaling is Predictable, Empirically

Joel Hestness, Sharan Narang, Newsha Ardalani, G. Diamos, Heewoo Jun, +4 others

Abstract

Deep learning (DL) creates impactful advances following a virtuous recipe: model architecture search, creating large training data sets, and scaling computation. It is widely believed that growing training sets and models should improve accuracy and result in better products. As DL application domains grow, we would like a deeper understanding of the relationships between training set size, computational scale, and model accuracy improvements to advance the state-of-the-art. This paper presents a large scale empirical characterization of generalization error and model size growth as training sets grow. We introduce a methodology for this measurement and test four machine learning domains: machine translation, language modeling, image processing, and speech recognition. Our empirical results show power-law generalization error scaling across a breadth of factors, resulting in power-law exponents---the "steepness" of the learning curve---yet to be explained by theoretical work. Further, model improvements only shift the error but do not appear to affect the power-law exponent. We also show that model size scales sublinearly with data size. These scaling relationships have significant implications on deep learning research, practice, and systems. They can assist model debugging, setting accuracy targets, and decisions about data set growth. They can also guide computing system design and underscore the importance of continued computational scaling.
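The power-law relationship described in the abstract can be illustrated with a small fitting sketch. The abstract does not state a functional form, so this example assumes the simplest one, error ≈ alpha * m^beta with a negative exponent beta and no constant offset, where m is the training set size. The function names (`fit_power_law`, `predict_error`, `samples_for_target`) and the synthetic data are hypothetical and are not taken from the paper.

```python
import math


def fit_power_law(sizes, errors):
    """Fit error = alpha * size**beta by least squares in log-log space.

    Assumes a pure power law (no irreducible-error offset), which is a
    simplification made for this sketch.
    """
    xs = [math.log(m) for m in sizes]
    ys = [math.log(e) for e in errors]
    n = len(xs)
    x_mean = sum(xs) / n
    y_mean = sum(ys) / n
    sxx = sum((x - x_mean) ** 2 for x in xs)
    sxy = sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, ys))
    beta = sxy / sxx                      # slope: the power-law exponent
    alpha = math.exp(y_mean - beta * x_mean)  # intercept, mapped back from log space
    return alpha, beta


def predict_error(alpha, beta, size):
    """Extrapolate the fitted curve to a new training set size."""
    return alpha * size ** beta


def samples_for_target(alpha, beta, target_error):
    """Invert the fitted curve: training set size needed to reach target_error."""
    return (target_error / alpha) ** (1.0 / beta)


# Synthetic, noise-free measurements (illustrative only, not the paper's data).
sizes = [2 ** k for k in range(10, 20)]
errors = [5.0 * m ** -0.3 for m in sizes]

alpha, beta = fit_power_law(sizes, errors)
print(f"alpha={alpha:.4f}, beta={beta:.4f}")
print(f"predicted error at 10x the largest set: {predict_error(alpha, beta, 10 * sizes[-1]):.4f}")
```

Under these assumptions, `samples_for_target` gives a rough estimate of how much data a given accuracy target would require, which is the kind of planning decision the abstract mentions. Real learning curves are noisy and may flatten at small or very large data sizes, so a fit like this is best restricted to the region where the log-log plot is close to a straight line.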

Topics & Keywords

Mathematics, Computer Science

Authors (9)

Joel Hestness

Sharan Narang

Newsha Ardalani

G. Diamos

Heewoo Jun

Hassan Kianinejad

Md. Mostofa Ali Patwary

Yang Yang

Yanqi Zhou

Citation Format

Hestness, J., Narang, S., Ardalani, N., Diamos, G., Jun, H., Kianinejad, H. et al. (2017). Deep Learning Scaling is Predictable, Empirically. https://www.semanticscholar.org/paper/a1c922be467d1c0c64b963e65dae41778b81b2a0

Journal Information

Year Published: 2017
Language: en
Total Citations: 947
Source Database: Semantic Scholar
Access: Open Access ✓