Semantic Scholar Open Access 2017 799 citations

SMASH: One-Shot Model Architecture Search through HyperNetworks

Andrew Brock Theodore Lim J. Ritchie Nick Weston

Abstract

Designing architectures for deep neural networks requires expert knowledge and substantial computation time. We propose a technique to accelerate architecture selection by learning an auxiliary HyperNet that generates the weights of a main model conditioned on that model's architecture. By comparing the relative validation performance of networks with HyperNet-generated weights, we can effectively search over a wide range of architectures at the cost of a single training run. To facilitate this search, we develop a flexible mechanism based on memory read-writes that allows us to define a wide range of network connectivity patterns, with ResNet, DenseNet, and FractalNet blocks as special cases. We validate our method (SMASH) on CIFAR-10 and CIFAR-100, STL-10, ModelNet10, and Imagenet32x32, achieving competitive performance with similarly-sized hand-designed networks. Our code is available at this https URL
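The idea in the abstract (one auxiliary model generates weights for many candidate architectures, which are then ranked by validation performance) can be illustrated with a deliberately tiny sketch. Everything here is an assumption made for illustration, not the authors' implementation: the "architecture" is just a binary mask over input features, the "HyperNet" is a single linear map `H`, and the names `generate_weights`, `train_hypernet`, and `rank_architectures` are hypothetical. The paper's actual HyperNet and architecture encoding are not reproduced.

```python
import itertools
import numpy as np


def generate_weights(H, c):
    """Toy HyperNet: map an architecture encoding c to main-model weights."""
    return H @ c


def loss_and_grad(H, c, X, y):
    """MSE of the main model under generated weights, and its gradient w.r.t. H."""
    Xc = X * c                          # the architecture masks input features
    w = generate_weights(H, c)
    err = Xc @ w - y
    grad_w = 2.0 * Xc.T @ err / len(y)  # dLoss/dw
    return float(np.mean(err ** 2)), np.outer(grad_w, c)  # chain rule: w = H @ c


def train_hypernet(X, y, dim, steps=2000, lr=0.01, seed=0):
    """Single training run: a random architecture is sampled at every step."""
    rng = np.random.default_rng(seed)
    H = np.zeros((dim, dim))
    for _ in range(steps):
        c = rng.integers(0, 2, size=dim).astype(float)
        _, grad_H = loss_and_grad(H, c, X, y)
        H -= lr * grad_H
    return H


def rank_architectures(H, candidates, X_val, y_val):
    """Score candidates by validation loss with generated weights (lower is better)."""
    scored = [
        (loss_and_grad(H, np.asarray(c, dtype=float), X_val, y_val)[0], tuple(c))
        for c in candidates
    ]
    return sorted(scored)


# Synthetic data (assumed): only features 0 and 1 carry signal.
rng = np.random.default_rng(0)
X = rng.normal(size=(400, 4))
y = 2.0 * X[:, 0] - X[:, 1] + 0.1 * rng.normal(size=400)
X_train, y_train, X_val, y_val = X[:300], y[:300], X[300:], y[300:]

H = train_hypernet(X_train, y_train, dim=4)
candidates = list(itertools.product([0, 1], repeat=4))
ranking = rank_architectures(H, candidates, X_val, y_val)
best_loss, best_mask = ranking[0]
print(best_mask, round(best_loss, 4))
```

In this toy setting, all 16 candidate masks are scored from one trained `H`, without training any candidate separately. In the method described above, the top-ranked architecture would then be trained normally with its own weights; that step is omitted here.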

Authors (4)

Andrew Brock

Theodore Lim

J. Ritchie

Nick Weston

Citation Format

Brock, A., Lim, T., Ritchie, J., Weston, N. (2017). SMASH: One-Shot Model Architecture Search through HyperNetworks. https://www.semanticscholar.org/paper/e56b10f7cd4bf037beac84da5925dc4544fab974

Journal Information
Publication Year: 2017
Language: en
Total Citations: 799
Source Database: Semantic Scholar
Access: Open Access