arXiv Open Access 2020

Deep orthogonal linear networks are shallow

Pierre Ablin

Abstract

We consider the problem of training a deep orthogonal linear network, which consists of a product of orthogonal matrices with no non-linearity in between. We show that training the weights with Riemannian gradient descent is equivalent to training the whole factorization by gradient descent. This means that overparametrization and implicit bias have no effect at all in this setting: training such a deep, overparametrized network is exactly equivalent to training a one-layer shallow network.
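The equivalence stated in the abstract can be checked numerically on a small example. The sketch below is not taken from the paper. It assumes a least-squares loss f(W) = 0.5 * ||W - T||^2 with an arbitrary target T, the exponential-map update W <- exp(-lr * skew(G W^T)) W as the Riemannian gradient step, and a single-matrix step size equal to depth times the per-factor step size. All function names and numerical values are illustrative.

```python
import numpy as np


def skew(a):
    """Skew-symmetric part of a square matrix."""
    return 0.5 * (a - a.T)


def expm_skew(s):
    """Matrix exponential of a real skew-symmetric matrix.

    1j * s is Hermitian, so an eigendecomposition gives exp(s) directly.
    """
    lam, v = np.linalg.eigh(1j * s)
    return ((v * np.exp(-1j * lam)) @ v.conj().T).real


def product(factors, p):
    """Ordered product W_1 @ W_2 @ ... (identity for an empty list)."""
    out = np.eye(p)
    for f in factors:
        out = out @ f
    return out


def rgd_step(w, g, lr):
    """One Riemannian gradient step on the orthogonal group (assumed form).

    g is the Euclidean gradient at w; the update is exp(-lr * skew(g w^T)) w.
    """
    return expm_skew(-lr * skew(g @ w.T)) @ w


def deep_step(factors, target, lr):
    """Apply rgd_step to every factor of the product simultaneously."""
    p = target.shape[0]
    g = product(factors, p) - target  # gradient of the loss w.r.t. the product
    updated = []
    for i, w_i in enumerate(factors):
        left = product(factors[:i], p)
        right = product(factors[i + 1:], p)
        g_i = left.T @ g @ right.T  # chain rule: gradient w.r.t. factor i
        updated.append(rgd_step(w_i, g_i, lr))
    return updated


rng = np.random.default_rng(0)
p, depth, lr, n_steps = 4, 5, 0.01, 20  # illustrative values, not from the paper
target = rng.standard_normal((p, p))
factors = [np.linalg.qr(rng.standard_normal((p, p)))[0] for _ in range(depth)]
shallow = product(factors, p)  # same starting point for the single matrix

for _ in range(n_steps):
    factors = deep_step(factors, target, lr)
    shallow = rgd_step(shallow, shallow - target, depth * lr)

deep = product(factors, p)
max_gap = np.abs(deep - shallow).max()
print("largest entrywise gap between deep product and shallow matrix:", max_gap)
```

Under these assumptions, the product of the updated factors should agree with the single-matrix iterate up to floating-point rounding at every step, which is one concrete reading of "deep is shallow" in this setting.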

Topics & Keywords

stat.ML cs.LG

Authors (1)

Pierre Ablin

Citation Format

Ablin, P. (2020). Deep orthogonal linear networks are shallow. https://arxiv.org/abs/2011.13831

Journal Information

Publication Year: 2020
Language: en
Source Database: arXiv
Access: Open Access ✓