arXiv Open Access 2024

Dynamic Universal Approximation Theory: Foundations for Parallelism in Neural Networks

Wei Wang, Qing Li

Abstract

Neural networks are increasingly evolving towards training large models with big data, a method that has demonstrated superior performance across many tasks. However, this approach introduces an urgent problem: current deep learning models are predominantly serial, meaning that as the number of network layers increases, so do the training and inference times. This is unacceptable if deep learning is to continue advancing. Therefore, this paper proposes a deep learning parallelization strategy based on the Universal Approximation Theorem (UAT). From this foundation, we designed a parallel network called Para-Former to test our theory. Unlike traditional serial models, the inference time of Para-Former does not increase with the number of layers, significantly accelerating the inference speed of multi-layer networks. Experimental results validate the effectiveness of this network.

Topics & Keywords

cs.LG cs.AI

Authors (2)

Wei Wang

Qing Li

Citation Format

Wang, W., & Li, Q. (2024). Dynamic Universal Approximation Theory: Foundations for Parallelism in Neural Networks. https://arxiv.org/abs/2407.21670

Journal Information

Publication Year: 2024
Language: en
Database Source: arXiv
Access: Open Access ✓