arXiv Open Access 2023

RACE-IT: A Reconfigurable Analog Computing Engine for In-Memory Transformer Acceleration

Lei Zhao, Aishwarya Natarajan, Luca Buonanno, Archit Gajjar, Ron M. Roth, +4 others

Abstract

Transformer models represent the cutting edge of Deep Neural Networks (DNNs) and excel in a wide range of machine learning tasks. However, processing these models demands significant computational resources and results in a substantial memory footprint. While In-Memory Computing (IMC) offers promise for accelerating Vector-Matrix Multiplications (VMMs) with high computational parallelism and minimal data movement, employing it for other crucial DNN operators remains a formidable task. This challenge is exacerbated by the extensive use of complex activation functions, Softmax, and data-dependent matrix multiplications (DMMuls) within Transformer models. To address this challenge, we introduce a Reconfigurable Analog Computing Engine (RACE) by enhancing Analog Content Addressable Memories (ACAMs) to support broader operations. Based on the RACE, we propose the RACE-IT accelerator (RACE for In-memory Transformers) to enable efficient analog-domain execution of all core operations of Transformer models. Given the flexibility of the proposed RACE in supporting arbitrary computations, RACE-IT is well-suited for adapting to emerging and non-traditional DNN architectures without requiring hardware modifications. We compare RACE-IT with various accelerators. Results show that RACE-IT increases performance by 453x and 15x, and reduces energy by 354x and 122x, over state-of-the-art GPUs and existing Transformer-specific IMC accelerators, respectively.
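To make the operator classes named in the abstract concrete, below is a minimal NumPy sketch (not from the paper; all dimensions and variable names are illustrative) of a single attention head. It separates the static-weight VMMs, which map naturally onto IMC crossbars, from the Softmax and data-dependent matrix multiplications (DMMuls) that conventional IMC handles poorly and that RACE targets in the analog domain.

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable Softmax: one of the operators the abstract
    # notes is difficult to map onto conventional IMC crossbars.
    z = x - x.max(axis=axis, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=axis, keepdims=True)

def attention(x, Wq, Wk, Wv):
    # Static-weight VMMs: Wq/Wk/Wv are fixed after training, so these
    # products map naturally onto IMC crossbar arrays.
    Q, K, V = x @ Wq, x @ Wk, x @ Wv
    # DMMuls: both operands (Q and K, then scores and V) are produced
    # at run time, which is costly for conventional IMC because
    # activations would have to be written into the memory arrays.
    scores = softmax(Q @ K.T / np.sqrt(K.shape[-1]))
    return scores @ V

# Illustrative dimensions (hypothetical, not from the paper).
rng = np.random.default_rng(0)
seq_len, d_model = 8, 16
x = rng.standard_normal((seq_len, d_model))
Wq, Wk, Wv = (rng.standard_normal((d_model, d_model)) for _ in range(3))
print(attention(x, Wq, Wk, Wv).shape)  # (8, 16)
```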

Topics & Keywords

Authors (9)

Lei Zhao
Aishwarya Natarajan
Luca Buonanno
Archit Gajjar
Ron M. Roth
Sergey Serebryakov
John Moon
Jim Ignowski
Giacomo Pedretti

Citation Format

Zhao, L., Natarajan, A., Buonanno, L., Gajjar, A., Roth, R. M., Serebryakov, S., et al. (2023). RACE-IT: A Reconfigurable Analog Computing Engine for In-Memory Transformer Acceleration. arXiv. https://arxiv.org/abs/2312.06532

Quick Access

View at Source

Journal Information
Publication Year: 2023
Language: en
Source Database: arXiv
Access: Open Access ✓