Semantic Scholar · Open Access · 2020 · 2 citations

Special issue on SoC and AI processors

Ji-Hoon Kim, Minjae Lee, Jongsun Park, H. Cha

Abstract

Artificial Intelligence (AI) has evolved into a general-purpose technology and has been applied in all aspects of the economy and society. It is already extensively used in fields including medical services, finance, security, education, transportation, and logistics, and has led to the emergence of new commercial activities, business models, and game-changing product applications. AI is a driving force of economic and social development at the forefront of the technological revolution and industrial transformation. Additionally, the System-on-a-Chip (SoC) plays a vital role in post-PC-era products such as smartphones, tablets, and wearable devices, where form factor, cost, and energy efficiency are critical drivers. An SoC contains multiple processing parts, such as the central processing unit (CPU), graphics processing unit (GPU), image processing unit (IPU), digital signal processor (DSP), video encoder/decoder, modems, and neural processing unit (NPU). Specifically, AI processors, another name for NPUs, are optimized for the mathematics and algorithms commonly used by neural networks; they can run neural networks and machine learning tasks faster and more efficiently than CPUs.

In this special issue, we have selected papers that represent the current state of the art in AI processors as well as in essential SoC blocks used in radar, RF/analog, hardware security, and design methodology.

The first paper, "40TFLOPS Artificial Intelligence Processor with Function-safe Programmable Many-Cores for ISO26262 ASIL-D" by Jinho Han et al., presents an AI processor architecture that delivers high throughput for accelerating neural networks while reducing the external memory bandwidth they require.
For high throughput, the proposed super thread core (STC) includes 128 × 128 nano cores operating at a 1.2 GHz clock frequency, and a general-purpose processor (GPP) core is integrated to control the STC and process AI algorithms. For functional safety, which is critical in automotive systems, various microarchitectural techniques are adopted, including a self-recovering cache and a dynamic lockstep (DLS) function, to achieve the ASIL-D fault-tolerance level of the ISO 26262 standard. The entire AI processor, fabricated in a 28-nm CMOS process, yields a peak performance of up to 40 TFLOPS at a 1.2 GHz operating frequency and 1.1 V supply voltage, with a measured energy efficiency of 1.3 TOPS/W and an ISO 26262 ASIL-D-compliant single-point fault-tolerance rate of 99.64%.

The next paper, "An impulse radio (IR) radar SoC for through-the-wall human-detection applications" by Piljae Park et al., proposes a through-the-wall radar (TTWR) SoC and its architecture, together with test standards and methods; the radar can be used at disaster scenes where visibility is limited by smoke, walls, and collapsed debris. Additive reception based on coherent clocks, together with reconfigurability, fulfills the demands of the TTWR, and a clock-based single-chip IR radar transceiver is implemented in 130-nm CMOS technology. By utilizing repetitive coherent clock schemes, the proposed SoC achieves signal-to-noise ratio (SNR) enhancement. Furthermore, the paper presents test results from a hand-held prototype radar with the proposed TTWR SoC operating in real time under various pseudo-disaster conditions.

The third paper, "AB9: a Neural Processor for Inference Acceleration" by Yong Cheol Peter Cho et al., presents a neural processor for inference acceleration built around a systolic tensor core (STC) that exploits the data-reuse and parallelism characteristics inherent in neural networks while providing fast access to large on-chip memory.
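The peak-performance figure quoted for the first paper's super thread core is consistent with its stated core count and clock. A quick sanity check, under the assumption (not stated in the abstract) that each nano core performs one fused multiply-add, i.e. two floating-point operations, per cycle:

```python
# Sanity check: peak throughput of the 128 x 128 nano-core STC at 1.2 GHz.
# Assumption: one fused multiply-add (2 FLOPs) per nano core per cycle.
cores = 128 * 128          # 16,384 nano cores
clock_hz = 1.2e9           # 1.2 GHz clock frequency
flops_per_cycle = 2        # one FMA = 2 FLOPs (assumption)

peak_tflops = cores * clock_hz * flops_per_cycle / 1e12
print(f"{peak_tflops:.1f} TFLOPS")  # → 39.3 TFLOPS, i.e. ~the quoted 40 TFLOPS
```

The ~39.3 TFLOPS result matches the rounded 40 TFLOPS headline figure, which suggests the FMA-per-cycle assumption is the right reading.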
AB9 shows superior performance and power efficiency compared with a general-purpose GPU (GPGPU) for YOLOv2, and has been fabricated in a 28-nm CMOS process with a 40 TFLOPS STC that includes 32 k arithmetic units and over 36 MB of on-chip SRAM.

To alleviate the heavy computational and memory burdens of deep neural networks, the following paper, "Automated Optimization for Memory-efficient High Performance Deep Neural Network Accelerators" by Hyun Mi Kim et al., investigates an efficient memory structure and operating scheme that, along with dataflow control, provides an intuitive solution for high-performance accelerators. The authors propose an efficient, flexible architecture that operates at high frequency despite the large memory size and PE array. They demonstrate an improvement in the efficiency and usability of the architecture by presenting
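The repetitive coherent clock scheme of the TTWR paper above rests on a general principle: coherently averaging N time-aligned captures of the same pulse leaves the signal intact while the uncorrelated noise averages down, improving SNR by roughly 10·log10(N) dB. A minimal numpy sketch of that principle (an illustration only, not the paper's actual transceiver):

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic "received pulse": a clean Gaussian echo plus white noise.
n_samples = 256
t = np.arange(n_samples)
pulse = np.exp(-0.5 * ((t - 128) / 8.0) ** 2)  # clean reference echo

def snr_db(clean, noisy):
    """SNR of a noisy capture relative to the known clean signal."""
    noise = noisy - clean
    return 10 * np.log10(np.sum(clean ** 2) / np.sum(noise ** 2))

N = 64  # number of coherently averaged captures
captures = pulse + rng.normal(0.0, 1.0, size=(N, n_samples))

single = snr_db(pulse, captures[0])             # one noisy capture
averaged = snr_db(pulse, captures.mean(axis=0))  # coherent average of N
print(f"single: {single:.1f} dB, averaged over {N}: {averaged:.1f} dB")
# expected gain ≈ 10*log10(64) ≈ 18 dB
```

The observed gain lands near the theoretical 18 dB for N = 64; doubling the number of repetitive acquisitions buys another ~3 dB, which is why the repetition rate is a key design knob for such radars.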


Authors (4)

Ji-Hoon Kim
Minjae Lee
Jongsun Park
H. Cha

Citation Format

Kim, J., Lee, M., Park, J., & Cha, H. (2020). Special issue on SoC and AI processors. ETRI Journal. https://doi.org/10.4218/etr2.12316

Quick Access

View at Source: doi.org/10.4218/etr2.12316
Journal Information
Publication Year
2020
Language
en
Total Citations
2
Source Database
Semantic Scholar
DOI
10.4218/etr2.12316
Access
Open Access ✓