Results for "cs.AR"

Showing 20 of ~184887 results · from CrossRef, arXiv

arXiv Open Access 2026
Annotated PIM Bibliography

Peter M. Kogge

Processing in Memory (PIM) and similar terms such as Compute In Memory (CIM), Logic in Memory (LIM), In Memory Computing (IMC), and Near Memory Computing (NMC) have gained attention recently as a potentially "revolutionary new" technique. The truth, however, is that many examples of the technology go back over 60 years. This document provides an annotated bibliography of PIM technology that attempts to cover the whole time frame and is organized to augment a forthcoming article.

en cs.AR
arXiv Open Access 2024
Floating Point HUB Adder for RISC-V Sargantana Processor

Gerardo Bandera, Javier Salamero, Miquel Moreto et al.

HUB format is an emerging technique to improve the hardware and time requirements when round-to-nearest is needed. On the other hand, RISC-V is an open-source ISA that many companies currently use in their designs. This paper presents a tailored floating point HUB adder implemented in the Sargantana RISC-V processor.

en cs.AR
arXiv Open Access 2024
Using a Performance Model to Implement a Superscalar CVA6

Côme Allart, Jean-Roch Coulon, André Sintzoff et al.

A performance model of the CVA6 RISC-V processor is built to evaluate performance-related modifications before implementing them in RTL. Its accuracy is 99.2% on CoreMark. This model is used to evaluate a superscalar feature for CVA6. During the design phase, the model helped detect and fix performance bugs. The superscalar feature resulted in a CVA6 performance improvement of 40% on CoreMark.

en cs.AR
CrossRef Open Access 2021
High-fidelity modelling of Cs-Ar and Cs-Xe exciplex pumped alkali lasers with temperature-dependent energy pooling and ionization reactions

David L Carroll, Peter M Maggs

Parametric measurements of pulsed output energy from the four-level exciplex pumped alkali laser (XPAL) for Cs-Ar, Cs-Kr, and Cs-Xe as a function of input pump energy and temperature show a strong dependence on temperature. All three Cs-rare gas mixtures show a $D_2$ line laser performance increase with temperature towards a peak efficiency, followed by a decrease as temperature is increased beyond a peak performance point temperature. Prior simulations of Cs-Ar XPAL measurements indicated that energy pooling from the $6^2P_{3/2}$ state of Cs was significant at higher temperature and it was hypothesized that the addition of temperature-dependent reaction rates may be important. This paper presents new BLAZE Multiphysics™ simulations using temperature-dependent energy pooling reaction rates baselined to available experimental rate data. Also included are photoionization and Penning ionization reactions. These new calculations for Cs-Ar and Cs-Xe (Cs-Kr not yet simulated) show that the inclusion of temperature-dependent energy pooling rates and the subsequent onset of significant ionization can explain the rise and fall of XPAL performance with temperature with reasonable accuracy. Further, while Cs-Xe has a much stronger absorption characteristic than Cs-Ar, simulations show that the energy well present in the Cs-Xe $B^2\Sigma_{1/2}^{+}$ state increases the fraction of the Cs-Xe B-state relative to the Cs-Ar B-state, thereby resulting in energy output levels of Cs-Xe similar to that of Cs-Ar.

2 citations en
arXiv Open Access 2021
How Flexible is Your Computing System

Shihua Huang, Luc Waeijen, Henk Corporaal

In the literature, computer architectures are frequently claimed to be highly flexible, typically implying there exist trade-offs between flexibility and performance or energy efficiency. Processor flexibility, however, is not very sharply defined, and as such these claims cannot be validated, nor can such hypothetical relations be fully understood and exploited in the design of computing systems. This paper is an attempt to introduce scientific rigour to the notion of flexibility in computing systems.

en cs.AR
arXiv Open Access 2021
Best CNTFET Ternary Adders?

Daniel Etiemble

The MUX implementation of ternary half adders and full adders using predecessor and successor functions leads to the most efficient implementation using the smallest transistor count. These designs are compared with the binary implementation of the corresponding half adders and full adders using the MUX technique or the typical complementary CMOS circuit style. The transistor count ratio between ternary and binary implementations is always greater than the information ratio ($\log_2(3)/\log_2(2) = 1.585$) between ternary and binary wires.

en cs.AR
arXiv Open Access 2020
How to extend the Single-Processor Paradigm to the Explicitly Many-Processor Approach

János Végh

The computing paradigm invented for processing a small amount of data on a single segregated processor cannot meet the challenges set by the present-day computing demands. The paper proposes a new computing paradigm (extending the old one to use several processors explicitly) and discusses some questions of its possible implementation. Some advantages of the implemented approach, illustrated with the results of a loosely-timed simulator, are presented.

en cs.AR
arXiv Open Access 2019
Coprocessors: failures and successes

Daniel Etiemble

The appearance of coprocessors, their disappearance through integration into the CPU, and their success or failure are examined by summarizing their characteristics, starting from the mainframes of the 1960s. The coprocessors most particularly reviewed are the IBM 360 and CDC-6600 I/O processors, the Intel 8087 math coprocessor, the Cell processor, the Intel Xeon Phi coprocessors, the GPUs, the FPGAs, and the coprocessors of the SW26010 and Pezy SC-2 manycore processors used in high-ranked supercomputers in the TOP500 or Green500. The conditions for a coprocessor to be viable in the medium or long term are defined.

en cs.AR
arXiv Open Access 2018
Trends in Processor Architecture

Antonio Gonzalez

This paper presents an overview of the main trends in processor architecture. It starts with an analysis of the past evolution of processors and the main driving forces behind it, and then it focuses on a description of the main architectural features of current processors. Finally, it presents a discussion on some promising directions for future evolution of processor architectures.

en cs.AR
arXiv Open Access 2018
Is Leakage Power a Linear Function of Temperature?

Hameedah Sultan, Shashank Varshney, Smruti R Sarangi

In this work, we present a study of the leakage power modeling techniques commonly used in the architecture community. We further provide an analysis of the error in leakage power estimation using the various modeling techniques. We strongly believe that this study will help researchers determine an appropriate leakage model to use in their work, based on the desired modeling accuracy and speed.

en cs.AR
arXiv Open Access 2016
GRVI Phalanx: A Massively Parallel RISC-V FPGA Accelerator Accelerator

Jan Gray

GRVI is an FPGA-efficient RISC-V RV32I soft processor. Phalanx is a parallel processor and accelerator array framework. Groups of processors and accelerators form shared memory clusters. Clusters are interconnected with each other and with extreme bandwidth I/O and memory devices by a 300-bit-wide Hoplite NOC. An example Kintex UltraScale KU040 system has 400 RISC-V cores, peak throughput of 100,000 MIPS, peak shared memory bandwidth of 600 GB/s, NOC bisection bandwidth of 700 Gbps, and uses 13 W.

en cs.AR
arXiv Open Access 2014
FPGA design of a cdma2000 turbo decoder

Maribell Sacanamboy Franco, Fabio G. Guerrero

This paper presents the FPGA hardware design of a turbo decoder for the cdma2000 standard. The work includes a study and mathematical analysis of the turbo decoding process, based on the MAX-Log-MAP algorithm. Results of decoding for a packet size of two hundred fifty bits are presented, as well as an analysis of area versus performance, and the key variables for hardware design in turbo decoding.

en cs.AR
arXiv Open Access 2011
SoC Software Components Diagnosis Technology

Svetlana Chumachenko, Wajeb Gharibi, Anna Hahanova et al.

A novel approach to evaluating hardware and software testability, represented in the form of a register transfer graph, is proposed. Examples of building software graph models for subsequent testing and diagnosis are shown.

arXiv Open Access 2010
Asynchronous logic circuits and sheaf obstructions

Michael Robinson

This article exhibits a particular encoding of logic circuits into a sheaf formalism. The central result of this article is that there exists strictly more information available to a circuit designer in this setting than exists in static truth tables, but less than exists in event-level simulation. This information is related to the timing behavior of the logic circuits, and thereby provides a "bridge" between static logic analysis and detailed simulation.

en cs.AR
