Gaze-adaptive neural pre-correction for mitigating spatially varying optical aberrations in near-eye displays
Yi Jiang, Ye Bi, Yinng Li
et al.
Near-eye display (NED) technology constitutes a fundamental component of head-mounted display (HMD) systems. The compact form factor required by HMDs imposes stringent constraints on optical design, often resulting in pronounced wavefront aberrations that significantly degrade visual fidelity. In addition, natural eye movements dynamically induce varying blur that further compromises image quality. To mitigate these challenges, a gaze-contingent neural network framework has been developed to compensate for aberrations within the foveal region. The network is trained in an end-to-end manner to minimize the discrepancy between the optically degraded system output and the corresponding ground truth image. A forward imaging model is employed, in which the network output is convolved with a spatially varying point spread function (PSF) to accurately simulate the degradation introduced by the optical system. To accommodate dynamic changes in gaze direction, a foveated attention-guided module is incorporated to adaptively modulate the pre-correction process, enabling localized compensation centered on the fovea. Additionally, an end-to-end trainable architecture has been designed to integrate gaze-informed blur priors. Both simulation and experimental validations confirm that the proposed method substantially reduces gaze-dependent aberrations and enhances retinal image clarity within the foveal region, while maintaining high computational efficiency. The presented framework offers a practical and scalable solution for improving visual performance in aberration-sensitive NED systems.
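The forward imaging model described above, in which the network output is convolved with a point spread function to simulate the optical degradation, can be sketched in a few lines. The 3x3 blur kernel and toy image below are illustrative assumptions, not the paper's actual spatially varying PSF.

```python
# Toy forward imaging model: convolve the (pre-corrected) image with a
# point spread function (PSF) to simulate optical degradation.
# The 3x3 kernel and 4x4 image are illustrative assumptions.

def convolve2d(image, psf):
    """Valid-mode 2D convolution of `image` with kernel `psf`."""
    ih, iw = len(image), len(image[0])
    kh, kw = len(psf), len(psf[0])
    out = []
    for y in range(ih - kh + 1):
        row = []
        for x in range(iw - kw + 1):
            acc = 0.0
            for j in range(kh):
                for i in range(kw):
                    # Kernel is flipped, as in true convolution.
                    acc += image[y + j][x + i] * psf[kh - 1 - j][kw - 1 - i]
            row.append(acc)
        out.append(row)
    return out

# A normalized Gaussian-like blur standing in for one local PSF patch.
psf = [[1/16, 2/16, 1/16],
       [2/16, 4/16, 2/16],
       [1/16, 2/16, 1/16]]
image = [[0, 0, 0, 0],
         [0, 1, 1, 0],
         [0, 1, 1, 0],
         [0, 0, 0, 0]]
degraded = convolve2d(image, psf)  # 2x2 simulated "retinal" image
```

In the end-to-end training described above, the network would be asked to output an image which, after this convolution, matches the ground truth.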
Computer engineering. Computer hardware, Electronic computers. Computer science
An information theoretic limit to data amplification
S J Watts, L Crow
In recent years generative artificial intelligence has been used to create data to support scientific analysis. For example, generative adversarial networks (GANs) have been trained using Monte Carlo simulated input and then used to generate data for the same problem. This has the advantage that a GAN creates data in a significantly reduced computing time. $N$ training events for a GAN can result in $NG$ generated events with the gain factor $G$ being greater than one. This appears to violate the principle that one cannot get information for free. This is not the only way to amplify data, so this process will be referred to as data amplification, which is studied using information theoretic concepts. It is shown that a gain greater than one is possible whilst keeping the information content of the data unchanged. This leads to a mathematical bound, $2\log(\text{Generated Events}) \geqslant 3\log(\text{Training Events})$, which only depends on the number of generated and training events. This study determined the conditions for both the underlying and reconstructed probability distributions to ensure this bound. In particular, the resolution of variables in amplified data is not improved by the process but the increase in sample size can still improve statistical significance. The bound was confirmed using computer simulation and analysis of GAN generated data from the literature.
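Because the quoted bound depends only on the two event counts, it is easy to check numerically. The sketch below simply encodes the inequality as stated; the example numbers are arbitrary, not values from the paper.

```python
import math

# Check of the stated bound, 2*log(generated events) >= 3*log(training
# events), equivalently (N*G)**2 >= N**3 for N training events and gain G.
# The example numbers are arbitrary.

def satisfies_bound(training_events, gain):
    generated = training_events * gain
    return 2 * math.log(generated) >= 3 * math.log(training_events)

ok = satisfies_bound(10_000, 150)     # 1.5e6 generated vs 1e4**1.5 = 1e6
not_ok = satisfies_bound(10_000, 50)  # 5e5 generated falls short of 1e6
```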
Computer engineering. Computer hardware, Electronic computers. Computer science
Machine Learning–Based Prediction of Organic Solar Cell Performance Using Molecular Descriptors
Mohammed Saleh Alshaikh
The performance of Organic Solar Cells (OSCs) is intrinsically linked to the molecular, electronic, and structural properties of donor and acceptor materials. This study employs various machine learning techniques, namely the Generalized Regression Neural Network (GRNN), Support Vector Machine (SVM), and Tree Boost, to predict key performance metrics of OSCs, including power conversion efficiency (PCE), short-circuit current density (JSC), open-circuit voltage (VOC), and fill factor (FF). The models are trained and evaluated using an experimentally reported dataset compiled by Sahu et al. Correlation analysis demonstrates that material characteristics such as polarizability, bandgap, dipole moment, and charge transfer are statistically associated with OSC performance. The predictive performance of the GRNN model is compared with that of the SVM and Tree Boost models, showing consistently lower prediction errors within the considered dataset. In addition, sensitivity analysis is performed to assess the relative importance of the predictor variables and to examine the influence of kernel functions on GRNN performance. The results indicate that machine learning models, particularly GRNN, can serve as effective data-driven tools for predicting the performance of organic solar cells and for supporting computational screening studies.
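A GRNN prediction is a kernel-weighted average of the training targets, which can be sketched compactly. The toy descriptor vectors, PCE values, and bandwidth below are invented for illustration and are not from the Sahu et al. dataset.

```python
import math

# Sketch of a GRNN prediction: a Gaussian-kernel-weighted average of the
# training targets. The toy descriptors (e.g. bandgap, dipole moment),
# PCE values, and bandwidth sigma are invented for illustration.

def grnn_predict(x, X_train, y_train, sigma=0.5):
    weights = []
    for xi in X_train:
        d2 = sum((a - b) ** 2 for a, b in zip(x, xi))
        weights.append(math.exp(-d2 / (2 * sigma ** 2)))
    return sum(w * y for w, y in zip(weights, y_train)) / sum(weights)

X_train = [[1.3, 2.1], [1.5, 1.8], [1.9, 3.0]]     # descriptor vectors
y_train = [6.2, 7.1, 4.8]                          # e.g. PCE in percent
pred = grnn_predict([1.4, 2.0], X_train, y_train)  # lies within [4.8, 7.1]
```

A small bandwidth makes the prediction collapse onto the nearest training point, which is one way the kernel choice mentioned in the sensitivity analysis shapes GRNN behaviour.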
Transportation engineering, Systems engineering
Distributed and heterogeneous tensor-vector contraction algorithms for high performance computing
Pedro J. Martinez-Ferrer, Albert-Jan Yzelman, Vicenç Beltran
The tensor-vector contraction (TVC) is the most memory-bound operation of its class and a core component of the higher-order power method (HOPM). This paper brings distributed-memory parallelization to a native TVC algorithm for dense tensors that remains oblivious to contraction mode, tensor splitting, and tensor order. Similarly, we propose a novel distributed HOPM, namely dHOPM3, that can save up to one order of magnitude of streamed memory and is about twice as costly in terms of data movement as a distributed TVC operation (dTVC) when using task-based parallelization. The numerical experiments carried out in this work on three different architectures featuring multi-core processors and accelerators confirm that the performance of dTVC and dHOPM3 remains relatively close to the peak system memory bandwidth (50%-80%, depending on the architecture) and on par with STREAM benchmark figures. In strong scalability scenarios, our native multi-core implementations of these two algorithms can achieve similar and sometimes even greater performance figures than those based upon state-of-the-art CUDA batched kernels. Finally, we demonstrate that both computation and communication can benefit from mixed precision arithmetic even in cases where the hardware does not support low precision data types natively.
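A mode-k tensor-vector contraction for a dense order-3 tensor can be written directly from its definition; the distributed-memory and task-based machinery of the paper is out of scope for this sketch.

```python
# Mode-k tensor-vector contraction (TVC) for a dense order-3 tensor
# stored as nested lists, written directly from the definition.

def tvc(tensor, vec, mode):
    """Contract an I x J x K tensor with `vec` along dimension `mode`."""
    I, J, K = len(tensor), len(tensor[0]), len(tensor[0][0])
    if mode == 0:   # result[j][k] = sum_i T[i][j][k] * v[i]
        return [[sum(tensor[i][j][k] * vec[i] for i in range(I))
                 for k in range(K)] for j in range(J)]
    if mode == 1:   # result[i][k] = sum_j T[i][j][k] * v[j]
        return [[sum(tensor[i][j][k] * vec[j] for j in range(J))
                 for k in range(K)] for i in range(I)]
    return [[sum(tensor[i][j][k] * vec[k] for k in range(K))  # mode 2
             for j in range(J)] for i in range(I)]

T = [[[1, 2], [3, 4]], [[5, 6], [7, 8]]]   # 2x2x2 tensor, entries 1..8
v = [1.0, 1.0]
m2 = tvc(T, v, 2)   # sums along the last dimension
```

The HOPM iterates such contractions over all modes, which is why a mode-oblivious TVC kernel with good memory behaviour matters.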
Quantum Software Engineering and Potential of Quantum Computing in Software Engineering Research: A Review
Ashis Kumar Mandal, Md Nadim, Chanchal K. Roy
et al.
Research in software engineering is essential for improving development practices, leading to reliable and secure software. Leveraging the principles of quantum physics, quantum computing has emerged as a new computational paradigm that offers significant advantages over classical computing. As quantum computing progresses rapidly, its potential applications across various fields are becoming apparent. In software engineering, many tasks involve complex computations, and quantum computers could greatly speed up the development process, leading to faster and more efficient solutions. With the growing use of quantum-based applications in different fields, quantum software engineering (QSE) has emerged as a discipline focused on designing, developing, and optimizing quantum software for diverse applications. This paper aims to review the role of quantum computing in software engineering research and the latest developments in QSE. To our knowledge, this is the first comprehensive review on this topic. We begin by introducing quantum computing, exploring its fundamental concepts, and discussing its potential applications in software engineering. We also examine various QSE techniques that expedite software development. Finally, we discuss the opportunities and challenges in quantum-driven software engineering and QSE. Our study reveals that quantum machine learning (QML) and quantum optimization have substantial potential to address classical software engineering tasks, though research in this area is still limited. Current QSE tools and techniques lack robustness and maturity, indicating a need for further attention. One of the main challenges is that quantum computing has yet to reach its full potential.
Insights into DeepSeek-V3: Scaling Challenges and Reflections on Hardware for AI Architectures
Chenggang Zhao, Chengqi Deng, Chong Ruan
et al.
The rapid scaling of large language models (LLMs) has unveiled critical limitations in current hardware architectures, including constraints in memory capacity, computational efficiency, and interconnection bandwidth. DeepSeek-V3, trained on 2,048 NVIDIA H800 GPUs, demonstrates how hardware-aware model co-design can effectively address these challenges, enabling cost-efficient training and inference at scale. This paper presents an in-depth analysis of the DeepSeek-V3/R1 model architecture and its AI infrastructure, highlighting key innovations such as Multi-head Latent Attention (MLA) for enhanced memory efficiency, Mixture of Experts (MoE) architectures for optimized computation-communication trade-offs, FP8 mixed-precision training to unlock the full potential of hardware capabilities, and a Multi-Plane Network Topology to minimize cluster-level network overhead. Building on the hardware bottlenecks encountered during DeepSeek-V3's development, we engage in a broader discussion with academic and industry peers on potential future hardware directions, including precise low-precision computation units, scale-up and scale-out convergence, and innovations in low-latency communication fabrics. These insights underscore the critical role of hardware and model co-design in meeting the escalating demands of AI workloads, offering a practical blueprint for innovation in next-generation AI systems.
Morescient GAI for Software Engineering (Extended Version)
Marcus Kessel, Colin Atkinson
The ability of Generative AI (GAI) technology to automatically check, synthesize and modify software engineering artifacts promises to revolutionize all aspects of software engineering. Using GAI for software engineering tasks is consequently one of the most rapidly expanding fields of software engineering research, with over a hundred LLM-based code models having been published since 2021. However, the overwhelming majority of existing code models share a major weakness - they are exclusively trained on the syntactic facet of software, significantly lowering their trustworthiness in tasks dependent on software semantics. To address this problem, a new class of "Morescient" GAI is needed that is "aware" of (i.e., trained on) both the semantic and static facets of software. This, in turn, will require a new generation of software observation platforms capable of generating large quantities of execution observations in a structured and readily analyzable way. In this paper, we present a vision and roadmap for how such "Morescient" GAI models can be engineered, evolved and disseminated according to the principles of open science.
Development of IoT Smart Greenhouse System for Hydroponic Gardens
Arcel Christian Austria, John Simon Fabros, Kurt Russel Sumilang
et al.
Computer engineering. Computer hardware, Information technology
uBrain: a unary brain computer interface
Di Wu, Jingjie Li, Zhewen Pan
et al.
Brain computer interfaces (BCIs) have been widely adopted to enhance human perception via brain signals with abundant spatial-temporal dynamics, such as the electroencephalogram (EEG). In recent years, BCI algorithms have been moving from classical feature engineering to emerging deep neural networks (DNNs), which can identify these spatial-temporal dynamics with improved accuracy. However, existing BCI architectures do not leverage such dynamics for hardware efficiency. In this work, we present uBrain, a unary computing BCI architecture for DNN models with cascaded convolutional and recurrent neural networks that achieves high task capability and hardware efficiency. uBrain co-designs the algorithm and hardware: the DNN architecture is optimized with customized unary operations, and the hardware architecture performs immediate signal processing after sensing. Experiments show that uBrain, with negligible accuracy loss, surpasses the CPU, systolic array, and stochastic computing baselines in on-chip power efficiency by 9.0×, 6.2×, and 2.0×, respectively.
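The unary computing idea underlying uBrain can be illustrated in a few lines: a value in [0, 1] is encoded as the density of 1s in a bitstream, and multiplication reduces to a bitwise AND of two independent streams. The stream length and seed below are arbitrary choices, and this sketch is not uBrain's actual hardware design.

```python
import random

# Unary/stochastic-computing sketch: a value in [0, 1] is the density of
# 1s in a bitstream; multiplying two values reduces to a bitwise AND of
# independent streams. Stream length and seed are arbitrary choices.

def encode(value, length, rng):
    return [1 if rng.random() < value else 0 for _ in range(length)]

def decode(stream):
    return sum(stream) / len(stream)

rng = random.Random(42)
n = 100_000
a, b = 0.8, 0.5
product = [x & y for x, y in zip(encode(a, n, rng), encode(b, n, rng))]
approx = decode(product)   # approaches a * b = 0.4 as n grows
```

Replacing multipliers with single AND gates is what makes such architectures attractive for on-chip power efficiency, at the cost of stream length and encoding noise.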
24 citations
Computer Science
A Review of Computer Microvision-Based Precision Motion Measurement: Principles, Characteristics, and Applications
Sheng Yao, Hai Li, Shuiquan Pang
et al.
Microengineering/nanoengineering is an emerging field that enables engineering and scientific discoveries in the microworld. As an effective and powerful tool for automation and manipulation at small scales, precision motion measurement by computer microvision is now broadly accepted and widely used in microengineering/nanoengineering. Unlike other measurement methods, vision-based techniques can intuitively visualize the measuring process with high interactivity, expansibility, and flexibility. This article presents a comprehensive survey of microvision-based motion measurement drawn from the field's collective experience. The working principles of microvision systems are first introduced and described, and the hardware configuration, model calibration, and motion measurement algorithms are systematically summarized. The characteristics and performance of different microvision-based methods are then analyzed and discussed in terms of measurement resolution, range, degrees of freedom, efficiency, and error sources. Recent advances in applications empowered by computer microvision-based methods are also presented. The review should be helpful to researchers engaged in the development of microvision-based techniques, and it presents the current state and trends for the research community of vision-based measurement, manipulation, and automation at the microscale/nanoscale.
47 citations
Computer Science
Darwin: a neuromorphic hardware co-processor based on Spiking Neural Networks
De Ma, Juncheng Shen, Zonghua Gu
et al.
238 citations
Computer Science
Reliability analysis of discrete-state performance functions via adaptive sequential sampling with detection of failure surfaces
Miroslav Vořechovský
The paper presents a new efficient and robust method for rare event probability estimation for computational models of an engineering product or a process returning categorical information only, for example, either success or failure. For such models, most of the methods designed for the estimation of failure probability, which use the numerical value of the outcome to compute gradients or to estimate the proximity to the failure surface, cannot be applied. Even if the performance function provides more than just binary output, the state of the system may be a non-smooth or even a discontinuous function defined in the domain of continuous input variables. In these cases, the classical gradient-based methods usually fail. We propose a simple yet efficient algorithm, which performs a sequential adaptive selection of points from the input domain of random variables to extend and refine a simple distance-based surrogate model. Two different tasks can be accomplished at any stage of sequential sampling: (i) estimation of the failure probability, and (ii) selection of the best possible candidate for the subsequent model evaluation if further improvement is necessary. The proposed criterion for selecting the next point for model evaluation maximizes the expected probability classified by using the candidate. Therefore, the perfect balance between global exploration and local exploitation is maintained automatically. The method can estimate the probabilities of multiple failure types. Moreover, when the numerical value of model evaluation can be used to build a smooth surrogate, the algorithm can accommodate this information to increase the accuracy of the estimated probabilities. Lastly, we define a new simple yet general geometrical measure of the global sensitivity of the rare-event probability to individual variables, which is obtained as a by-product of the proposed algorithm.
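The distance-based surrogate idea above can be sketched as follows: points already evaluated by a model that returns only categorical states act as a 1-nearest-neighbour classifier, and the failure probability is estimated by classifying Monte Carlo samples. The limit state, the evaluated points, and the sample count below are invented stand-ins, and the paper's sequential point-selection criterion is omitted.

```python
import random

# Distance-based surrogate sketch: points already evaluated by a model
# that returns only categorical states act as a 1-nearest-neighbour
# classifier; failure probability is then estimated by classifying Monte
# Carlo samples. The limit state and evaluated points are invented.

def model(x):                      # categorical output only
    return "fail" if x[0] + x[1] > 1.5 else "success"

evaluated = [((0.0, 0.0), "success"), ((0.5, 0.5), "success"),
             ((1.0, 1.0), "fail"), ((1.5, 0.5), "fail")]

def classify(x):
    nearest = min(evaluated,
                  key=lambda p: (p[0][0] - x[0]) ** 2 + (p[0][1] - x[1]) ** 2)
    return nearest[1]

rng = random.Random(0)
samples = [(rng.random() * 2, rng.random() * 2) for _ in range(20_000)]
pf_surrogate = sum(classify(s) == "fail" for s in samples) / len(samples)
pf_true = sum(model(s) == "fail" for s in samples) / len(samples)  # ~0.719
```

No gradient of the performance function is ever needed: only distances to already-classified points, which is exactly what makes the approach applicable to binary or discrete-state models.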
OneQ: A Compilation Framework for Photonic One-Way Quantum Computation
Hezi Zhang, Anbang Wu, Yuke Wang
et al.
In this paper, we propose OneQ, the first optimizing compilation framework for one-way quantum computation towards realistic photonic quantum architectures. Unlike previous compilation efforts for solid-state qubit technologies, our innovative framework addresses a unique set of challenges in photonic quantum computing. Specifically, this includes the dynamic generation of qubits over time, the need to perform all computation through measurements instead of relying on 1-qubit and 2-qubit gates, and the fact that photons are instantaneously destroyed after measurements. As pioneers in this field, we demonstrate the vast optimization potential of photonic one-way quantum computing, showcasing the remarkable ability of OneQ to reduce computing resource requirements by orders of magnitude.
Bringing Green Software to Computer Science Curriculum: Perspectives from Researchers and Educators
J. Saraiva, Ziliang Zong, Rui Pereira
Only recently has the software engineering community started conducting research on developing energy-efficient, or green, software. This effort is overshadowed by the research already produced in the computer hardware community. While research in green software is rapidly increasing, several recent studies with software engineers show that they still lack the techniques, knowledge, and tools to develop greener software. Indeed, all such studies suggest that green software should be part of a modern computer science curriculum. In this paper, we present survey results on green software education from both researchers' and educators' perspectives. These surveys confirm the lack of courses and educational material for teaching green software in current higher education. Additionally, we highlight three key pedagogical challenges in bringing green software to the computer science curriculum and discuss existing solutions to address them. We firmly believe that "green thinking" and the broad adoption of green software in the computer science curriculum can greatly benefit our environment, society, and students in an era where software is everywhere and evolves at an unprecedented speed.
18 citations
Computer Science
Vertically Integrated Computing Labs Using Open-Source Hardware Generators and Cloud-Hosted FPGAs
Alon Amid, Albert J. Ou, K. Asanović
et al.
The design of computing systems has changed dramatically over the past decade, but most courses in advanced computer architecture remain unchanged. Computer architecture education lies at the intersection between computer science and electrical engineering, with practical exercises in classes based on appropriate levels of abstraction in the computing system design stack. Hardware-centric lab exercises often require broad infrastructure resources and tend to navigate around tedious practical implementation concepts, while software-centric exercises leave a gap between modeling and system implementation implications that students later need to overcome in professional settings. Vertical integration trends in domain-specific compute systems, as well as software-hardware co-design, are often covered in classroom lectures, but are not reflected in laboratory exercises due to complex tooling and simulation infrastructure. We describe our experiences with a joint hardware-software approach to exploring computer architecture concepts in class exercises, by using open- source processor hardware implementations, generator-based hardware design methodologies, and cloud-hosted FPGAs. This approach further enables scaling course enrollment, remote learning and a cross-class collaborative lab ecosystem, creating a connecting thread between computer science and electrical engineering experience-based curricula.
7 citations
Computer Science
Novel Computer Architectures and Quantum Chemistry.
M. Gordon, G. Barca, Sarom S. Leang
et al.
Electronic structure theory (especially quantum chemistry) has thrived and has become increasingly relevant to a broad spectrum of scientific endeavors as the sophistication of both computer architectures and software engineering has advanced. This article provides a brief history of advances in both hardware and software, from the early days of IBM mainframes to the current emphasis on accelerators and modern programming practices.
36 citations
Medicine, Chemistry
Multi‐modal broad learning for material recognition
Zhaoxin Wang, Huaping Liu, Xinying Xu
et al.
Material recognition plays an important role in the interaction between robots and the external environment. For example, household service robots need to replace humans in completing housework, so they must interact with daily necessities and determine their material properties. Images provide rich visual information about objects; however, vision alone is often insufficient when objects are not visually distinct. In addition, tactile signals can capture multiple characteristics of objects, such as texture, roughness, softness, and friction, which provides another crucial channel for perception. How to effectively integrate multi-modal information is therefore an urgent problem. To address it, this paper proposes CFBRL-KCCA, a multi-modal material recognition framework for target recognition tasks. The preliminary features of each modality are extracted by cascaded broad learning and combined with kernel canonical correlation learning, which accounts for the differences among heterogeneous data modalities. Finally, the framework is evaluated on an open dataset of household objects. The results demonstrate that the proposed fusion algorithm provides an effective strategy for material recognition.
Computer engineering. Computer hardware, Computer applications to medicine. Medical informatics
Conceptual Modeling for Computer Organization and Architecture
Sabah Al-Fedaghi
Understanding computer system hardware, including how computers operate, is essential for undergraduate students in computer engineering and science. The literature shows that students learning computer organization and assembly language often find the fundamental concepts difficult to comprehend. Tools have been introduced to improve students' comprehension of the interaction between computer architecture, assembly language, and the operating system. One such tool is the Little Man Computer (LMC), a model that operates in a way similar to a real computer but is easier to understand. Even though the LMC has no modern multi-core CPU and does not execute multiple instructions concurrently, it nevertheless illustrates the basic principles of the von Neumann architecture, and it aims to introduce students to concepts such as code and instruction sets. In this paper, the LMC is used for an additional purpose: as a tool with which to experiment with a new modeling language (the thinging machine, TM) in the area of computer organization and architecture without involving the full complexity of the subject. That is, the simplicity of the LMC facilitates the application of TM without going deep into computer organization/architecture materials. Accordingly, the paper (a) provides a new way to use the LMC model for other purposes (e.g., education) and (b) demonstrates that TM can be used to build an abstract level of description in the organization/architecture field. The resulting schematics from the TM model of the LMC offer an initial case study supporting our thesis that TM is a viable method for hardware/software-independent descriptions in the field of computer organization and architecture.
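A minimal LMC interpreter makes the model's simplicity concrete. The sketch below follows the common LMC opcode convention (1xx ADD, 2xx SUB, 3xx STA, 5xx LDA, 901 INP, 902 OUT, 000 HLT), omits the branch instructions for brevity, and runs a tiny illustrative program.

```python
# Minimal Little Man Computer (LMC) interpreter: 100-cell memory, an
# accumulator, and the usual opcodes (1xx ADD, 2xx SUB, 3xx STA, 5xx LDA,
# 901 INP, 902 OUT, 000 HLT); branches are omitted for brevity.

def run_lmc(program, inbox=()):
    mem = list(program) + [0] * (100 - len(program))
    acc, pc, inputs, outbox = 0, 0, list(inbox), []
    while True:
        instr = mem[pc]
        pc += 1
        op, addr = divmod(instr, 100)
        if instr == 0:
            break                       # HLT
        elif instr == 901:
            acc = inputs.pop(0)         # INP
        elif instr == 902:
            outbox.append(acc)          # OUT
        elif op == 1:
            acc += mem[addr]            # ADD
        elif op == 2:
            acc -= mem[addr]            # SUB
        elif op == 3:
            mem[addr] = acc             # STA
        elif op == 5:
            acc = mem[addr]             # LDA
    return outbox

# Program: LDA 4, ADD 5, OUT, HLT; data 20 and 22 live in cells 4 and 5.
program = [504, 105, 902, 0, 20, 22]
```

Instructions and data share one memory, which is precisely the von Neumann property the LMC is used to teach.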
An analytical framework for high-speed hardware particle swarm optimization
I. Damaj, Mohammed El-Shafei, Mohammed El-Abd
et al.
Engineering optimization techniques are computationally intensive and can challenge implementations on tightly constrained embedded systems. Particle Swarm Optimization (PSO) is a well-known bio-inspired algorithm adopted in various applications, such as transportation, robotics, and energy. In this paper, a high-speed PSO hardware processor is developed with a focus on outperforming similar state-of-the-art implementations. In addition, the investigation comprises the development of an analytical framework that captures broad characteristics of optimization algorithm implementations, in hardware and software, using simple and combined heterogeneous key indicators. The framework proposes a combined Optimization Fitness Indicator that can classify the performance of PSO implementations targeting different evaluation functions. The two targeted processing systems are Field Programmable Gate Arrays (FPGAs) for hardware implementations and a high-end multi-core computer for software implementations. The investigation confirms the successful development of a PSO processor with appealing performance characteristics that outperforms recently presented implementations. The proposed hardware implementation attains an execution-time improvement ratio of 23,300 with an elliptic evaluation function, and a speedup of 1,777× is achieved with the Shifted Schwefel function. Indeed, the developed framework successfully classifies PSO implementations according to multiple heterogeneous properties for a variety of benchmark functions.
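For reference, the canonical PSO update loop that such hardware accelerates can be stated compactly in software. The hyperparameters and the sphere objective below are typical textbook choices, not the paper's configuration or its evaluation functions.

```python
import random

# Canonical PSO update loop in software, for reference against a
# hardware design. Hyperparameters (w, c1, c2, swarm size, bounds) are
# typical textbook choices.

def pso(objective, dim, n_particles=30, iters=200, seed=1,
        w=0.7, c1=1.5, c2=1.5, lo=-5.0, hi=5.0):
    rng = random.Random(seed)
    pos = [[rng.uniform(lo, hi) for _ in range(dim)] for _ in range(n_particles)]
    vel = [[0.0] * dim for _ in range(n_particles)]
    pbest = [p[:] for p in pos]                   # personal bests
    pbest_val = [objective(p) for p in pos]
    g = min(range(n_particles), key=lambda i: pbest_val[i])
    gbest, gbest_val = pbest[g][:], pbest_val[g]  # global best
    for _ in range(iters):
        for i in range(n_particles):
            for d in range(dim):
                vel[i][d] = (w * vel[i][d]
                             + c1 * rng.random() * (pbest[i][d] - pos[i][d])
                             + c2 * rng.random() * (gbest[d] - pos[i][d]))
                pos[i][d] += vel[i][d]
            val = objective(pos[i])
            if val < pbest_val[i]:
                pbest[i], pbest_val[i] = pos[i][:], val
                if val < gbest_val:
                    gbest, gbest_val = pos[i][:], val
    return gbest, gbest_val

# Sphere function as a simple stand-in evaluation function.
best, best_val = pso(lambda x: sum(v * v for v in x), dim=3)
```

The inner per-particle, per-dimension updates are independent, which is what makes the algorithm attractive for parallel FPGA pipelines.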
19 citations
Computer Science
Performance Evaluation for Moving Target Tracking Algorithm Based on Orthogonal Test
XI Runping, XUE Shaohui
The performance evaluation of existing moving-target tracking algorithms has many drawbacks, such as a massive amount of data, redundant tests, and insufficient consideration of algorithm performance under multi-factor conditions. Therefore, this paper proposes a performance evaluation method for moving-target tracking algorithms based on orthogonal testing. After a full analysis of the factors and levels that affect algorithm performance, an orthogonal-test dataset is built and then used to test algorithm performance. The results are analyzed with the range analysis method to obtain the relationships among the factors that affect the algorithm, as well as the combination of factor levels at which the algorithm performs well. Experimental results show that the proposed method can evaluate the performance of a moving-target tracking algorithm in a comprehensive and effective way. Moreover, this method reduces the number of tests and the data volume, and it provides a reference for the performance evaluation of other image processing algorithms.
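The range analysis step can be sketched directly: for each factor, average the observed metric over the runs at each level and take the range of these level means; a larger range indicates a more influential factor. The L4(2^3) array and the per-run scores below are illustrative, not from the paper's experiments.

```python
# Range analysis for an orthogonal test: for each factor, average the
# metric over the runs at each level and take max - min of these level
# means; a larger range marks a more influential factor. The L4(2^3)
# array and the per-run scores are illustrative.

L4 = [            # each row: levels of factors A, B, C in one test run
    (1, 1, 1),
    (1, 2, 2),
    (2, 1, 2),
    (2, 2, 1),
]
scores = [0.82, 0.74, 0.66, 0.70]   # e.g. tracking accuracy per run

def range_analysis(array, results):
    ranges = []
    for f in range(len(array[0])):
        by_level = {}
        for row, s in zip(array, results):
            by_level.setdefault(row[f], []).append(s)
        means = [sum(v) / len(v) for v in by_level.values()]
        ranges.append(max(means) - min(means))
    return ranges

R = range_analysis(L4, scores)   # larger R[f] => factor f matters more
```

Four runs suffice to rank three two-level factors here, which is the test-count saving that the orthogonal design provides over a full 2^3 factorial.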
Computer engineering. Computer hardware, Computer software