Nataliia Kussul, M. Lavreniuk, S. Skakun et al.
Results for "Architecture"
Showing 20 of ~2885907 results · from CrossRef, DOAJ, arXiv, Semantic Scholar
Wilson Yan, Yunzhi Zhang, P. Abbeel et al.
We present VideoGPT: a conceptually simple architecture for scaling likelihood based generative modeling to natural videos. VideoGPT uses VQ-VAE that learns downsampled discrete latent representations of a raw video by employing 3D convolutions and axial self-attention. A simple GPT-like architecture is then used to autoregressively model the discrete latents using spatio-temporal position encodings. Despite the simplicity in formulation and ease of training, our architecture is able to generate samples competitive with state-of-the-art GAN models for video generation on the BAIR Robot dataset, and generate high fidelity natural videos from UCF-101 and Tumblr GIF Dataset (TGIF). We hope our proposed architecture serves as a reproducible reference for a minimalistic implementation of transformer based video generation models. Samples and code are available at https://wilson1yan.github.io/videogpt/index.html
S. Kamara, K. Lauter
Dominik Scherer, Andreas C. Müller, Sven Behnke
A. Parashar, Priyanka Raina, Y. Shao et al.
This paper presents Timeloop, an infrastructure for evaluating and exploring the architecture design space of deep neural network (DNN) accelerators. Timeloop uses a concise and unified representation of the key architecture and implementation attributes of DNN accelerators to describe a broad space of hardware topologies. It can then emulate those topologies to generate an accurate projection of performance and energy efficiency for a DNN workload through a mapper that finds the best way to schedule operations and stage data on the specified architecture. This enables fair comparisons across different architectures and makes DNN accelerator design more systematic. This paper describes Timeloop's underlying models and algorithms in detail and shows results from case studies enabled by Timeloop, which provide interesting insights into the current state of DNN architecture design. In particular, they reveal that dataflow and memory hierarchy co-design plays a critical role in optimizing energy efficiency. Also, there is currently still not a single architecture that achieves the best performance and energy efficiency across a diverse set of workloads due to flexibility and efficiency trade-offs. These results provide inspiration into possible directions for DNN accelerator research.
C. Lin, M. Gerla
Robert Allen, D. Garlan
Wan-Ju Li, C. Laurencin, E. Caterson et al.
Richard Platt
H. T. Kung
J. Bosch
B. Givoni
Lucas S. Lopes, Ricardo L. de Queiroz
The performance of neural image coders is heavily dependent on their architecture and, hence, on the selection of hyperparameters. Such performance, for a given architecture, is often ascertained by trial, that is, after training and inference, so that many trials may be conducted to select the hyperparameters. We propose a multi-objective hyperparameter optimization (MOHPO) method for neural image compression based on rate-distortion-complexity (RDC) analysis, which drastically reduces the number of networks to try (train and test), thereby saving resources. We validate it on well-established benchmark problems and demonstrate its use with popular autoencoders, measuring their complexities in terms of the number of parameters and floating-point operations. Our method, which we refer to as the greedy lower convex hull (GLCH), aims to track the lower convex hull of a cloud of hyperparameter possibilities. We compare our method with other well-established state-of-the-art MOHPO methods in terms of log-hypervolume difference as a function of the number of trained networks. The results indicate that the proposed method is highly competitive, particularly with fewer trained networks, which is a critical scenario in practice. Furthermore, it is deterministic, that is, it remains consistent across different runs.
Noam Levy
Dynamic sparse attention (DSA) reduces the per-token attention bandwidth by restricting computation to a top-k subset of cached key-value (KV) entries, but its token-dependent selection pattern introduces a system-level challenge: the KV working set is fragmented, volatile, and difficult to prefetch, which can translate into poor cache locality and stalled decode throughput. We study these effects by implementing a lightweight indexer for DSA-style selection on multiple open-source backbones and logging per-layer KV indices during autoregressive decoding. Our analysis shows a gap in serving DSA backbones - a potential for a high volume of blocking LL (last level) cache miss events, causing inefficiency; we propose a novel LL cache reservation system to save KV tokens in the LL cache between decode steps, combined with a token-granularity LRU eviction policy, and show on the data we collected how this architecture can benefit serving with DSA implemented on different backbones. Finally, we propose directions for future architectural and algorithmic exploration to improve serving of DSA on modern inference platforms.
Naoya Onizawa, Taiga Kubuta, Duckgyu Shin et al.
Probabilistic bits (p-bits) offer an energy-efficient hardware abstraction for stochastic optimization; however, existing p-bit-based simulated annealing accelerators suffer from poor scalability and limited support for fully connected graphs due to fan-out and memory overhead. This paper presents an energy-efficient FPGA architecture for stochastic simulated quantum annealing (SSQA) that addresses these challenges. The proposed design combines a spin-serial and replica-parallel update schedule with a dual-BRAM delay-line architecture, enabling scalable support for fully connected Ising models while eliminating fan-out growth in logic resources. By exploiting SSQA, the architecture achieves fast convergence using only final replica states, significantly reducing memory requirements compared to conventional p-bit-based annealers. Implemented on a Xilinx ZC706 FPGA, the proposed system solves an 800-node MAX-CUT benchmark and achieves up to 50% reduction in energy consumption and over 90% reduction in logic resources compared with prior FPGA-based p-bit annealing architectures. These results demonstrate the practicality of quantum-inspired, p-bit-based annealing hardware for large-scale combinatorial optimization under strict energy and resource constraints.
D. Heimbigner, D. McLeod
Lin Zheng, Jinlong Li, Zhanbo Zhu et al.
In recent years, with the popularization of online education, real-time monitoring of learning engagement has become a key challenge for scholars. Existing studies mainly rely on questionnaires and physiological signal detection, which have limitations such as high subjectivity, poor real-time performance, and expensive equipment. Previous research has shown that head pose is closely related to cognitive state. However, current estimation models require substantial computational resources, making real-time deployment on mobile devices challenging. In this study, we validate the significant correlation between head pose and learning engagement based on the DAiSEE dataset (8,925 video clips) and propose a lightweight head pose estimation method. The LightNet proposed in this paper uses an improved feature extraction module (MG-Net) and an Attention-based multi-scale fusion model (AMF). Experiments conducted on the 300W-LP and BIWI benchmark datasets demonstrate that, compared with existing state-of-the-art methods, LightNet substantially reduces model complexity by decreasing the number of parameters to just 0.45 × 10^6, representing over 90% reduction in model size. Despite this significant compression, LightNet maintains a high level of accuracy, with the mean absolute error (MAE) increasing by only 0.15°, indicating a minimal loss in prediction precision. Moreover, the model achieves a notable improvement in processing speed, exceeding 50% increase relative to baseline approaches. This combination of a lightweight architecture, competitive accuracy, and accelerated inference speed underscores LightNet’s effectiveness and its potential suitability for real-time applications. This study not only expands the application of head pose in education but also provides a feasible solution for real-time engagement monitoring on resource-constrained devices.
Opeyemi Bamigbade, Mark Scanlon, John Sheppard
Embeddings remain the best way to represent image features, but do not always capture all latent information. This is still a problem in representation learning, and computer vision descriptors struggle with precision and accuracy. Improving image embedding with other features is necessary for tasks like image geolocation, especially for indoor scenes where descriptive cues can have less distinctive characteristics. This work proposes a model architecture that integrates image N-dominant colours and colour histogram vectors in different colour spaces with image embedding from deep metric learning and classification perspectives. The results indicate that the integration of colour features improves image embedding, surpassing the performance of using embedding alone. In addition, the classification approach yields higher accuracy compared to deep metric learning methods. Interestingly, different saturation points were observed for image colour-improved embedding features in models and colour spaces. These findings have implications for the design of more robust image geolocation systems, particularly in indoor environments.
Wei Wang, Mingkang Cao, Zhigang Wu et al.
Under the new normal of China’s development, urban construction has shifted from incremental expansion to the optimization of existing stock. As the focal point of urban stock, old communities have garnered increasingly in-depth research. Recent studies have extended their perspectives from physical spaces to the interactive relationship between “space and behavior”, while also emphasizing the integration of qualitative and quantitative analyses. However, existing research primarily focuses on the static characteristics of material spatial environments, neglecting the dynamic interplay between spatial attributes and social network relationships. This study takes the Cangxia Community in Fuzhou as a case study, employing social network analysis (SNA) to construct a dual-network model of resident behavior and public space. Through a three-level analysis of “overall–subgroup–single point”, the intrinsic relationship between “space and behavior” in old communities is revealed. The model demonstrates that resident behavior characteristics are positively correlated with public space attributes, namely, the better the spatial accessibility and visibility, the higher the frequency of resident behaviors. However, mismatched spatial nodes also exist, limiting the synergistic optimization of the dual-network model. This research aims to provide scientifically effective methods and paradigms for the renewal of old communities and the sustainable development of cities.
Abraham Akinyemi, Olumide Ajani, Olumide Akinniyi
Monosodium glutamate (MSG) is a widespread flavour enhancer linked to health risks, including male reproductive dysfunction. This study investigated tiger nut (Cyperus esculentus) as a potential protective agent against MSG-induced reproductive issues in male Wistar rats. Forty adult rats were divided into four groups: control, MSG-only (2 mg/g), tiger nut-only (500 mg/kg), and MSG+tiger nut combination (2 mg/g MSG + 500 mg/kg tiger nut). Treatments were administered orally for 28 days, with analyses conducted at days 14 and 28. Results showed significant variations in sperm parameters. At 14 days, the tiger nut group showed highest sperm motility (88.60±4.04%) and count (100.60±3.21×10^6/mL), while MSG reduced sperm viability (70.00±4.69%). By 28 days, MSG significantly decreased sperm motility (41.80±4.92%) and viability (54.80±6.76%). MSG increased sperm abnormalities at 14 days (13.60±2.51%) but normalized by 28 days. The MSG+tiger nut combination eliminated certain sperm abnormalities like coiled tail and tail-without-head. Gonadometric parameters remained stable throughout the study, indicating tiger nut's ability to maintain testicular architecture despite MSG exposure. Initial body weight increases in the MSG group normalized by weeks 3-4. The study concludes that tiger nut juice significantly protects against MSG-induced low sperm quality in male Wistar rats, suggesting its potential as a protective supplement for populations with unavoidable MSG exposure. Future research should explore long-term effects and cellular mechanisms.