Results for "cs.DC"

Showing 19 of ~251348 results · from CrossRef, DOAJ, arXiv, Semantic Scholar

arXiv Open Access 2026
TokenDance: Scaling Multi-Agent LLM Serving via Collective KV Cache Sharing

Zhuohang Bian, Feiyang Wu, Chengrui Zhang et al.

Multi-agent LLM applications organize execution in synchronized rounds where a central scheduler gathers outputs from all agents and redistributes the combined context. This All-Gather communication pattern creates massive KV Cache redundancy, because every agent's prompt contains the same shared output blocks, yet existing reuse methods fail to exploit it efficiently. We present TokenDance, a system that scales the number of concurrent agents by exploiting the All-Gather pattern for collective KV Cache sharing. TokenDance's KV Collector performs KV Cache reuse over the full round in one collective step, so the cost of reusing a shared block is paid once regardless of agent count. Its Diff-Aware Storage encodes sibling caches as block-sparse diffs against a single master copy, achieving 11-17x compression on representative workloads. Evaluation on GenerativeAgents and AgentSociety shows that TokenDance supports up to 2.7x more concurrent agents than vLLM with prefix caching under SLO requirement, reduces per-agent KV Cache storage by up to 17.5x, and achieves up to 1.9x prefill speedup over per-request position-independent caching.

en cs.DC
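
The TokenDance abstract above describes storing sibling caches as block-sparse diffs against a single master copy. The following is a minimal sketch of that general idea only, not the paper's implementation: the flat list-of-floats representation, the equal-length requirement, and the names `split_blocks`, `encode_diff`, and `decode_diff` are all assumptions made for illustration.

```python
from typing import Dict, List, Sequence, Tuple

Block = Tuple[float, ...]


def split_blocks(values: Sequence[float], block_size: int) -> List[Block]:
    """Cut a flat sequence into consecutive fixed-size blocks."""
    return [tuple(values[i:i + block_size])
            for i in range(0, len(values), block_size)]


def encode_diff(master: Sequence[float], sibling: Sequence[float],
                block_size: int) -> Dict[int, Block]:
    """Keep only the sibling blocks that differ from the master copy."""
    master_blocks = split_blocks(master, block_size)
    sibling_blocks = split_blocks(sibling, block_size)
    if len(master_blocks) != len(sibling_blocks):
        # Assumption of this sketch: both caches have the same length.
        raise ValueError("master and sibling must have the same block count")
    return {
        index: sibling_block
        for index, (master_block, sibling_block)
        in enumerate(zip(master_blocks, sibling_blocks))
        if master_block != sibling_block
    }


def decode_diff(master: Sequence[float], diff: Dict[int, Block],
                block_size: int) -> List[float]:
    """Rebuild the sibling by overlaying its stored blocks on the master."""
    restored: List[float] = []
    for index, master_block in enumerate(split_blocks(master, block_size)):
        restored.extend(diff.get(index, master_block))
    return restored
```

In this toy form, a sibling that matches the master in most blocks costs only the few blocks that differ, which is the intuition behind the compression the abstract reports. How a real system chooses the master copy and handles blocks that are only approximately equal is outside what the abstract states.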
arXiv Open Access 2025
Virtual Garbage Collector (VGC): A Zone-Based Garbage Collection Architecture for Python's Parallel Runtime

Abdulla M

The Virtual Garbage Collector (VGC) proposes a zone-based memory management architecture aimed at improving execution predictability and memory behavior in Python runtimes. The design explores a dual-layer model consisting of an Active VGC, responsible for managing runtime object lifecycles, and a Passive VGC, intended as a compile-time optimization layer for static allocation planning. Rather than relying on traditional heap traversal or generational heuristics, VGC introduces memory zoning and checkpoint-based state evaluation to reduce allocation churn and constrain garbage collection scope. Execution partitioning is experimentally evaluated to isolate workloads and localize memory pressure, enabling more deterministic behavior under loop-intensive, recursive, and compute-heavy workloads. This work presents the architectural principles, execution model, and experimental observations of VGC within a partition-aware runtime context. While the full realization of the dual-layer design is an ongoing effort, the results indicate that zone-based allocation and partitioned execution provide a viable foundation for improving scalability and memory predictability in Python-oriented systems.

en cs.PL, cs.DC
arXiv Open Access 2023
On the Mechanics of NFT Valuation: AI Ethics and Social Media

Luyao Zhang, Yutong Sun, Yutong Quan et al.

As CryptoPunks pioneers the innovation of non-fungible tokens (NFTs) in AI and art, the valuation mechanics of NFTs has become a trending topic. Earlier research identifies the impact of ethics and society on the price prediction of CryptoPunks. Since the booming year of the NFT market in 2021, the discussion of CryptoPunks has propagated on social media. Still, existing literature hasn't considered the social sentiment factors after the historical turning point on NFT valuation. In this paper, we study how sentiments in social media, together with gender and skin tone, contribute to NFT valuations by an empirical analysis of social media, blockchain, and crypto exchange data. We evidence social sentiments as a significant contributor to the price prediction of CryptoPunks. Furthermore, we document structural changes in the valuation mechanics before and after 2021. Although people's attitudes towards CryptoPunks are primarily positive, our findings reflect imbalances in transaction activities and pricing based on gender and skin tone. Our result is consistent and robust, controlling for the rarity of an NFT based on the set of human-readable attributes, including gender and skin tone. Our research contributes to the interdisciplinary study at the intersection of AI, Ethics, and Society, focusing on the ecosystem of decentralized AI or blockchain. We provide our data and code for replicability as open access on GitHub.

en cs.CY, cs.DC
arXiv Open Access 2023
OpenMP behavior in low resource and high stress mobile environment

Kaijun Zhang

This paper investigates the use of OpenMP for parallel post-processing in object detection on personal Android devices, where resources like computational power, memory, and battery are limited. Specifically, it explores various configurations of thread count, CPU affinity, and chunk size on a Redmi Note 10 Pro with an ARM Cortex A76 CPU. The study finds that using four threads offers a maximum post-processing speedup of 2.3x but increases overall inference time by 2.7x. A balanced configuration of two threads achieves a 1.8x speedup in post-processing and a 2% improvement in overall program performance.

en cs.PF
arXiv Open Access 2023
Metaverse: A Vision, Architectural Elements, and Future Directions for Scalable and Realtime Virtual Worlds

Leila Ismail, Rajkumar Buyya

With the emergence of Cloud computing, Internet of Things-enabled Human-Computer Interfaces, Generative Artificial Intelligence, and highly accurate Machine and Deep-learning recognition and predictive models, along with the post-Covid-19 proliferation of social networking and remote communications, the Metaverse has gained a lot of popularity. The Metaverse has the potential to extend the physical world using virtual and augmented reality so that users can interact seamlessly with the real and virtual worlds using avatars and holograms. It has the potential to impact the way people interact on social media, collaborate in their work, perform marketing and business, teach, learn, and even access personalized healthcare. Several works in the literature examine the Metaverse in terms of hardware wearable devices and virtual reality gaming applications. However, the requirements of realizing the Metaverse in realtime and at a large scale have yet to be examined for the technology to be usable. To address this limitation, this paper presents the temporal evolution of Metaverse definitions and captures its evolving requirements. Consequently, we provide insights into Metaverse requirements. In addition to enabling technologies, we lay out architectural elements for scalable, reliable, and efficient Metaverse systems, and a classification of existing Metaverse applications, along with proposing required future research directions.

en cs.HC, cs.AI
S2 Open Access 2018
Revisiting Fast Practical Byzantine Fault Tolerance: Thelma, Velma, and Zelma

Ittai Abraham, Guy Golan-Gueta, D. Malkhi et al.

In a previous note (arXiv:1712.01367 [cs.DC]), we observed a safety violation in Zyzzyva and a liveness violation in FaB. In this manuscript, we sketch fixes to both. The same view-change core is applied in the two schemes, and additionally, applied to combine them and create a single, enhanced scheme that has the benefits of both approaches.

38 citations en Computer Science

Page 1 of 12568