Results for "cs.DC"

Showing 20 of ~251913 results · from CrossRef, DOAJ, arXiv

arXiv Open Access 2026
Privacy-Preserving Coding Schemes for Multi-Access Distributed Computing Models

Shanuja Sasi

Distributed computing frameworks such as MapReduce have become essential for large-scale data processing by decomposing tasks across multiple nodes. The multi-access distributed computing (MADC) model further advances this paradigm by decoupling mapper and reducer roles: dedicated mapper nodes store data and compute intermediate values, while reducer nodes are connected to multiple mappers and aggregate results to compute final outputs. This separation reduces communication bottlenecks without requiring file replication. In this paper, we introduce privacy constraints into MADC and develop private coded schemes for two specific connectivity models. We construct new families of extended placement delivery arrays and derive corresponding coding schemes that guarantee privacy of each reducer's assigned function.

en cs.DC
arXiv Open Access 2026
High-Performance Portable GPU Primitives for Arbitrary Types and Operators in Julia

Emmanuel Pilliat

Portable GPU frameworks such as Kokkos and RAJA reduce the burden of cross-architecture development but typically incur measurable overhead on fundamental parallel primitives relative to vendor-optimized libraries. We present KernelForge.jl, a Julia library that implements scan, mapreduce, and matrix-vector primitives through a two-layer portable architecture: KernelIntrinsics.jl provides backend-agnostic abstractions for warp-level shuffles, memory fences, and vectorized memory access, while KernelForge.jl builds high-performance algorithms exclusively on top of these interfaces. Evaluated on an NVIDIA A40 and an AMD MI300X, KernelForge.jl matches or exceeds CUB kernel execution time on scan and mapreduce on the A40, and matches cuBLAS throughput on matrix-vector operations across most tested configurations, demonstrating, as a proof of concept, that portable JIT-compiled abstractions can achieve vendor-level throughput without sacrificing generality.
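
To make the "scan with an arbitrary operator" primitive from the abstract above concrete, here is a minimal Python sketch that simulates a Hillis-Steele inclusive scan sequentially. It is not KernelForge.jl code and does not reflect how that library is implemented; the function name `inclusive_scan` is a hypothetical name chosen for illustration, and the operator is assumed to be associative.

```python
from typing import Callable, List, TypeVar

T = TypeVar("T")


def inclusive_scan(values: List[T], op: Callable[[T, T], T]) -> List[T]:
    """Inclusive scan of `values` under an associative operator `op`.

    Hypothetical illustration, not the KernelForge.jl API. Each pass combines
    the element at index i with the element `stride` positions to its left.
    On a GPU every pass would run in parallel; here it is a plain loop.
    """
    result = list(values)
    stride = 1
    while stride < len(result):
        previous = list(result)  # read from a snapshot, as a parallel pass would
        for i in range(stride, len(result)):
            # Left operand first, so non-commutative operators keep their order.
            result[i] = op(previous[i - stride], previous[i])
        stride *= 2
    return result
```

Because the operator is passed in as a function, the same routine covers sums, running maxima, or string concatenation, which is the kind of generality over types and operators the abstract refers to.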

en cs.DC, cs.PF
arXiv Open Access 2026
NBI-Slurm: Simplified submission of Slurm jobs with energy saving mode

Andrea Telatin

NBI-Slurm is a Perl package that provides a simplified, user-friendly interface for submitting and managing jobs on SLURM high-performance computing (HPC) clusters. It offers both a library of Perl modules for programmatic job management and a suite of command-line tools designed to reduce the cognitive overhead of SLURM's native interface. Distinctive features of NBI-Slurm are (a) TUI applications to view and cancel jobs, (b) the possibility to generate tool-specific wrappers for (bioinformatic) tools and (c) an energy-aware scheduling mode -- "eco mode" -- that automatically defers flexible jobs to off-peak periods, helping research institutions reduce their computational carbon footprint without requiring users to manually plan submission times.
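
The idea of deferring flexible jobs to off-peak periods can be sketched in a few lines of Python. This is an illustration only, not NBI-Slurm's code (which is Perl): the function name `next_off_peak_start` and the 22:00 to 06:00 window are assumptions, and the abstract does not say how the package picks its off-peak hours.

```python
from datetime import datetime, timedelta


def next_off_peak_start(now: datetime,
                        window_start_hour: int = 22,
                        window_end_hour: int = 6) -> datetime:
    """Return the earliest time at or after `now` that is off-peak.

    Hypothetical helper: the default window (22:00 to 06:00) is an assumed
    example, not a value taken from NBI-Slurm.
    """
    hour = now.hour
    if window_start_hour <= window_end_hour:
        inside = window_start_hour <= hour < window_end_hour
    else:
        # The window wraps around midnight, e.g. 22:00 to 06:00.
        inside = hour >= window_start_hour or hour < window_end_hour
    if inside:
        return now  # already off-peak, no deferral needed

    start_today = now.replace(hour=window_start_hour, minute=0,
                              second=0, microsecond=0)
    if start_today > now:
        return start_today
    return start_today + timedelta(days=1)
```

A timestamp computed this way could be passed to Slurm through the `--begin` option of `sbatch`; whether NBI-Slurm does so internally is not stated in the abstract.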

en cs.DC
arXiv Open Access 2025
Counterfactual simulations for large scale systems with burnout variables

Benjamin Heymann

We consider large-scale systems influenced by burnout variables: state variables that start active, shape dynamics, and irreversibly deactivate once certain conditions are met. Simulating what-if scenarios in such systems is computationally demanding, as alternative trajectories often require sequential processing, which scales poorly. This challenge arises in settings like online advertising, because of campaign budgets, complicating counterfactual analysis despite rich data availability. We introduce a new class of algorithms based on what we refer to as uncertainty relaxation, which enables efficient parallel computation and significantly improves scalability for counterfactual estimation in systems with burnout variables.

en cs.DC, math.OC
arXiv Open Access 2024
Floating Point Compression of Hierarchical Matrix Formats and its Impact on Matrix-Vector Multiplication

Ronald Kriemann

Matrix-vector multiplication forms the basis of many iterative solution algorithms and as such is also an important algorithm for hierarchical matrices, which are used to represent dense data in an optimized form by applying low-rank compression. However, due to its low computational intensity, the performance of matrix-vector multiplication is typically limited by the available memory bandwidth on parallel systems. With floating point compression the memory footprint can be optimized, which reduces the stress on the memory subsystem and thereby increases performance. We will look into the compression of different formats of hierarchical matrices and how this can be used to speed up the corresponding matrix-vector multiplication.

en cs.DC, cs.MS
arXiv Open Access 2020
A Simple and Efficient Asynchronous Randomized Binary Byzantine Consensus Algorithm

Tyler Crain

This paper describes a simple and efficient asynchronous binary Byzantine fault-tolerant consensus algorithm. In the algorithm, non-faulty nodes perform an initial broadcast followed by a series of rounds, each consisting of a single message broadcast plus the computation of a global random coin using threshold signatures. Each message is accompanied by a cryptographic proof of its validity. Up to one third of the nodes can be faulty and termination is expected in a constant number of rounds. An optimization is described allowing the round message plus the coin message to be combined, reducing rounds to a single message delay. Geodistributed experiments are run on replicas in ten data center regions showing average latencies as low as 400 milliseconds.

en cs.DC
arXiv Open Access 2019
Improving In-Network Computing in IoT Through Degeneracy

Merim Dzaferagic, Neal McBride, Ryan Thomas et al.

We present a novel way of considering in-network computing (INC), using ideas from statistical physics. We define degeneracy for INC as the multiplicity of possible options available within the network to perform the same function with a given macroscopic property (e.g. delay). We present an efficient algorithm to determine all these alternatives. Our results show that by exploiting the set of possible degenerate alternatives, we can significantly improve the successful computation rate of a symmetric function, while still being able to satisfy requirements such as delay or energy consumption.

en cs.DC
arXiv Open Access 2019
SVE-enabling Lattice QCD Codes

Nils Meyer, Peter Georg, Dirk Pleiter et al.

Optimization of applications for supercomputers of the highest performance class requires parallelization at multiple levels using different techniques. In this contribution we focus on parallelization of particle physics simulations through vector instructions. With the advent of the Scalable Vector Extension (SVE) ISA, future ARM-based processors are expected to provide a significant level of parallelism at this level.

en cs.DC, hep-lat
CrossRef Open Access 2018
Voltage Control for DC-DC Converters

Usman Rahat, Abdul Basit, Muhammad Salman

In this paper, we discuss a voltage control method for a buck converter operating in continuous conduction mode (CCM) using an analog feedback system. The aim of this work is to control the output voltage of a buck converter while the load current varies. This is achieved using analog feedback built with an operational amplifier (op-amp). The same technique can be applied to other DC-DC converters (e.g., boost, buck-boost, and Ćuk converters) in CCM; the buck converter is chosen here as an example for the purpose of analysis.

arXiv Open Access 2018
The application of precision time protocol on EAST timing system

Z. Zhang, B. Xiao, Z. Ji et al.

The timing system focuses on synchronizing and coordinating each subsystem according to the trigger signals. A new prototype timing slave node based on the precision time protocol has been developed using the ARM STM32 platform. The proposed slave timing module is tested, and the results show that the synchronization accuracy between slave nodes is in the sub-microsecond range.

en cs.DC, cs.NI
arXiv Open Access 2018
A Big Data Architecture for Log Data Storage and Analysis

Swapneel Mehta, Prasanth Kothuri, Daniel Lanza Garcia

We propose an architecture for analysing database connection logs across different instances of databases within an intranet comprising over 10,000 users and associated devices. Our system uses Flume agents to send notifications to a Hadoop Distributed File System for long-term storage and ElasticSearch and Kibana for short-term visualisation, effectively creating a data lake for the extraction of log data. We adopt machine learning models with an ensemble of approaches to filter and process the indicators within the data and aim to predict anomalies or outliers using feature vectors built from this log data.

en cs.DC, cs.LG
arXiv Open Access 2017
Oseba: Optimization for Selective Bulk Analysis in Big Data Processing

Rui Wang, Jun Wang

Selective bulk analyses, such as statistical learning on temporal/spatial data, are fundamental to a wide range of contemporary data analysis. However, with increasingly large data sets, such as weather data and marketing transactions, data organization and access become more challenging in selective bulk data processing with current big data processing frameworks such as Spark or key-value stores. In this paper, we propose a method to optimize selective bulk analysis in big data processing, referred to as Oseba. Oseba maintains a super index for the data organization in memory to support fast lookup by targeting the data involved in each selective analysis program. Oseba is able to save memory as well as computation in comparison to the default data processing frameworks.

en cs.DC
arXiv Open Access 2017
Communication Lower Bounds for Matricized Tensor Times Khatri-Rao Product

Grey Ballard, Nicholas Knight, Kathryn Rouse

The matricized-tensor times Khatri-Rao product computation is the typical bottleneck in algorithms for computing a CP decomposition of a tensor. In order to develop high performance sequential and parallel algorithms, we establish communication lower bounds that identify how much data movement is required for this computation in the case of dense tensors. We also present sequential and parallel algorithms that attain the lower bounds and are therefore communication optimal. In particular, we show that the structure of the computation allows for less communication than the straightforward approach of casting the computation as a matrix multiplication operation.
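
For readers unfamiliar with the operation named in the abstract above, the following pure-Python sketch spells out the mode-1 MTTKRP of a dense 3-way tensor entry by entry, M[i][r] = Σ_j Σ_k X[i][j][k] · B[j][r] · C[k][r]. It is a naive reference loop for illustration, not one of the communication-optimal algorithms of the paper; the function name `mttkrp_mode1` and the nested-list data layout are assumptions.

```python
from typing import List

Matrix = List[List[float]]
Tensor3 = List[List[List[float]]]


def mttkrp_mode1(X: Tensor3, B: Matrix, C: Matrix) -> Matrix:
    """Mode-1 MTTKRP for a dense I x J x K tensor X with factors B (J x R), C (K x R).

    Hypothetical reference implementation: computes
    M[i][r] = sum over j, k of X[i][j][k] * B[j][r] * C[k][r]
    directly, without forming the Khatri-Rao product as an explicit matrix.
    """
    I, J, K = len(X), len(B), len(C)
    R = len(B[0])
    M = [[0.0] * R for _ in range(I)]
    for i in range(I):
        for j in range(J):
            for k in range(K):
                x = X[i][j][k]
                for r in range(R):
                    M[i][r] += x * B[j][r] * C[k][r]
    return M
```

The loop never materializes the (J·K) x R Khatri-Rao matrix, which hints at why the computation has more structure than a plain matrix multiplication; the sketch makes no attempt to minimize data movement.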

en cs.DC
CrossRef Open Access 2016
Perspectivas de la ciencia política en Colombia, diálogos y diatribas epistemológicas

Julian Andres Cuellar, Camila Sanchez Sandoval, Sergio Alfonso Huertas

This article aims to prompt reflection on the epistemic postulates of the work carried out by (Cuellar Argote, 2007) and (Leyva & Ramirez, 2015), with a view to updating the premises and advances that these authors contributed to the discussion of Political Science in Colombia and its process of consolidation. To that end, the article uses the methodology of documentary analysis, and its development reveals some interests shared by the authors concerning the teaching of, and training in, Political Science in Colombia.

arXiv Open Access 2016
Self-Stabilizing Maximal Matching and Anonymous Networks

Johanne Cohen, Jonas Lefèvre, Khaled Maâmra et al.

We propose a self-stabilizing algorithm for computing a maximal matching in an anonymous network. The complexity is $O(n^3)$ moves with high probability, under the adversarial distributed daemon. In this algorithm, each node can determine whether one of its neighbors points to it or to another node, leading to a contradiction with the anonymous assumption. To solve this problem, we provide, under the classical link-register model, a self-stabilizing algorithm that gives a unique name to a link such that this name is shared by both extremities of the link.
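
As a reminder of the property such an algorithm must converge to, here is a small centralized Python check of whether a set of edges is a maximal matching. It is not the self-stabilizing algorithm from the paper; the function name `is_maximal_matching` is hypothetical, and the graph is assumed to be simple and undirected, without self-loops.

```python
from typing import Hashable, Iterable, Tuple

Edge = Tuple[Hashable, Hashable]


def is_maximal_matching(edges: Iterable[Edge], matching: Iterable[Edge]) -> bool:
    """Check that `matching` is a maximal matching of the graph given by `edges`.

    Hypothetical checker, assuming a simple undirected graph. A matching is
    valid if its edges exist in the graph and share no endpoint; it is
    maximal if every graph edge has at least one matched endpoint.
    """
    edge_list = list(edges)
    edge_set = {frozenset(e) for e in edge_list}
    matched = set()
    for u, v in matching:
        if frozenset((u, v)) not in edge_set:
            return False  # not an edge of the graph
        if u in matched or v in matched:
            return False  # two matching edges share a node
        matched.update((u, v))
    # Maximality: no edge can be added, so each edge touches a matched node.
    return all(u in matched or v in matched for u, v in edge_list)
```

On the path 1-2-3-4, the single edge (2, 3) is already maximal, while (1, 2) alone is not, because (3, 4) could still be added.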

en cs.DC
arXiv Open Access 2016
The Effect of Multi-core Communication Architecture on System Performance

Bilal Habib, Ahmed Anber, Sultan Daud Khan

MPSoCs are gaining popularity because of their potential to handle computationally expensive applications. A multi-core processor combines two or more independent cores (normally CPUs) into a single package composed of a single integrated circuit (chip). However, as the number of components on a single chip and their performance continue to increase, a shift from computation-based to communication-based design becomes mandatory. As a result, the communication architecture plays a major role in the area, performance, and energy consumption of the overall system. In this paper, multiple soft cores (IPs) such as MicroBlaze in an FPGA are used to study the effect of different connection topologies on the performance of a parallel program.

en cs.DC
arXiv Open Access 2015
A GA based approach for task scheduling in multi-cloud environment

Tripti Tanaya Tejaswi, Md Azharuddin, P. K. Jana

In a multi-cloud environment, task scheduling has attracted a lot of attention due to the NP-complete nature of the problem. Moreover, it is very challenging due to the heterogeneity of the cloud resources, which have varying capacities and functionalities. Therefore, minimizing the makespan for task scheduling is a challenging issue. In this paper, we propose a genetic algorithm (GA) based approach for solving the task scheduling problem. The algorithm is described with an innovative idea for fitness function derivation and mutation. The proposed algorithm is subjected to rigorous testing using various benchmark datasets, and its performance is evaluated in terms of total makespan.
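
The makespan objective mentioned above can be written down as a short Python sketch. The encoding used here (one cloud index per task) and the names `makespan`, `task_lengths`, and `cloud_speeds` are assumptions for illustration; the abstract does not give the paper's actual chromosome encoding or fitness function.

```python
from typing import List, Sequence


def makespan(assignment: Sequence[int],
             task_lengths: Sequence[float],
             cloud_speeds: Sequence[float]) -> float:
    """Completion time of the busiest cloud under a given task assignment.

    Hypothetical model: assignment[t] is the index of the cloud that runs
    task t, and a task of length L takes L / speed time units on a cloud.
    """
    finish: List[float] = [0.0] * len(cloud_speeds)
    for task, cloud in enumerate(assignment):
        finish[cloud] += task_lengths[task] / cloud_speeds[cloud]
    return max(finish, default=0.0)
```

A genetic algorithm would evaluate each candidate assignment with a function of this kind and favor individuals with a smaller value.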

en cs.DC
arXiv Open Access 2014
Implementation of an efficient RBAC in Cloud Computing using .NET environment

Ruhi Gupta

Cloud computing is growing steadily and will keep developing for as long as the era of computers and the internet lasts. When dealing with cloud computing, a number of security and traffic related issues arise. Load balancing is one of the answers to these issues, and RBAC deals with such an answer. The proposed technique is a hybrid of FCFS and RBAC. RBAC assigns roles to the clients, and clients with a particular role can access only the corresponding documents. Hence identity management and access management are fully implemented using this technique.
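
The role check described in the abstract above can be illustrated with a minimal Python sketch. It is not the paper's .NET implementation and leaves out the FCFS part; the tables `CLIENT_ROLES` and `ROLE_DOCUMENTS`, the function `can_access`, and all client, role, and document names are made up for the example.

```python
from typing import Dict, Set

# Hypothetical example data: which role each client holds,
# and which documents each role may open.
CLIENT_ROLES: Dict[str, str] = {"alice": "editor", "bob": "viewer"}
ROLE_DOCUMENTS: Dict[str, Set[str]] = {
    "editor": {"report.docx", "draft.docx"},
    "viewer": {"report.docx"},
}


def can_access(client: str, document: str) -> bool:
    """Return True only if the client's role is allowed to open the document.

    Unknown clients and unknown roles are denied by default.
    """
    role = CLIENT_ROLES.get(client)
    if role is None:
        return False
    return document in ROLE_DOCUMENTS.get(role, set())
```

Here `can_access("bob", "draft.docx")` is denied because the assumed `viewer` role does not list that document, and a client with no role at all is denied everything.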

Page 40 of 12596