Results for "cs.DC"

Showing 20 of ~251859 results · from DOAJ, arXiv, CrossRef

arXiv Open Access 2025
A Proposed End-To-End Principle for Data Commons

Robert L. Grossman

A data commons brings together (or co-locates) data with cloud computing infrastructure and commonly used software services, tools and applications for managing, analyzing and sharing data to create an interoperable resource for a research community. We introduce an architectural design principle for data commons called the narrow middle architecture that is broadly based upon the end-to-end argument in systems design. We also discuss important core services for data commons and the role of standards.

en cs.DC
arXiv Open Access 2025
SmartShards: Churn-Tolerant Continuously Available Distributed Ledger

Joseph Oglio, Mikhail Nesterenko, Gokarna Sharma

We present SmartShards: a new sharding algorithm for improving Byzantine tolerance and churn resistance in blockchains. Our algorithm places a peer in multiple shards to create an overlap. This simplifies cross-shard communication and shard membership management. We describe SmartShards, prove it correct and evaluate its performance. We propose several SmartShards extensions: defense against a slowly adaptive adversary, combining transactions into blocks, fortification against the join/leave attack.

en cs.DC
arXiv Open Access 2025
PAT: a new algorithm for all-gather and reduce-scatter operations at scale

Sylvain Jeaugey

This paper describes a new algorithm called PAT, for Parallel Aggregated Trees, which can be used to implement all-gather and reduce-scatter operations. This algorithm works on any number of ranks, has a logarithmic number of network transfers for small size operations, minimizes long-distance communication, and requires a logarithmic amount of internal buffers, independent of the total operation size. It is aimed at improving the performance of the NCCL library in cases where the ring algorithm would be inefficient, as its linear latency would show poor performance for small sizes and/or at scale.

en cs.DC
arXiv Open Access 2020
Author's approach to the topological modeling of parallel computing systems

Victor A. Melent'ev

The author's research of topologies of parallel computing systems and the tasks solved with them, including the corresponding tools of their modeling, is summarized in the present paper. The original topological model of such systems is presented based on the modified Amdahl law. It allowed formalizing the dependence of the necessary number of processors and the maximal distance between information-adjacent vertices in a graph on the directive values of acceleration or efficiency. The dependences of these values on the system interconnection topology and on the information graph of the parallel task are also formalized. The tools for a comparative evaluation of these dependences, topological criteria and the functions of scaling and fault-tolerant operation of parallel systems are based on the author's technique of projective description of graphs and the algorithms used in it.

arXiv Open Access 2020
An Easy-to-Use-and-Deploy Grid Computing Framework

Gaurav Menghani, Anil Harwani, Yash Londhe et al.

A few grid-computing tools are available for public use. However, such systems are usually quite complex and require several man-months to set up. If the user wishes to set up an ad-hoc grid in a short span of time, such tools cannot be used. Moreover, the complex services they provide, such as reliable file transfer and extra layers of security, act as an overhead to performance when the network is small and reliable. In this paper we describe the structure of our grid-computing framework, which can be implemented and used easily on a moderately sized network.

en cs.DC
arXiv Open Access 2020
Study of Automatic GPU Offloading Method from Various Language Applications

Yoji Yamato

In recent years, the use of heterogeneous hardware other than small-core CPUs, such as GPUs, FPGAs, or many-core CPUs, has been increasing. However, using heterogeneous hardware raises high technical barriers, such as the need for CUDA skills. For this reason, I have proposed environment-adaptive software that enables automatic conversion, configuration, and high-performance operation of once-written code according to the hardware on which it is placed. However, the source languages for offloading have so far mainly been C/C++, and there has been no research on common offloading across applications in various languages. In this paper, I study a common method for automatically offloading applications written not only in C but also in Python and Java.

en cs.DC
arXiv Open Access 2020
Elastic execution of checkpointed MPI applications

Sumeet Gajjar, Saurabh Vaidya

MPI applications begin with a fixed number of ranks and, by default, that number remains constant throughout the application's lifetime. The developer can choose to increase the number of ranks by dynamically spawning MPI processes. However, doing this manually adds complexity to the MPI application. Making MPI applications malleable \cite{b20} would allow HPC applications to have the same elasticity as cloud applications. We propose multiple approaches to changing the number of ranks of an MPI program without modifying the user code. We use checkpointing as a tool to achieve a mutable number of ranks by halting execution and resuming the MPI program with a new state. In this paper, we focus on the scenario of increasing the number of ranks of an MPI program, using ExaMPI as the MPI implementation.

en cs.DC
arXiv Open Access 2019
Flat combined Red Black Trees

Sergio Sainz-Palacios

Flat combining is a concurrency technique whereby one thread performs all the operations in a batch by scanning a queue of pending operations and performing them together. Flat combining makes sense as long as k operations, each taking O(n) separately, can be batched together and done in less than O(k*n). A red-black tree is a balanced binary search tree with permanent balancing guarantees. Operations on a red-black tree are hard to batch together: for example, inserting nodes in two different branches of the tree affects different areas of the tree. In this paper we investigate alternatives for making a flat combining approach work for red-black trees.

en cs.DC, cs.DB
arXiv Open Access 2019
Efficient Lock-Free Durable Sets

Yoav Zuriel, Michal Friedman, Gali Sheffi et al.

Non-volatile memory is expected to co-exist with or replace DRAM in upcoming architectures. Durable concurrent data structures for non-volatile memories are essential building blocks for constructing adequate software for use with these architectures. In this paper, we propose a new approach for durable concurrent sets and use this approach to build the most efficient durable hash tables available today. Evaluation shows a performance improvement factor of up to 3.3x over existing technology.

en cs.DC, cs.DS
arXiv Open Access 2017
Techniques for Constructing Efficient Lock-free Data Structures

Trevor Brown

Building a library of concurrent data structures is an essential way to simplify the difficult task of developing concurrent software. Lock-free data structures, in which processes can help one another to complete operations, offer the following progress guarantee: If processes take infinitely many steps, then infinitely many operations are performed. Handcrafted lock-free data structures can be very efficient, but are notoriously difficult to implement. We introduce numerous tools that support the development of efficient lock-free data structures, and especially trees.

en cs.DC
arXiv Open Access 2016
A Performance Evaluation of Container Technologies on Internet of Things Devices

Roberto Morabito

The use of virtualization technologies in different contexts, such as Cloud Environments, Internet of Things (IoT), and Software Defined Networking (SDN), has rapidly increased in recent years. Among these technologies, container-based solutions have characteristics suited to deploying distributed and lightweight applications. This paper presents a performance evaluation of container technologies on constrained devices, in this case, on Raspberry Pi. The study shows that, overall, the overhead added by containers is negligible.

en cs.DC, cs.PF
arXiv Open Access 2016
NOP - A Simple Experimental Processor for Parallel Deployment

Oskar Schirmer

The design of a parallel computing system using several thousand or even up to a million processors calls for processing units that are simple and thus small, so that as many processing units as possible fit on a single die. The design presented here is far from optimised; it is not meant to compete with industry performance devices. Its main purpose is to allow for a prototypical implementation of a dynamic software system as a proof of concept.

en cs.DC, cs.AR
arXiv Open Access 2016
Horn: A System for Parallel Training and Regularizing of Large-Scale Neural Networks

Edward J. Yoon

I introduce a new distributed system for effective training and regularizing of Large-Scale Neural Networks on distributed computing architectures. The experiments demonstrate the effectiveness of flexible model partitioning and parallelization strategies based on a neuron-centric computation model, with an implementation of collective and parallel dropout neural network training. Experiments are performed on MNIST handwritten digit classification, and results are included.

en cs.DC, cs.LG
arXiv Open Access 2015
Building a Virtual HPC Cluster with Auto Scaling by the Docker

Hsi-En Yu, Weicheng Huang

Solving the software dependency issue in the HPC environment has always been a difficult task for both computing system administrators and application scientists. This work tackles the issue by introducing modern container technology, specifically Docker. By integrating the auto-scaling feature of service discovery with the lightweight virtualization tool Docker, the construction of a virtual cluster on top of physical cluster hardware is attempted. Thus, through the isolation of the computing environment, a remedy for the software dependency problem of the HPC environment is possible.

en cs.DC
arXiv Open Access 2014
Survey of Parallel Computing with MATLAB

Zaid Abdi Alkareem Alyasseri

Matlab is one of the most widely used mathematical computing environments in technical computing. It has an interactive environment that provides high performance computing (HPC) procedures and is easy to use. Parallel computing with Matlab has been an area of interest for parallel computing researchers for a number of years, and there have been many attempts to parallelize Matlab. In this paper, we present most of the past and present attempts at parallel Matlab, such as MatlabMPI, bcMPI, pMatlab, Star-P and PCT. Finally, we anticipate future attempts.

en cs.DC
