Specializing systems to the specifics of the workload they serve and the platform they run on often significantly improves performance. However, specializing systems is difficult in practice because of compounding challenges: i) the complexity for developers of determining and implementing the optimal specialization; ii) the inherent loss of generality of the resulting implementation; and iii) the difficulty of identifying and implementing a single optimal specialized configuration for the messy reality of modern systems. To address this, we introduce Iridescent, a framework for automated online system specialization guided by observed overall system performance. Iridescent lets developers specify a space of possible specialization choices, and then at runtime generates and runs different specialization choices through JIT compilation as the system runs. By using overall system performance metrics to guide this search, developers can use Iridescent to find optimal system specializations for the hardware and workload conditions at a given time. We demonstrate its feasibility, effectiveness, and ease of use.
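Iridescent's JIT-based search is not reproduced here, but the core idea of performance-guided selection among candidate specializations can be illustrated with a minimal Python sketch; all function names and the toy specialization space below are hypothetical, not taken from the paper:

```python
import time

def measure(fn, data, repeats=5):
    """Return the best-of-N wall-clock time for one specialization candidate."""
    best = float("inf")
    for _ in range(repeats):
        start = time.perf_counter()
        fn(data)
        best = min(best, time.perf_counter() - start)
    return best

# Hypothetical specialization space: two ways of summing a workload.
def generic_sum(data):
    total = 0
    for x in data:
        total += x
    return total

def builtin_sum(data):
    return sum(data)

def pick_specialization(candidates, data):
    """Run every candidate on the live workload and keep the fastest,
    mirroring Iridescent's metric-guided search (minus JIT compilation)."""
    return min(candidates, key=lambda fn: measure(fn, data))

workload = list(range(100_000))
best = pick_specialization([generic_sum, builtin_sum], workload)
```

In the real system the candidates would be JIT-compiled variants and the metric an end-to-end system measurement rather than a micro-benchmark.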
Wonkyo Choe, Rongxiang Wang, Afsara Benazir
et al.
Proto is a new instructional OS that runs on commodity, portable hardware. It showcases modern features, including per-app address spaces, threading, commodity filesystems, USB, DMA, multicore support, self-hosted debugging, and a window manager. It supports rich applications such as 2D/3D games, music and video players, and a blockchain miner. Unlike traditional instructional systems, Proto emphasizes engaging, media-rich apps that go beyond basic terminal programs. Our method breaks down a full-featured OS into a set of incremental, self-contained prototypes. Each prototype introduces a minimal set of OS mechanisms, driven by the needs of specific apps. The construction process then progressively enables these apps by bringing up one mechanism at a time. Proto enables a wider audience to experience building a self-contained software system used in daily life.
Deep learning training relies on periodic checkpoints to recover from failures, but unsafe checkpoint installation can leave corrupted files on disk. This paper presents an experimental study of checkpoint installation protocols and integrity validation for AI training on macOS/APFS. We implement three write modes with increasing durability guarantees: unsafe (baseline, no fsync), atomic_nodirsync (file-level durability via fsync()), and atomic_dirsync (file + directory durability). We design a format-agnostic integrity guard using SHA-256 checksums with automatic rollback. Through controlled experiments including crash injection (430 unsafe-mode trials) and corruption injection (1,600 atomic-mode trials), we demonstrate that the integrity guard detects 99.8-100% of corruptions with zero false positives. Performance overhead is 56.5-108.4% for atomic_nodirsync and 84.2-570.6% for atomic_dirsync relative to the unsafe baseline. Our findings quantify the reliability-performance trade-offs and provide deployment guidance for production AI infrastructure.
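The atomic_dirsync write mode and the integrity guard described above can be sketched in Python. The structure (temp file, fsync, atomic rename, directory fsync, SHA-256 check with rollback) follows the abstract; the function names and the rollback-by-deletion policy are simplifications. Note that on macOS/APFS, fsync() alone may not force data to stable media without fcntl's F_FULLFSYNC:

```python
import hashlib
import os
import tempfile

def sha256_of(path):
    """Stream a file through SHA-256, 1 MiB at a time."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(1 << 20), b""):
            h.update(chunk)
    return h.hexdigest()

def atomic_write(path, payload, dirsync=True):
    """Write payload to a temp file, fsync it, rename into place, and
    (optionally) fsync the parent directory -- the 'atomic_dirsync' mode."""
    d = os.path.dirname(os.path.abspath(path))
    fd, tmp = tempfile.mkstemp(dir=d)
    try:
        with os.fdopen(fd, "wb") as f:
            f.write(payload)
            f.flush()
            os.fsync(f.fileno())          # file-level durability
        os.replace(tmp, path)             # atomic rename into place
        if dirsync:                       # directory-entry durability
            dfd = os.open(d, os.O_RDONLY)
            try:
                os.fsync(dfd)
            finally:
                os.close(dfd)
    except BaseException:
        if os.path.exists(tmp):
            os.unlink(tmp)
        raise

def install_checkpoint(path, payload, expected_digest):
    """Integrity guard: verify the on-disk checksum; roll back on mismatch.
    (Simplified: rollback here deletes the corrupt file rather than
    restoring the previous checkpoint.)"""
    atomic_write(path, payload)
    if sha256_of(path) != expected_digest:
        os.unlink(path)                   # automatic rollback
        raise IOError("checkpoint corrupted; rolled back")
```

The unsafe baseline corresponds to skipping both fsync calls; atomic_nodirsync corresponds to dirsync=False.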
Aditya Dani, Shardul Mangade, Piyush Nimbalkar
et al.
The growing value of data as a strategic asset has given rise to the necessity of implementing reliable backup and recovery solutions in the most efficient and cost-effective manner. The data backup methods available today on Linux are not effective enough because, while running, most of them block I/O to guarantee data integrity. We propose and implement Next4, a file-system-based snapshot feature in Ext4 that creates an instant image of the file system to provide incremental versions of data, enabling reliable backup and data recovery. In our design, the snapshot feature is implemented by efficiently infusing the copy-on-write strategy into the write-in-place, extent-based Ext4 file system, without affecting its basic structure. Each snapshot is an incremental backup of the data within the system. What distinguishes Next4 is the way the data is backed up, improving both space utilization and performance.
Hayley LeBlanc, Nathan Taylor, James Bornholt
et al.
This work introduces a new approach to building crash-safe file systems for persistent memory. We exploit the fact that Rust's typestate pattern allows compile-time enforcement of a specific order of operations. We introduce a novel crash-consistency mechanism, Synchronous Soft Updates, that boils down crash safety to enforcing ordering among updates to file-system metadata. We employ this approach to build SquirrelFS, a new file system with crash-consistency guarantees that are checked at compile time. SquirrelFS avoids the need for separate proofs, instead incorporating correctness guarantees into the typestate itself. Compiling SquirrelFS takes only tens of seconds; successful compilation indicates crash consistency, while an error provides a starting point for fixing the bug. We evaluate SquirrelFS against state-of-the-art file systems such as NOVA and WineFS, and find that SquirrelFS achieves similar or better performance on a wide range of benchmarks and applications.
GPU remoting is a promising technique for supporting AI applications, and networking plays a key role in enabling it. However, the network requirements for efficient remoting, in terms of latency and bandwidth, are unknown. In this paper, we take a GPU-centric approach to derive the minimum latency and bandwidth requirements for GPU remoting while ensuring no (or little) performance degradation for AI applications. Our study, including a theoretical model, demonstrates that, with careful remoting design, unmodified AI applications can run on a remoting setup using commodity networking hardware with no overhead (or even with better performance) and low network demands.
Operational and performance characteristics of flash SSDs have long been associated with a set of Unwritten Contracts due to their hidden, complex internals and lack of control from the host software stack. These unwritten contracts govern how data should be stored, accessed, and garbage collected. The emergence of Zoned Namespace (ZNS) flash devices with their open and standardized interface allows us to write these unwritten contracts for the storage stack. However, even with a standardized storage-host interface, due to the lack of appropriate end-to-end operational data collection tools, quantifying and reasoning about such contracts remains a challenge. In this paper, we propose zns.tools, an open-source framework for end-to-end event and metadata collection, analysis, and visualization for ZNS SSD contract analysis. We showcase how zns.tools can be used to understand how the combination of RocksDB with the F2FS file system interacts with the underlying storage. Our tools are available openly at \url{https://github.com/stonet-research/zns-tools}.
In this work, we propose MUSTACHE, a new page cache replacement algorithm whose logic is learned from observed memory access requests rather than fixed like existing policies. We formulate the page request prediction problem as a categorical time series forecasting task. Then, our method queries the learned page request forecaster to obtain the next $k$ predicted page memory references to better approximate the optimal Bélády's replacement algorithm. We implement several forecasting techniques using advanced deep learning architectures and integrate the best-performing one into an existing open-source cache simulator. Experiments run on benchmark datasets show that MUSTACHE outperforms the best page replacement heuristic (i.e., exact LRU), improving the cache hit ratio by 1.9% and reducing the number of reads/writes required to handle cache misses by 18.4% and 10.3%.
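The way a predicted window of future references approximates Bélády's MIN policy (evict the page whose next reuse is farthest in the future) can be sketched as follows. Here the "forecast" is simulated by peeking at the true trace, whereas MUSTACHE would query its learned forecaster; all names are illustrative:

```python
def evict_victim(cache, future):
    """Pick the cached page whose next predicted use is farthest away
    (or never predicted again), as Belady's MIN would with a full oracle."""
    def next_use(page):
        try:
            return future.index(page)
        except ValueError:
            return float("inf")   # never referenced again: ideal victim
    return max(cache, key=next_use)

def simulate(requests, capacity, lookahead=8):
    """Count hits for a cache driven by a k-step lookahead window."""
    cache, hits = set(), 0
    for i, page in enumerate(requests):
        if page in cache:
            hits += 1
            continue
        if len(cache) >= capacity:
            # In MUSTACHE this window would come from the learned page
            # request forecaster; here we cheat and peek at the true trace.
            window = requests[i + 1 : i + 1 + lookahead]
            cache.discard(evict_victim(cache, window))
        cache.add(page)
    return hits
```

The quality of the approximation then hinges entirely on the forecaster's accuracy over the k-step window.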
Fabien Bouquillon, Clément Ballabriga, Giuseppe Lipari
et al.
The predictability of a system is the condition for giving safe bounds on the worst-case execution time (WCET) of the real-time tasks running on it. Commercial off-the-shelf (COTS) processors are increasingly used in embedded systems and contain shared cache memory. This component is hard to predict because its state depends on the execution history of the system. To increase the predictability of COTS components, we use cache coloring, a technique widely used to partition cache memory. Our main contribution is a WCET-aware heuristic that partitions the cache according to the needs of each task. Our experiments are made with CPLEX, an ILP solver, on randomly generated task sets running on a preemptive system scheduled with earliest deadline first (EDF).
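As background on the cache coloring technique itself, a page's color is determined by the cache set-index bits that lie above the page offset; a minimal sketch with hypothetical cache parameters (not taken from the paper):

```python
# Hypothetical L2 cache: 512 KiB, 8-way set-associative, 4 KiB pages.
CACHE_SIZE, WAYS, PAGE = 512 * 1024, 8, 4096

# Number of colors = cache size per way, divided by the page size.
NUM_COLORS = CACHE_SIZE // (WAYS * PAGE)      # 16 colors with these numbers

def color_of(phys_addr):
    """A page's color comes from the set-index bits above the page offset."""
    return (phys_addr >> 12) % NUM_COLORS     # 12 = log2(PAGE)

# Pages with different colors map to disjoint cache sets and can never
# evict each other's lines, so giving each task its own color range
# partitions the shared cache among tasks.
```

A WCET-aware partitioning heuristic then decides how many colors each task receives based on its cache needs.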
In some important application areas of hard real-time systems, preemptive sporadic tasks with harmonic periods and constrained deadlines running on a uniprocessor platform play an important role. We propose a new algorithm for determining the exact worst-case response time of a task; it has lower computational complexity (linear in the number of tasks) than the known algorithm developed for the same system class. We also allow task executions to start delayed due to release jitter, provided the jitter values lie within certain ranges. To check whether these constraints are met, we define a constraint programming problem that has a special structure and can be solved with heuristic components in time linear in the number of tasks. If the check confirms the admissibility of the jitter values, the linear-time algorithm can also be used to determine the worst-case response time for jitter-aware systems.
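For context, the classic fixed-point response-time analysis that such a linear-time algorithm improves upon can be sketched as follows; this is the standard recurrence for fixed-priority scheduling (tasks indexed by decreasing priority), not the paper's own algorithm:

```python
import math

def wcrt(i, C, T):
    """Iterative worst-case response time of task i: the smallest R with
    R = C[i] + sum_j<i ceil(R / T[j]) * C[j], where tasks 0..i-1 have
    higher priority. Pseudo-polynomial in general; for harmonic periods
    the fixed point is reached after very few iterations."""
    R = C[i]
    while True:
        nxt = C[i] + sum(math.ceil(R / T[j]) * C[j] for j in range(i))
        if nxt == R:
            return R
        if nxt > T[i]:          # response exceeds the period: unschedulable
            return None
        R = nxt
```

The paper's contribution is replacing this iteration with a closed, linear-in-n computation valid for the harmonic-period case.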
Carlos San Vicente Gutiérrez, Lander Usategui San Juan, Irati Zamalloa Ugarte
et al.
As robotics systems become more distributed, the communication between different robot modules plays a key role in the reliability of the overall robot control. In this paper, we present a study of the Linux communication stack for real-time robotic applications. We evaluate the real-time performance of UDP-based communication in Linux on multi-core embedded devices as test platforms. We show that, under an appropriate configuration, the Linux kernel greatly enhances the determinism of communication using the UDP protocol. Furthermore, we demonstrate that concurrent traffic disrupts the bounded latencies and propose a solution: isolating the real-time application and the corresponding interrupt on a dedicated CPU.
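A simple way to collect the kind of latency samples such a study relies on is a UDP round-trip measurement over loopback. This sketch omits the paper's kernel configuration and CPU/IRQ isolation steps, which are what actually tighten the tail of the resulting histogram:

```python
import socket
import time

def udp_rtt_samples(n=100):
    """Measure n UDP round-trip times over loopback, in seconds.
    Client and 'echo server' run in one process for simplicity."""
    srv = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    srv.bind(("127.0.0.1", 0))            # let the kernel pick a free port
    server_addr = srv.getsockname()
    cli = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    samples = []
    for _ in range(n):
        t0 = time.perf_counter()
        cli.sendto(b"ping", server_addr)
        data, peer = srv.recvfrom(64)     # server side: receive and echo
        srv.sendto(data, peer)
        cli.recvfrom(64)                  # client side: round trip complete
        samples.append(time.perf_counter() - t0)
    srv.close()
    cli.close()
    return samples
```

On a PREEMPT_RT-configured kernel, pinning the measuring process and the NIC interrupt to a dedicated CPU (as the paper proposes) is what bounds the worst-case samples under concurrent traffic.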
Energy efficiency is one of the most critical design criteria for modern embedded systems such as multiprocessor system-on-chips (MPSoCs). Dynamic voltage and frequency scaling (DVFS) and dynamic power management (DPM) are two major techniques for reducing energy consumption in such embedded systems. Furthermore, MPSoCs are becoming more popular for many real-time applications. One of the challenges of integrating DPM with DVFS and task scheduling of real-time applications on MPSoCs is the modeling of idle intervals on these platforms. In this paper, we present a novel approach for modeling idle intervals in MPSoC platforms which leads to a mixed integer linear programming (MILP) formulation integrating DPM, DVFS, and task scheduling of periodic task graphs subject to a hard deadline. We also present a heuristic approach for solving the MILP and compare its results with those obtained from solving the MILP.
The emerging hybrid DRAM-NVM architecture challenges the existing memory management mechanisms in operating systems. In this paper, we introduce memos, which can schedule memory resources over the entire memory hierarchy, including the cache, channels, and main memory comprising DRAM and NVM, simultaneously. Powered by our newly designed kernel-level monitoring module and page migration engine, memos can dynamically optimize data placement across the memory hierarchy according to online memory access patterns, current resource utilization, and the characteristics of the memory medium. Our experimental results show that memos achieves high memory utilization, improving system throughput by 19.1% and QoS by 23.6% on average. Moreover, memos can reduce NVM-side memory latency by 3-83.3% and energy consumption by 25.1-99%, and significantly extends NVM lifetime (40X improvement on average).
Houssam Eddine Zahaf, Giuseppe Lipari, Luca Abeni
et al.
This paper presents a new strategy for scheduling soft real-time tasks on multiple identical cores. The proposed approach is based on partitioned CPU reservations and uses a reclaiming mechanism to reduce the number of missed deadlines. We introduce the possibility for a task to temporarily migrate to another, less loaded, CPU when it has exhausted the reserved bandwidth on its allocated CPU. In addition, we propose a simple load balancing method to decrease the number of deadlines missed by the tasks. The proposed algorithm has been evaluated through simulations, showing its effectiveness (compared to other multi-core reclaiming approaches) and comparing the performance of different partitioning heuristics (Best Fit, Worst Fit, and First Fit).
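The three partitioning heuristics compared in the evaluation can be sketched as a utilization-based bin-packing step; the task model here is simplified to a bare utilization value per reservation, which is an illustrative assumption rather than the paper's model:

```python
def partition(utils, n_cores, heuristic="first"):
    """Assign reservation utilizations to cores with First Fit, Best Fit
    (most loaded core that still fits), or Worst Fit (least loaded core).
    Returns the per-core loads, or None if some reservation fits nowhere
    (i.e., any placement would push a core's load above 1.0)."""
    loads = [0.0] * n_cores
    for u in sorted(utils, reverse=True):          # decreasing utilization
        fits = [i for i in range(n_cores) if loads[i] + u <= 1.0]
        if not fits:
            return None
        if heuristic == "first":
            core = fits[0]
        elif heuristic == "best":
            core = max(fits, key=lambda i: loads[i])
        else:                                      # "worst"
            core = min(fits, key=lambda i: loads[i])
        loads[core] += u
    return loads
```

Worst Fit tends to leave slack on every core (helpful for reclaiming and temporary migration), while Best/First Fit pack cores tightly and free up whole cores.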
Myopic is a hard real-time process scheduling algorithm that selects a suitable process, based on a heuristic function, from a subset (window) of all ready processes instead of choosing from all available processes as the original heuristic scheduling algorithm does. The performance of the algorithm depends significantly on the chosen heuristic function, which assigns weights to parameters such as deadline, earliest start time, and processing time, and on the size of the window, since it considers only k of the n ready processes (where k <= n). This research evaluates the performance of the Myopic algorithm for different parameters to demonstrate the merits and constraints of the algorithm. A comparative analysis of the impact of window size in implementing the Myopic algorithm is presented and discussed through a set of experiments.
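The window-plus-heuristic selection can be sketched as follows. The heuristic form H = deadline + w * earliest_start_time follows the original Myopic formulation (Ramamritham et al.); the process representation and names are illustrative:

```python
def myopic_pick(ready, k, w=1.0):
    """Myopic selection: restrict attention to the k ready processes with
    the earliest deadlines (the window), then pick the process minimizing
    H = deadline + w * earliest_start_time.
    Each process is a (name, deadline, earliest_start, proc_time) tuple."""
    window = sorted(ready, key=lambda p: p[1])[:k]   # deadline-ordered window
    return min(window, key=lambda p: p[1] + w * p[2])
```

With k equal to the number of ready processes, this degenerates to the original (non-myopic) heuristic algorithm; smaller k trades scheduling quality for lower per-decision cost, which is exactly the trade-off the experiments probe.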
Youcheng Sun, Giuseppe Lipari, Étienne André
et al.
We propose here a framework to model real-time components consisting of concurrent real-time tasks running on a single processor, using parametric timed automata. Our framework is generic and modular, so as to be easily adapted to different schedulers and more complex task models. We first perform a parametric schedulability analysis of the components using the inverse method. We show that the method unfortunately does not provide satisfactory results when the task periods are considered as parameters. After identifying and explaining the problem, we present a solution adapting the model by making use of the worst-case scenario in schedulability analysis. We show that the analysis with the inverse method always converges on the modified model when the system load is strictly less than 100%. Finally, we show how to use our parametric analysis for the generation of timed interfaces in compositional system design.
With the advancement of the automation industry, performing complex remote operations has become a requirement. Advances in networking technology have led to the development of different architectures for implementing control over large distances. In various control applications of modern industry, agents such as sensors, actuators, and controllers are geographically distributed. For a control application to work efficiently, all of the agents have to exchange information through a communication medium. At present, an increasing number of distributed control systems are based on platforms made up of conventional PCs running open-source real-time operating systems. Often, these systems need networked devices that support operations synchronized across nodes. We study a framework that relies on standard software and protocols such as RTAI, EtherCAT, RTnet, and IEEE 1588, and examine RTAI and these protocols in a networked control systems environment.
Traditional monolithic kernels dominated kernel design for a long time, in an era of small kernels, few hardware vendors, and limited kernel functionality. The monolithic structure became impractical as the number of hardware vendors increased and kernel services were consumed by different users for many purposes. One of the biggest disadvantages of monolithic kernels is their inflexibility: all available modules must be included in kernel compilation, which is highly time-consuming. Recently, new kernel structures have been introduced through multicore operating systems. Unfortunately, many multicore operating systems, such as Barrelfish and FOS, are experimental. This paper aims to simulate the performance of multicore hybrid kernels through dynamic, customized kernel-module attachment/detachment on multicore machines. In addition, this paper proposes a new technique for loading dynamic kernel modules based on user needs and machine capabilities.
Changing functional and non-functional software implementations at runtime is useful, and sometimes even critical, in both development and production environments. JooFlux is a JVM agent that allows both the dynamic replacement of method implementations and the application of aspect advices. It works by performing bytecode transformations to take advantage of the new invokedynamic instruction, added in Java SE 7 to help implement dynamic languages on the JVM. JooFlux can be managed using a JMX agent to perform dynamic modifications at runtime, without resorting to a dedicated domain-specific language. We compared JooFlux with existing AOP platforms and dynamic languages. The results demonstrate that JooFlux's performance is close to that of plain Java, with a marginal overhead most of the time and sometimes even a gain, whereas AOP platforms and dynamic languages present significant overheads. This paves the way for interesting future evolutions and applications of JooFlux.