arXiv Open Access 2024

Accelerating Time-to-Science by Streaming Detector Data Directly into Perlmutter Compute Nodes

Samuel S. Welborn, Bjoern Enders, Chris Harris, Peter Ercius, Deborah J. Bard

Abstract

Recent advancements in detector technology have significantly increased the size and complexity of experimental data, and high-performance computing (HPC) provides a path towards more efficient and timely data processing. However, movement of large data sets from acquisition systems to HPC centers introduces bottlenecks owing to storage I/O at both ends. This manuscript introduces a streaming workflow designed for a high-data-rate electron detector that streams data directly to compute node memory at the National Energy Research Scientific Computing Center (NERSC), thereby avoiding storage I/O. The new workflow deploys ZeroMQ-based services for data production, aggregation, and distribution for on-the-fly processing, all coordinated through a distributed key-value store. The system is integrated with the detector's science gateway and uses the NERSC Superfacility API to initiate streaming jobs through a web-based frontend. Our approach achieves up to a 14-fold increase in data throughput and enhances predictability and reliability compared to an I/O-heavy file-based transfer workflow. Our work highlights the transformative potential of streaming workflows to expedite data analysis for time-sensitive experiments.
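
The production, aggregation, and distribution stages named in the abstract can be illustrated with a small in-memory sketch. This is not the authors' implementation: standard-library queues stand in for the ZeroMQ sockets, a plain dictionary stands in for the distributed key-value store, and every name here (`SECTORS_PER_FRAME`, `run_pipeline`, the `worker-N` labels, the byte-sum "processing") is a hypothetical choice made only for illustration.

```python
import queue
import threading
from collections import defaultdict

SECTORS_PER_FRAME = 4  # assumption: each frame arrives as 4 partial messages
STOP = None            # sentinel marking the end of a stream


def producer(out_q, sector_id, n_frames):
    """Emit one partial message per frame for a single detector sector."""
    for frame in range(n_frames):
        out_q.put((frame, sector_id, bytes([sector_id]) * 8))
    out_q.put(STOP)


def aggregator(in_q, worker_qs, n_producers):
    """Assemble partial messages into full frames and fan them out."""
    partial = defaultdict(dict)
    stopped = 0
    while stopped < n_producers:
        msg = in_q.get()
        if msg is STOP:
            stopped += 1
            continue
        frame, sector_id, payload = msg
        partial[frame][sector_id] = payload
        if len(partial[frame]) == SECTORS_PER_FRAME:
            sectors = partial.pop(frame)
            data = b"".join(sectors[s] for s in sorted(sectors))
            # Distribute complete frames across workers by frame number.
            worker_qs[frame % len(worker_qs)].put((frame, data))
    for q in worker_qs:
        q.put(STOP)


def worker(in_q, results, name):
    """Process frames held in memory; the 'analysis' is just a byte sum."""
    while (msg := in_q.get()) is not STOP:
        frame, data = msg
        results[frame] = (name, sum(data))


def run_pipeline(n_frames=6, n_workers=2):
    # The dict below is a stand-in for a distributed key-value store
    # in which the participating workers are registered.
    kv_store = {"workers": [f"worker-{i}" for i in range(n_workers)]}
    agg_q = queue.Queue()
    worker_qs = [queue.Queue() for _ in kv_store["workers"]]
    results = {}
    threads = [
        threading.Thread(target=producer, args=(agg_q, s, n_frames))
        for s in range(SECTORS_PER_FRAME)
    ]
    threads.append(
        threading.Thread(
            target=aggregator, args=(agg_q, worker_qs, SECTORS_PER_FRAME)
        )
    )
    threads += [
        threading.Thread(target=worker, args=(q, results, name))
        for q, name in zip(worker_qs, kv_store["workers"])
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return results
```

Calling `run_pipeline()` returns a dictionary keyed by frame number. No frame touches disk between the producers and the workers, which is the property the streaming workflow relies on to avoid storage I/O.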

Authors (5)

Samuel S. Welborn
Bjoern Enders
Chris Harris
Peter Ercius
Deborah J. Bard

Citation Format

Welborn, S.S., Enders, B., Harris, C., Ercius, P., Bard, D.J. (2024). Accelerating Time-to-Science by Streaming Detector Data Directly into Perlmutter Compute Nodes. https://arxiv.org/abs/2403.14352

Journal Information
Publication Year: 2024
Language: en
Source Database: arXiv
Access: Open Access