CrossRef Open Access 2025

CryoDataBot: a pipeline to curate cryoEM datasets for AI-driven structural biology

Qibo Xu, Leon Wu, Michael Rebelo, Shi Feng, Xinye Yu, +3 others

Abstract

Cryogenic electron microscopy (cryoEM) has revolutionized structural biology by enabling atomic-resolution visualization of biomacromolecules. To automate atomic model building from cryoEM maps, artificial intelligence (AI) methods have emerged as powerful tools. Although high-quality, task-specific datasets play a critical role in AI-based modeling, assembling such resources often requires considerable effort and domain expertise. We present CryoDataBot, an automated pipeline that addresses this gap. It streamlines data retrieval, preprocessing, and labeling, with fine-grained quality control and flexible customization, enabling efficient generation of robust datasets. CryoDataBot’s effectiveness is demonstrated through improved training efficiency in U-Net models and rapid, effective retraining of CryoREAD, a widely used RNA modeling tool. By simplifying the workflow and offering customizable quality control, CryoDataBot enables researchers to easily tailor dataset construction to the specific objectives of their models, while ensuring high data quality and reducing manual workload. This flexibility supports a wide range of applications in AI-driven structural biology.

Authors (8)

Qibo Xu

Leon Wu

Michael Rebelo

Shi Feng

Xinye Yu

Farhanaz Farheen

Daisuke Kihara

Z. Hong Zhou

Citation Format

Xu, Q., Wu, L., Rebelo, M., Feng, S., Yu, X., Farheen, F. et al. (2025). CryoDataBot: a pipeline to curate cryoEM datasets for AI-driven structural biology. https://doi.org/10.1101/2025.09.09.675185

Journal Information
Publication Year: 2025
Language: en
Source Database: CrossRef
DOI: 10.1101/2025.09.09.675185
Access: Open Access