arXiv Open Access 2025

Active Learning for Machine Learning Driven Molecular Dynamics

Kevin Bachelor Sanya Murdeshwar Daniel Sabo Razvan Marinescu

Abstract

Machine-learned coarse-grained (CG) potentials are fast, but they degrade over time when simulations reach under-sampled biomolecular conformations, and generating extensive all-atom (AA) data to combat this is computationally infeasible. We propose a novel active learning (AL) framework for CG neural network potentials in molecular dynamics (MD). Building on the CGSchNet model, our method uses root-mean-square deviation (RMSD)-based frame selection from MD simulations to generate data on the fly by querying an oracle during the training of a neural network potential. This framework preserves CG-level efficiency while correcting the model at precise, RMSD-identified coverage gaps. By training CGSchNet, we empirically show that our framework explores previously unseen configurations and trains the model on unexplored regions of conformational space. Our active learning framework enables a CGSchNet model trained on the Chignolin protein to achieve a 33.05% improvement in the Wasserstein-1 (W1) metric in Time-lagged Independent Component Analysis (TICA) space on an in-house benchmark suite.
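
The abstract does not spell out the selection rule, so the following is only a minimal sketch of one plausible reading of RMSD-based frame selection: score each simulation frame by its minimum RMSD to any training frame, and send the highest-scoring frames (the apparent coverage gaps) to the oracle. The function names `rmsd` and `select_frames`, the parameter `k`, and the toy coordinates are hypothetical and do not come from the paper. The sketch also assumes frames are already superimposed, so no rotational or translational alignment is performed.

```python
import math


def rmsd(frame_a, frame_b):
    """RMSD between two frames, each a list of (x, y, z) bead coordinates.

    Assumption: frames are pre-aligned; no superposition is done here.
    """
    if len(frame_a) != len(frame_b):
        raise ValueError("frames must have the same number of beads")
    squared = sum(
        (a - b) ** 2
        for bead_a, bead_b in zip(frame_a, frame_b)
        for a, b in zip(bead_a, bead_b)
    )
    return math.sqrt(squared / len(frame_a))


def select_frames(sim_frames, train_frames, k=1):
    """Pick the k simulation frames farthest from the training set.

    Each frame is scored by its minimum RMSD to any training frame;
    a large score suggests a region the training data does not cover.
    Returns (indices of selected frames, list of all scores).
    """
    scores = [min(rmsd(f, t) for t in train_frames) for f in sim_frames]
    ranked = sorted(range(len(sim_frames)), key=lambda i: scores[i], reverse=True)
    return ranked[:k], scores


# Toy two-bead system (hypothetical coordinates, arbitrary units).
train = [[(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]]
sim = [
    [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)],  # identical to the training frame
    [(0.0, 0.0, 0.0), (1.0, 1.0, 0.0)],  # small deviation
    [(0.0, 0.0, 0.0), (1.0, 3.0, 0.0)],  # large deviation
]
picked, scores = select_frames(sim, train, k=1)
print(picked)  # index of the frame that would be sent to the oracle
```

In an active learning loop of the kind the abstract describes, the selected frames would presumably be labeled by the oracle and added to the training set before training continues. A practical implementation would likely also align frames before computing RMSD.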

Citation Format

Bachelor, K., Murdeshwar, S., Sabo, D., Marinescu, R. (2025). Active Learning for Machine Learning Driven Molecular Dynamics. https://arxiv.org/abs/2509.17208

Journal Information
Publication year: 2025
Language: en
Source database: arXiv
Access: Open Access