Semantic Scholar Open Access 2024 100 citations

Unsupervised evolution of protein and antibody complexes with a structure-informed language model

Varun R. Shanker, Theodora U. J. Bruun, Brian L. Hie, Peter S. Kim

Abstract

Large language models trained on sequence information alone can learn high-level principles of protein design. However, beyond sequence, the three-dimensional structures of proteins determine their specific function, activity, and evolvability. Here, we show that a general protein language model augmented with protein structure backbone coordinates can guide evolution for diverse proteins without the need to model individual functional tasks. We also demonstrate that ESM-IF1, which was only trained on single-chain structures, can be extended to engineer protein complexes. Using this approach, we screened about 30 variants of two therapeutic clinical antibodies used to treat severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) infection. We achieved up to 25-fold improvement in neutralization and 37-fold improvement in affinity against antibody-escaped viral variants of concern BQ.1.1 and XBB.1.5, respectively. These findings highlight the advantage of integrating structural information to identify efficient protein evolution trajectories without requiring any task-specific training data.

Editor’s summary

Despite tremendous advances in protein structure prediction, connecting sequence to function is key for the in silico engineering of proteins for various tasks. Focusing on the problem of antibody engineering, Shanker et al. used a structure-informed protein language model to predict high-fitness sequences constrained by the known structure of the antibody or antibody-antigen complex. In experimental screens of virus-neutralizing antibodies, the authors observed substantial improvement in binding affinity and neutralization for their predicted sequences. These results demonstrate the potential for machine learning and protein language models trained on protein sequence information to contribute to protein engineering tasks even in the absence of task-specific training data.

-- Michael A. Funk

Authors (4)

Varun R. Shanker

Theodora U. J. Bruun

Brian L. Hie

Peter S. Kim

Citation Format

Shanker, V.R., Bruun, T.U.J., Hie, B.L., Kim, P.S. (2024). Unsupervised evolution of protein and antibody complexes with a structure-informed language model. https://doi.org/10.1126/science.adk8946

Journal Information

Publication Year: 2024
Language: en
Total Citations: 100
Database Source: Semantic Scholar
DOI: 10.1126/science.adk8946
Access: Open Access