arXiv Open Access 2025

Seq2Bind Webserver for Decoding Binding Hotspots directly from Sequences using Fine-Tuned Protein Language Models

Xiang Ma, Supantha Dey, Vaishnavey SR, Casey Zelinski, Qi Li, Ratul Chowdhury

Abstract

Decoding protein-protein interactions (PPIs) at the residue level is crucial for understanding cellular mechanisms and developing targeted therapeutics. We present Seq2Bind Webserver, a computational framework that uses fine-tuned protein language models (PLMs) to predict binding affinity between proteins and to identify critical binding residues directly from sequences, removing the structural requirements that limit most affinity prediction tools. We fine-tuned four architectures (ProtBERT, ProtT5, ESM2, and BiLSTM) on the SKEMPI 2.0 dataset, which contains 5,387 protein pairs with experimental binding affinities. Through systematic alanine mutagenesis of each residue in 14 therapeutically relevant protein complexes, we evaluated each model's ability to identify interface residues. Performance was assessed using N-factor metrics, where N-factor=3 evaluates whether true residues appear within the top 3n predictions for n interface residues. ESM2 achieved 49.5% accuracy at N-factor=3, and both ESM2 (37.2%) and ProtBERT (35.1%) outperformed the structure-based docking method HADDOCK3 (32.1%) at N-factor=2. Our sequence-based approach enables rapid screening (minutes versus hours for docking), handles disordered proteins, and provides accuracy comparable to docking, making Seq2Bind a valuable prior for steering blind docking protocols toward putative binding residues on each protein of a therapeutic target. Seq2Bind Webserver is accessible at https://agrivax.onrender.com under the StructF suite.
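The alanine scan and the N-factor evaluation described in the abstract can be illustrated with a minimal Python sketch. This is not the authors' implementation. The function names `alanine_scan_variants` and `n_factor_recall` are hypothetical, the per-residue scores are assumed to be precomputed by some model (for example, the predicted affinity change for each alanine mutant), and "accuracy" is assumed here to mean the fraction of true interface residues recovered within the top N·n ranked positions, which the abstract does not spell out.

```python
from typing import Dict, Iterable, List, Tuple


def alanine_scan_variants(sequence: str) -> List[Tuple[int, str]]:
    """Return (position, mutant_sequence) pairs with one residue set to alanine.

    Assumption: positions that are already alanine are skipped, because the
    mutant would be identical to the wild type.
    """
    variants = []
    for i, residue in enumerate(sequence):
        if residue == "A":
            continue
        variants.append((i, sequence[:i] + "A" + sequence[i + 1:]))
    return variants


def n_factor_recall(
    scores: Dict[int, float],
    interface: Iterable[int],
    n_factor: int,
) -> float:
    """Fraction of true interface residues found in the top n_factor * n positions.

    scores    : position -> importance score (higher means more likely a hotspot)
    interface : positions of the n known interface residues
    n_factor  : multiplier on n that sets the prediction budget
    """
    interface_set = set(interface)
    if not interface_set:
        raise ValueError("interface must contain at least one residue")
    budget = n_factor * len(interface_set)
    # Rank positions from highest to lowest score and keep the top `budget`.
    ranked = sorted(scores, key=lambda pos: scores[pos], reverse=True)
    top_predictions = set(ranked[:budget])
    return len(top_predictions & interface_set) / len(interface_set)


if __name__ == "__main__":
    # Toy sequence and made-up scores, for illustration only.
    print(alanine_scan_variants("MAK"))
    toy_scores = {0: 0.9, 1: 0.1, 2: 0.5, 3: 0.8, 4: 0.2, 5: 0.05}
    toy_interface = [0, 4]
    for factor in (1, 2, 3):
        print(factor, n_factor_recall(toy_scores, toy_interface, factor))
```

A larger N-factor widens the prediction budget, so the recovered fraction can only stay the same or grow, which is consistent with the ESM2 figures reported at N-factor=2 and N-factor=3.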

Topics & Keywords

q-bio.QM

Authors (6)

Xiang Ma

Supantha Dey

Vaishnavey SR

Casey Zelinski

Qi Li

Ratul Chowdhury

Citation Format

Ma, X., Dey, S., SR, V., Zelinski, C., Li, Q., & Chowdhury, R. (2025). Seq2Bind Webserver for Decoding Binding Hotspots directly from Sequences using Fine-Tuned Protein Language Models. https://arxiv.org/abs/2506.13830

Journal Information

Publication Year: 2025
Language: en
Source Database: arXiv
Access: Open Access ✓