Seq2Bind Webserver for Decoding Binding Hotspots directly from Sequences using Fine-Tuned Protein Language Models
Abstract
Decoding protein-protein interactions (PPIs) at the residue level is crucial for understanding cellular mechanisms and developing targeted therapeutics. We present Seq2Bind Webserver, a computational framework that uses fine-tuned protein language models (PLMs) to predict the binding affinity between two proteins and to identify critical binding residues directly from their sequences, removing the structural requirement that limits most affinity prediction tools. We fine-tuned four architectures (ProtBERT, ProtT5, ESM2, and BiLSTM) on the SKEMPI 2.0 dataset, which contains 5,387 protein pairs with experimentally measured binding affinities. For 14 therapeutically relevant protein complexes, we performed systematic alanine mutagenesis of every residue and evaluated each model's ability to identify interface residues. Performance was assessed using N-factor metrics: for a complex with n interface residues, N-factor=3 evaluates whether the true interface residues appear among the top 3n predictions. ESM2 achieved 49.5% accuracy at N-factor=3, and at N-factor=2 both ESM2 (37.2%) and ProtBERT (35.1%) outperformed the structure-based docking method HADDOCK3 (32.1%). Our sequence-based approach enables rapid screening (minutes versus hours for docking), handles disordered proteins, and provides comparable accuracy, making Seq2Bind a valuable prior for steering blind docking protocols toward putative binding residues on each protein of a therapeutic target. Seq2Bind Webserver is accessible at https://agrivax.onrender.com under the StructF suite.
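The alanine scan and the N-factor evaluation described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function names `alanine_scan` and `n_factor_recovery` are hypothetical, the per-residue scores are assumed to be the magnitude of the change in predicted affinity caused by each alanine substitution, and "accuracy" is assumed to mean the fraction of true interface residues recovered among the top N·n ranked positions.

```python
from typing import List, Sequence, Set, Tuple


def alanine_scan(sequence: str) -> List[Tuple[int, str]]:
    """Return (position, mutant) pairs with each residue replaced by alanine.

    Positions are 0-based. A residue that is already alanine yields a mutant
    identical to the wild type, so its score change would be zero.
    """
    return [(i, sequence[:i] + "A" + sequence[i + 1:]) for i in range(len(sequence))]


def n_factor_recovery(
    scores: Sequence[float], true_interface: Set[int], n_factor: int
) -> float:
    """Fraction of true interface residues found in the top n_factor * n positions.

    `scores[i]` is an assumed per-residue importance (higher = more likely a
    binding residue); n is the number of true interface residues.
    """
    n = len(true_interface)
    if n == 0:
        raise ValueError("true_interface must contain at least one residue")
    k = min(n_factor * n, len(scores))
    # Rank positions from highest to lowest score and keep the top k.
    ranked = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    top_k = set(ranked[:k])
    return len(top_k & true_interface) / n


if __name__ == "__main__":
    # Toy example: six residues, two of which (1 and 4) form the interface.
    toy_scores = [0.1, 0.9, 0.2, 0.8, 0.05, 0.7]
    toy_interface = {1, 4}
    for factor in (1, 2, 3):
        print(factor, n_factor_recovery(toy_scores, toy_interface, factor))
```

In practice the scores would come from running a fine-tuned model on the wild-type pair and on every mutant produced by `alanine_scan`; that step is omitted here because the model interface is not specified in the abstract.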
Authors (6)
Xiang Ma
Supantha Dey
Vaishnavey SR
Casey Zelinski
Qi Li
Ratul Chowdhury
Quick Access
- Year Published: 2025
- Language: en
- Source Database: arXiv
- Access: Open Access ✓