arXiv Open Access 2023

T cell receptor binding prediction: A machine learning revolution

Anna Weber, Aurélien Pélissier, María Rodríguez Martínez

Abstract

Recent advancements in immune sequencing and experimental techniques are generating extensive T cell receptor (TCR) repertoire data, enabling the development of models to predict TCR binding specificity. Despite the computational challenges due to the vast diversity of TCRs and epitopes, significant progress has been made. This paper discusses the evolution of the computational models developed for this task, with a focus on machine learning efforts, including the early unsupervised clustering approaches, supervised models, and the more recent applications of Protein Language Models (PLMs). We critically assess the most prominent models in each category, and discuss recurrent challenges, such as the lack of generalization to new epitopes, dataset biases, and biases in the validation design of the models. Furthermore, our paper discusses the transformative role of transformer-based protein models in bioinformatics. These models, pretrained on extensive collections of unlabeled protein sequences, can convert amino acid sequences into vectorized embeddings that capture important biological properties. We discuss recent attempts to leverage PLMs to deliver very competitive performances in TCR-related tasks. Finally, we address the pressing need for improved interpretability in these often opaque models, proposing strategies to amplify their impact in the field.

Authors (3)

Anna Weber
Aurélien Pélissier
María Rodríguez Martínez

Citation Format

Weber, A., Pélissier, A., Rodríguez Martínez, M. (2023). T cell receptor binding prediction: A machine learning revolution. https://arxiv.org/abs/2312.16594

Journal Information
Year Published: 2023
Language: en
Source Database: arXiv
Access: Open Access