DOAJ Open Access 2025

Transformer-Based Classification of Transposable Element Consensus Sequences with TEclass2

Lucas Bickmann, Matias Rodriguez, Xiaoyi Jiang, Wojciech Makałowski

Abstract

Transposable elements (TEs) constitute a significant portion of eukaryotic genomes and play crucial roles in genome evolution, yet their diverse and complex sequences pose challenges for accurate classification. Existing tools often lack reliability in TE classification, limiting genomic analyses. Here, we present TEclass2, a software tool that uses a deep learning approach based on a linear transformer architecture with k-mer tokenization and sequence-specific adaptations to classify TE consensus sequences into sixteen superfamilies. TEclass2 demonstrates improved classification performance and offers flexible model training on custom datasets. Accessible via a web interface with pre-trained models, TEclass2 facilitates rapid and reliable TE classification. These advancements provide a foundation for enhanced genomic annotation and support further bioinformatics research involving transposable elements.
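
The k-mer tokenization mentioned in the abstract can be illustrated with a minimal sketch. This is not the TEclass2 implementation: the function names, the choice of k = 3, the stride of 1, and the convention of mapping k-mers containing ambiguous bases to id 0 are all assumptions made here for illustration only.

```python
from itertools import product


def build_vocab(k: int) -> dict[str, int]:
    """Map every k-mer over A/C/G/T to an integer id, starting at 1.

    Id 0 is reserved (an assumption of this sketch) for k-mers that
    contain any other character, such as the ambiguity code N.
    """
    kmers = ("".join(p) for p in product("ACGT", repeat=k))
    return {kmer: i for i, kmer in enumerate(kmers, start=1)}


def kmer_tokenize(seq: str, k: int = 3, stride: int = 1) -> list[str]:
    """Split a DNA sequence into overlapping k-mers (sliding window)."""
    seq = seq.upper()
    return [seq[i:i + k] for i in range(0, len(seq) - k + 1, stride)]


def encode(seq: str, vocab: dict[str, int], k: int, stride: int = 1) -> list[int]:
    """Turn a sequence into token ids; unknown k-mers become 0."""
    return [vocab.get(tok, 0) for tok in kmer_tokenize(seq, k, stride)]


vocab = build_vocab(3)
print(kmer_tokenize("ACGTN", k=3))   # ['ACG', 'CGT', 'GTN']
print(encode("ACGTN", vocab, k=3))   # [7, 28, 0]
```

A sequence model would then consume such integer ids in place of raw characters. How TEclass2 itself chooses k, the stride, and the handling of ambiguous bases is not stated in this record.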

Citation

Bickmann, L., Rodriguez, M., Jiang, X., Makałowski, W. (2025). Transformer-Based Classification of Transposable Element Consensus Sequences with TEclass2. https://doi.org/10.3390/biology15010059

Journal Information
Publication year: 2025
Source database: DOAJ
DOI: 10.3390/biology15010059
Access: Open Access