Semantic Scholar · Open Access · 2024 · 4 citations

Dyn-Adapter: Towards Disentangled Representation for Efficient Visual Recognition

Yurong Zhang Honghao Chen Xinyu Zhang Xiangxiang Chu Li Song

Abstract

Parameter-efficient transfer learning (PETL) is a promising task, aiming to adapt a large-scale pre-trained model to downstream tasks at relatively modest cost. However, current PETL methods struggle to reduce computational complexity and bear a heavy inference burden due to the complete forward process. This paper presents an efficient visual recognition paradigm, called Dynamic Adapter (Dyn-Adapter), that boosts PETL efficiency by subtly disentangling features at multiple levels. Our approach is simple: first, we devise a dynamic architecture with balanced early heads for multi-level feature extraction, along with an adaptive training strategy. Second, we introduce a bidirectional sparsity strategy driven by the pursuit of powerful generalization ability. These qualities enable us to fine-tune efficiently and effectively: we reduce FLOPs during inference by 50%, while maintaining or even yielding higher recognition accuracy. Extensive experiments on diverse datasets and pretrained backbones demonstrate the potential of Dyn-Adapter serving as a general efficiency booster for PETL in vision recognition tasks.
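The abstract's core efficiency mechanism, early heads attached at multiple depths so that easy inputs can exit before the full forward pass, can be sketched in plain Python. Everything below is illustrative: the block/head functions, the confidence measure, and the threshold are stand-ins, not the paper's actual architecture.

```python
# Minimal sketch of the early-exit idea behind Dyn-Adapter.
# Blocks stand in for frozen backbone stages plus their adapters;
# heads stand in for the lightweight per-level classifiers.

def make_block(scale):
    # Stand-in for one backbone block + adapter: a simple transform.
    return lambda x: [v * scale for v in x]

def head(features):
    # Stand-in for an early classification head: returns
    # (predicted class, confidence score in [0, 1]).
    conf = max(features) / (sum(abs(v) for v in features) + 1e-9)
    return features.index(max(features)), conf

def dynamic_forward(x, blocks, threshold=0.5):
    """Run blocks in order; exit at the first head whose confidence
    clears the threshold, skipping the remaining deeper blocks
    (this skipped computation is where the FLOPs savings come from)."""
    for depth, block in enumerate(blocks, start=1):
        x = block(x)
        pred, conf = head(x)
        if conf >= threshold:
            return pred, depth  # early exit
    return pred, depth  # fell through: full-depth prediction

blocks = [make_block(1.5) for _ in range(4)]
pred, depth = dynamic_forward([0.1, 0.9, 0.2], blocks, threshold=0.7)
```

In this toy run the first head is already confident, so inference stops at depth 1 of 4; raising `threshold` forces the input through more blocks, trading compute for (potentially) better accuracy.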


Authors (5)

Yurong Zhang
Honghao Chen
Xinyu Zhang
Xiangxiang Chu
Li Song

Citation Format

Zhang, Y., Chen, H., Zhang, X., Chu, X., & Song, L. (2024). Dyn-Adapter: Towards Disentangled Representation for Efficient Visual Recognition. https://doi.org/10.48550/arXiv.2407.14302

Publication Information
Year: 2024
Language: en
Total Citations: 4
Source Database: Semantic Scholar
DOI: 10.48550/arXiv.2407.14302
Access: Open Access ✓