Semantic Scholar · Open Access · 2024 · 4 citations

Dyn-Adapter: Towards Disentangled Representation for Efficient Visual Recognition

Yurong Zhang Honghao Chen Xinyu Zhang Xiangxiang Chu Li Song

Abstract

Parameter-efficient transfer learning (PETL) is a promising task, aiming to adapt a large-scale pre-trained model to downstream tasks at relatively modest cost. However, current PETL methods struggle to reduce computational complexity and bear a heavy inference burden due to the complete forward process. This paper presents an efficient visual recognition paradigm, called Dynamic Adapter (Dyn-Adapter), that boosts PETL efficiency by subtly disentangling features at multiple levels. Our approach is simple: first, we devise a dynamic architecture with balanced early heads for multi-level feature extraction, along with an adaptive training strategy. Second, we introduce a bidirectional sparsity strategy driven by the pursuit of powerful generalization ability. These qualities enable us to fine-tune efficiently and effectively: we reduce FLOPs during inference by 50%, while maintaining or even yielding higher recognition accuracy. Extensive experiments on diverse datasets and pretrained backbones demonstrate the potential of Dyn-Adapter serving as a general efficiency booster for PETL in vision recognition tasks.
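The abstract's core efficiency mechanism, early heads attached at multiple depths so that easy inputs can exit before the full forward pass, can be sketched in plain Python. Everything below is illustrative: the block/head functions, the confidence measure, and the threshold are stand-ins, not the paper's actual architecture.

```python
# Minimal sketch of the early-exit idea behind Dyn-Adapter.
# Blocks stand in for frozen backbone stages plus their adapters;
# heads stand in for the lightweight per-level classifiers.

def make_block(scale):
    # Stand-in for one backbone block + adapter: a simple transform.
    return lambda x: [v * scale for v in x]

def head(features):
    # Stand-in for an early classification head: returns
    # (predicted class, confidence score in [0, 1]).
    conf = max(features) / (sum(abs(v) for v in features) + 1e-9)
    return features.index(max(features)), conf

def dynamic_forward(x, blocks, threshold=0.5):
    """Run blocks in order; exit at the first head whose confidence
    clears the threshold, skipping the remaining deeper blocks
    (this skipped computation is where the FLOPs savings come from)."""
    for depth, block in enumerate(blocks, start=1):
        x = block(x)
        pred, conf = head(x)
        if conf >= threshold:
            return pred, depth  # early exit
    return pred, depth  # fell through: full-depth prediction

blocks = [make_block(1.5) for _ in range(4)]
pred, depth = dynamic_forward([0.1, 0.9, 0.2], blocks, threshold=0.7)
```

In this toy run the first head is already confident, so inference stops at depth 1 of 4; raising `threshold` forces the input through more blocks, trading compute for (potentially) better accuracy.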


Authors (5)

Yurong Zhang
Honghao Chen
Xinyu Zhang
Xiangxiang Chu
Li Song

Citation Format

Zhang, Y., Chen, H., Zhang, X., Chu, X., & Song, L. (2024). Dyn-Adapter: Towards Disentangled Representation for Efficient Visual Recognition. https://doi.org/10.48550/arXiv.2407.14302

Publication Information
Year: 2024
Language: en
Total Citations: 4
Source Database: Semantic Scholar
DOI: 10.48550/arXiv.2407.14302
Access: Open Access ✓