arXiv Open Access 2026

ASA: Training-Free Representation Engineering for Tool-Calling Agents

Youjin Wang, Run Zhou, Rong Fu, Shuaishuai Cao, Hongwei Zeng, +4 others

Abstract

Adapting LLM agents to domain-specific tool calling remains notably brittle under evolving interfaces. Prompt and schema engineering is easy to deploy but often fragile under distribution shift and strict parsers, while continual parameter-efficient fine-tuning improves reliability at the cost of training, maintenance, and potential forgetting. We identify a critical Lazy Agent failure mode where tool necessity is nearly perfectly decodable from mid-layer activations, yet the model remains conservative in entering tool mode, revealing a representation-behavior gap. We propose Activation Steering Adapter (ASA), a training-free, inference-time controller that performs a single-shot mid-layer intervention and targets tool domains via a router-conditioned mixture of steering vectors with a probe-guided signed gate to amplify true intent while suppressing spurious triggers. On MTU-Bench with Qwen2.5-1.5B, ASA improves strict tool-use F1 from 0.18 to 0.50 while reducing the false positive rate from 0.15 to 0.05, using only about 20KB of portable assets and no weight updates.
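The abstract names three components: a router over tool domains, a mixture of steering vectors, and a probe-guided signed gate, applied once to a mid-layer hidden state. The following is a minimal sketch of how such a step could be wired together, not the authors' implementation. Every name here (`steer`, `router_w`, `steering_vecs`, `probe_w`, `probe_b`, `alpha`, `threshold`) is hypothetical, and the softmax router, logistic probe, and hard ±1 gate are assumptions chosen for illustration.

```python
import math


def dot(a, b):
    """Inner product of two equal-length lists of floats."""
    return sum(x * y for x, y in zip(a, b))


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def steer(hidden, router_w, steering_vecs, probe_w, probe_b,
          alpha=1.0, threshold=0.5):
    """Apply one hypothetical steering step to a single hidden state.

    hidden        : mid-layer activation (list of floats)
    router_w      : one weight vector per tool domain (assumed linear router)
    steering_vecs : one steering vector per tool domain
    probe_w/b     : assumed logistic probe for "is a tool needed?"
    Returns (steered_hidden, gate).
    """
    # Router: soft weights over tool domains, conditioned on the activation.
    weights = softmax([dot(w, hidden) for w in router_w])

    # Mixture: weighted sum of the per-domain steering vectors.
    mixed = [
        sum(wk * vec[i] for wk, vec in zip(weights, steering_vecs))
        for i in range(len(hidden))
    ]

    # Probe: estimated probability that a tool call is needed.
    p_tool = 1.0 / (1.0 + math.exp(-(dot(probe_w, hidden) + probe_b)))

    # Signed gate (assumed form): push toward tool mode when the probe
    # fires, push away from it otherwise to suppress spurious triggers.
    gate = 1.0 if p_tool >= threshold else -1.0

    steered = [h + alpha * gate * m for h, m in zip(hidden, mixed)]
    return steered, gate


if __name__ == "__main__":
    # Toy 2-dimensional example with two hypothetical tool domains.
    router_w = [[1.0, 0.0], [0.0, 1.0]]
    steering_vecs = [[1.0, 0.0], [0.0, 1.0]]
    steered, gate = steer([1.0, 0.0], router_w, steering_vecs,
                          probe_w=[2.0, 0.0], probe_b=-1.0)
    print(gate, steered)
```

In a real model the same arithmetic would run on tensors inside a forward hook at the chosen layer. The sign of the gate is what lets one set of vectors both amplify genuine tool intent and damp false triggers.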


Authors (9)

Youjin Wang
Run Zhou
Rong Fu
Shuaishuai Cao
Hongwei Zeng
Jiaxuan Lu
Sicheng Fan
Jiaqiao Zhao
Liangming Pan

Citation Format

Wang, Y., Zhou, R., Fu, R., Cao, S., Zeng, H., Lu, J. et al. (2026). ASA: Training-Free Representation Engineering for Tool-Calling Agents. https://arxiv.org/abs/2602.04935

Journal Information

Publication Year: 2026
Language: en
Database Source: arXiv
Access: Open Access