arXiv Open Access 2026

DARA: Few-shot Budget Allocation in Online Advertising via In-Context Decision Making with RL-Finetuned LLMs

Mingxuan Song Yusen Huo Bohan Zhou Shenglin Yin Zhen Xiao +3 lainnya

Lihat Sumber

Abstrak

Optimizing the advertiser's cumulative value of winning impressions under budget constraints poses a complex challenge in online advertising, under the paradigm of AI-Generated Bidding (AIGB). Advertisers often have personalized objectives but limited historical interaction data, resulting in few-shot scenarios where traditional reinforcement learning (RL) methods struggle to perform effectively. Large Language Models (LLMs) offer a promising alternative for AIGB by leveraging their in-context learning capabilities to generalize from limited data. However, they lack the numerical precision required for fine-grained optimization. To address this limitation, we introduce GRPO-Adaptive, an efficient LLM post-training strategy that enhances both reasoning and numerical precision by dynamically updating the reference policy during training. Built upon this foundation, we further propose DARA, a novel dual-phase framework that decomposes the decision-making process into two stages: a few-shot reasoner that generates initial plans via in-context prompting, and a fine-grained optimizer that refines these plans using feedback-driven reasoning. This separation allows DARA to combine LLMs' in-context learning strengths with precise adaptability required by AIGB tasks. Extensive experiments on both real-world and synthetic data environments demonstrate that our approach consistently outperforms existing baselines in terms of cumulative advertiser value under budget constraints.

Topik & Kata Kunci

cs.AI cs.LG

Penulis (8)

Mingxuan Song

Yusen Huo

Bohan Zhou

Shenglin Yin

Zhen Xiao

Jieyi Long

Zhilin Zhang

Chuan Yu

Format Sitasi

APA MLA BibTeX

Song, M., Huo, Y., Zhou, B., Yin, S., Xiao, Z., Long, J. et al. (2026). DARA: Few-shot Budget Allocation in Online Advertising via In-Context Decision Making with RL-Finetuned LLMs. https://arxiv.org/abs/2601.14711

Akses Cepat

Lihat di Sumber

Informasi Jurnal

Tahun Terbit: 2026
Bahasa: en
Sumber Database: arXiv
Akses: Open Access ✓