arXiv Open Access 2026

Market-Bench: Benchmarking Large Language Models on Economic and Trade Competition

Yushuo Zheng, Huiyu Duan, Zicheng Zhang, Yucheng Zhu, Xiongkuo Min, +1 more

Abstract

The ability of large language models (LLMs) to manage and acquire economic resources remains unclear. In this paper, we introduce Market-Bench, a comprehensive benchmark that evaluates the capabilities of LLMs in economically relevant tasks through economic and trade competition. Specifically, we construct a configurable multi-agent supply chain economic model in which LLMs act as retailer agents responsible for procuring and retailing merchandise. In the procurement stage, LLMs bid for limited inventory in budget-constrained auctions. In the retail stage, LLMs set retail prices, generate marketing slogans, and present them to buyers through a role-based attention mechanism for purchase decisions. Market-Bench logs complete trajectories of bids, prices, slogans, sales, and balance-sheet states, enabling automatic evaluation with economic, operational, and semantic metrics. Benchmarking 20 open- and closed-source LLM agents reveals significant performance disparities and a winner-take-most phenomenon, i.e., only a small subset of LLM retailers consistently achieves capital appreciation, while many hover around the break-even point despite similar semantic matching scores. Market-Bench provides a reproducible testbed for studying how LLMs interact in competitive markets.
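To make the procurement stage concrete, the following is a minimal sketch of one possible budget-constrained auction. The abstract does not specify the auction rules, so everything here is an assumption made for illustration: the pay-as-bid (first-price) allocation, the tie-breaking by name, and the hypothetical names `Bid` and `allocate` are not taken from Market-Bench.

```python
from dataclasses import dataclass


@dataclass
class Bid:
    agent: str         # hypothetical retailer name
    unit_price: float  # price offered per unit
    quantity: int      # units requested
    budget: float      # cash available to this retailer


def allocate(bids: list[Bid], inventory: int) -> dict[str, tuple[int, float]]:
    """Allocate limited inventory to the highest bidders (assumed pay-as-bid).

    Returns {agent: (units_won, amount_paid)}.
    """
    results = {}
    remaining = inventory
    # Highest unit price is served first; ties are broken by agent name.
    for bid in sorted(bids, key=lambda b: (-b.unit_price, b.agent)):
        if bid.unit_price <= 0:
            results[bid.agent] = (0, 0.0)
            continue
        # A retailer can never buy more units than its budget covers.
        affordable = int(bid.budget // bid.unit_price)
        units = max(0, min(bid.quantity, affordable, remaining))
        results[bid.agent] = (units, units * bid.unit_price)
        remaining -= units
    return results


bids = [
    Bid("A", unit_price=12.0, quantity=5, budget=40.0),
    Bid("B", unit_price=10.0, quantity=6, budget=100.0),
    Bid("C", unit_price=8.0, quantity=4, budget=100.0),
]
outcome = allocate(bids, inventory=10)
```

In this toy run, retailer A bids highest but its budget caps it at 3 units, B receives its full request of 6, and C gets only the single unit left over, which illustrates how budget limits and scarce inventory interact.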

Topics & Keywords

cs.AI

Authors (6)

Yushuo Zheng

Huiyu Duan

Zicheng Zhang

Yucheng Zhu

Xiongkuo Min

Guangtao Zhai

Citation (APA)

Zheng, Y., Duan, H., Zhang, Z., Zhu, Y., Min, X., & Zhai, G. (2026). Market-Bench: Benchmarking Large Language Models on Economic and Trade Competition. https://arxiv.org/abs/2604.05523

Journal Information

Publication Year: 2026
Language: en
Source Database: arXiv
Access: Open Access ✓