arXiv Open Access 2026

Market-Bench: Benchmarking Large Language Models on Economic and Trade Competition

Yushuo Zheng, Huiyu Duan, Zicheng Zhang, Yucheng Zhu, Xiongkuo Min, Guangtao Zhai
Abstract

The ability of large language models (LLMs) to manage and acquire economic resources remains unclear. In this paper, we introduce Market-Bench, a comprehensive benchmark that evaluates the capabilities of LLMs in economically relevant tasks through economic and trade competition. Specifically, we construct a configurable multi-agent supply chain economic model in which LLMs act as retailer agents responsible for procuring and retailing merchandise. In the procurement stage, LLMs bid for limited inventory in budget-constrained auctions. In the retail stage, LLMs set retail prices, generate marketing slogans, and present them to buyers through a role-based attention mechanism for purchase. Market-Bench logs complete trajectories of bids, prices, slogans, sales, and balance-sheet states, enabling automatic evaluation with economic, operational, and semantic metrics. Benchmarking 20 open- and closed-source LLM agents reveals significant performance disparities and a winner-take-most phenomenon, i.e., only a small subset of LLM retailers consistently achieve capital appreciation, while many hover around the break-even point despite similar semantic matching scores. Market-Bench provides a reproducible testbed for studying how LLMs interact in competitive markets.
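
The procurement stage described in the abstract can be illustrated with a minimal sketch. The abstract does not specify the auction rules, so everything here is an assumption made for illustration: the `Bid` type, the `run_auction` function, the pay-as-bid pricing, and the highest-price-first allocation order are hypothetical and are not taken from Market-Bench itself.

```python
from dataclasses import dataclass


@dataclass
class Bid:
    retailer: str
    unit_price: float  # price offered per unit
    quantity: int      # units requested


def run_auction(bids, budgets, inventory):
    """Allocate limited inventory to the highest unit-price bids first.

    Hypothetical rule (not from the paper): pay-as-bid, with each award
    capped by the remaining stock and by what the bidder's budget covers.
    The budgets dict is updated in place.
    """
    allocation = {}
    remaining = inventory
    for bid in sorted(bids, key=lambda b: b.unit_price, reverse=True):
        if remaining == 0:
            break
        if bid.unit_price <= 0:
            continue  # ignore malformed bids
        affordable = int(budgets[bid.retailer] // bid.unit_price)
        units = min(bid.quantity, affordable, remaining)
        if units > 0:
            allocation[bid.retailer] = units
            budgets[bid.retailer] -= units * bid.unit_price
            remaining -= units
    return allocation


# Three hypothetical retailers compete for 12 units of stock.
bids = [Bid("A", 5.0, 10), Bid("B", 4.0, 10), Bid("C", 6.0, 3)]
budgets = {"A": 30.0, "B": 100.0, "C": 100.0}
allocation = run_auction(bids, budgets, inventory=12)
```

In this toy round, retailer A asks for 10 units but its budget only covers 6, and retailer B receives only the stock left over after the higher bids are served, which shows how budget and inventory limits jointly constrain each award.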

Authors (6)

Yushuo Zheng

Huiyu Duan

Zicheng Zhang

Yucheng Zhu

Xiongkuo Min

Guangtao Zhai

Citation Format

Zheng, Y., Duan, H., Zhang, Z., Zhu, Y., Min, X., Zhai, G. (2026). Market-Bench: Benchmarking Large Language Models on Economic and Trade Competition. https://arxiv.org/abs/2604.05523

Journal Information
Publication Year: 2026
Language: en
Source Database: arXiv
Access: Open Access