arXiv Open Access 2026

Stepping VLMs onto the Court: Benchmarking Spatial Intelligence in Sports

Yuchen Yang, Yuqing Shao, Duxiu Huang, Linfeng Dong, Yifei Liu, +9 others

Abstract

Sports have long attracted broad attention as they push the limits of human physical and cognitive capabilities. Amid growing interest in spatial intelligence for vision-language models (VLMs), sports provide a natural testbed for understanding high-intensity human motion and dynamic object interactions. To this end, we present CourtSI, the first large-scale spatial intelligence dataset tailored to sports scenarios. CourtSI contains over 1M QA pairs, organized under a holistic taxonomy that systematically covers spatial counting, distance measurement, localization, and relational reasoning, across representative net sports including badminton, tennis, and table tennis. Leveraging well-defined court geometry as metric anchors, we develop a semi-automatic data engine to reconstruct sports scenes, enabling scalable curation of CourtSI. In addition, we introduce CourtSI-Bench, a high-quality evaluation benchmark comprising 3,686 QA pairs with rigorous human verification. We evaluate 25 proprietary and open-source VLMs on CourtSI-Bench, revealing a remaining human-AI performance gap and limited generalization from existing spatial intelligence benchmarks. These findings indicate that sports scenarios expose limitations in spatial intelligence capabilities captured by existing benchmarks. Further, fine-tuning Qwen3-VL-8B on CourtSI improves accuracy on CourtSI-Bench by 23.5 percentage points. The adapted model also generalizes effectively to CourtSI-Ext, an evaluation set built on a similar but unseen sport, and demonstrates enhanced spatial-aware commentary generation. Together, these findings demonstrate that CourtSI provides a scalable pathway toward advancing spatial intelligence of VLMs in sports.

Authors (14)

Yuchen Yang
Yuqing Shao
Duxiu Huang
Linfeng Dong
Yifei Liu
Suixin Tang
Xiang Zhou
Yuanyuan Gao
Wei Wang
Yue Zhou
Xue Yang
Yanfeng Wang
Xiao Sun
Zhihang Zhong

Citation Format

Yang, Y., Shao, Y., Huang, D., Dong, L., Liu, Y., Tang, S. et al. (2026). Stepping VLMs onto the Court: Benchmarking Spatial Intelligence in Sports. https://arxiv.org/abs/2603.09896

Journal Information
Publication Year: 2026
Language: en
Source Database: arXiv
Access: Open Access