arXiv Open Access 2026

Ebisu: Benchmarking Large Language Models in Japanese Finance

Xueqing Peng, Ruoyu Xiang, Fan Zhang, Mingzi Song, Mingyang Jiang, +7 others

Abstract

Japanese finance combines agglutinative, head-final linguistic structure, mixed writing systems, and high-context communication norms that rely on indirect expression and implicit commitment, posing a substantial challenge for LLMs. We introduce Ebisu, a benchmark for native Japanese financial language understanding, comprising two linguistically and culturally grounded, expert-annotated tasks: JF-ICR, which evaluates implicit commitment and refusal recognition in investor-facing Q&A, and JF-TE, which assesses hierarchical extraction and ranking of nested financial terminology from professional disclosures. We evaluate a diverse set of open-source and proprietary LLMs spanning general-purpose, Japanese-adapted, and financial models. Results show that even state-of-the-art systems struggle on both tasks. While increased model scale yields limited improvements, language- and domain-specific adaptation does not reliably improve performance, leaving substantial gaps unresolved. Ebisu provides a focused benchmark for advancing linguistically and culturally grounded financial NLP. All datasets and evaluation scripts are publicly released.

Authors (12)

Xueqing Peng
Ruoyu Xiang
Fan Zhang
Mingzi Song
Mingyang Jiang
Yan Wang
Lingfei Qian
Taiki Hara
Yuqing Guo
Jimin Huang
Junichi Tsujii
Sophia Ananiadou

Citation Format

Peng, X., Xiang, R., Zhang, F., Song, M., Jiang, M., Wang, Y. et al. (2026). Ebisu: Benchmarking Large Language Models in Japanese Finance. https://arxiv.org/abs/2602.01479

Journal Information
Publication Year: 2026
Language: en
Database Source: arXiv
Access: Open Access ✓