arXiv Open Access 2026

Chinese Labor Law Large Language Model Benchmark

Zixun Lan Maochun Xu Yifan Ren Rui Wu Jianghui Zhou +5 others

Abstract

Recent advances in large language models (LLMs) have led to substantial progress in domain-specific applications, particularly within the legal domain. However, general-purpose models such as GPT-4 often struggle with specialized subdomains that require precise legal knowledge, complex reasoning, and contextual sensitivity. To address these limitations, we present LabourLawLLM, a legal large language model tailored to Chinese labor law. We also introduce LabourLawBench, a comprehensive benchmark covering diverse labor-law tasks, including legal provision citation, knowledge-based question answering, case classification, compensation computation, named entity recognition, and legal case analysis. Our evaluation framework combines objective metrics (e.g., ROUGE-L, accuracy, F1, and soft-F1) with subjective assessment based on GPT-4 scoring. Experiments show that LabourLawLLM consistently outperforms general-purpose and existing legal-specific LLMs across task categories. Beyond labor law, our methodology provides a scalable approach for building specialized LLMs in other legal subfields, improving accuracy, reliability, and societal value of legal AI applications.
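Among the objective metrics the abstract lists, ROUGE-L scores a generated answer against a reference by the length of their longest common subsequence (LCS), combined into an F-measure from LCS-based precision and recall. The paper's exact implementation is not given here, so the following is a minimal illustrative sketch of the standard ROUGE-L formulation (whitespace tokenization and the conventional `beta = 1.2` recall weighting are assumptions, not details from the paper):

```python
def lcs_len(a, b):
    """Length of the longest common subsequence of token lists a and b (dynamic programming)."""
    dp = [[0] * (len(b) + 1) for _ in range(len(a) + 1)]
    for i, x in enumerate(a, 1):
        for j, y in enumerate(b, 1):
            dp[i][j] = dp[i - 1][j - 1] + 1 if x == y else max(dp[i - 1][j], dp[i][j - 1])
    return dp[len(a)][len(b)]

def rouge_l(reference, candidate, beta=1.2):
    """ROUGE-L F-measure: LCS-based recall and precision combined with weight beta."""
    ref, cand = reference.split(), candidate.split()
    lcs = lcs_len(ref, cand)
    if lcs == 0:
        return 0.0
    recall = lcs / len(ref)        # fraction of reference tokens covered by the LCS
    precision = lcs / len(cand)    # fraction of candidate tokens covered by the LCS
    return ((1 + beta**2) * precision * recall) / (recall + beta**2 * precision)
```

An identical candidate scores 1.0, while a partially overlapping answer scores strictly between 0 and 1; how the paper tokenizes Chinese legal text (character- vs. word-level) would materially affect the scores but is not specified in the abstract.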


Authors (10)

Zixun Lan
Maochun Xu
Yifan Ren
Rui Wu
Jianghui Zhou
Xueyang Cheng
Jianan Ding
Xinheng Wang
Mingmin Chi
Fei Ma

Citation Format

Lan, Z., Xu, M., Ren, Y., Wu, R., Zhou, J., Cheng, X. et al. (2026). Chinese Labor Law Large Language Model Benchmark. https://arxiv.org/abs/2601.09972

Journal Information
Publication Year
2026
Language
en
Source Database
arXiv
Access
Open Access ✓