arXiv Open Access 2026

Chinese Labor Law Large Language Model Benchmark

Zixun Lan Maochun Xu Yifan Ren Rui Wu Jianghui Zhou +5 others

Abstract

Recent advances in large language models (LLMs) have led to substantial progress in domain-specific applications, particularly within the legal domain. However, general-purpose models such as GPT-4 often struggle with specialized subdomains that require precise legal knowledge, complex reasoning, and contextual sensitivity. To address these limitations, we present LabourLawLLM, a legal large language model tailored to Chinese labor law. We also introduce LabourLawBench, a comprehensive benchmark covering diverse labor-law tasks, including legal provision citation, knowledge-based question answering, case classification, compensation computation, named entity recognition, and legal case analysis. Our evaluation framework combines objective metrics (e.g., ROUGE-L, accuracy, F1, and soft-F1) with subjective assessment based on GPT-4 scoring. Experiments show that LabourLawLLM consistently outperforms general-purpose and existing legal-specific LLMs across task categories. Beyond labor law, our methodology provides a scalable approach for building specialized LLMs in other legal subfields, improving accuracy, reliability, and societal value of legal AI applications.
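Among the objective metrics the abstract lists, ROUGE-L scores a generated answer against a reference by the length of their longest common subsequence (LCS), combined into an F-measure from LCS-based precision and recall. The paper's exact implementation is not given here, so the following is a minimal illustrative sketch of the standard ROUGE-L formulation (whitespace tokenization and the conventional `beta = 1.2` recall weighting are assumptions, not details from the paper):

```python
def lcs_len(a, b):
    """Length of the longest common subsequence of token lists a and b (dynamic programming)."""
    dp = [[0] * (len(b) + 1) for _ in range(len(a) + 1)]
    for i, x in enumerate(a, 1):
        for j, y in enumerate(b, 1):
            dp[i][j] = dp[i - 1][j - 1] + 1 if x == y else max(dp[i - 1][j], dp[i][j - 1])
    return dp[len(a)][len(b)]

def rouge_l(reference, candidate, beta=1.2):
    """ROUGE-L F-measure: LCS-based recall and precision combined with weight beta."""
    ref, cand = reference.split(), candidate.split()
    lcs = lcs_len(ref, cand)
    if lcs == 0:
        return 0.0
    recall = lcs / len(ref)        # fraction of reference tokens covered by the LCS
    precision = lcs / len(cand)    # fraction of candidate tokens covered by the LCS
    return ((1 + beta**2) * precision * recall) / (recall + beta**2 * precision)
```

An identical candidate scores 1.0, while a partially overlapping answer scores strictly between 0 and 1; how the paper tokenizes Chinese legal text (character- vs. word-level) would materially affect the scores but is not specified in the abstract.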


Authors (10)

Zixun Lan
Maochun Xu
Yifan Ren
Rui Wu
Jianghui Zhou
Xueyang Cheng
Jianan Ding
Xinheng Wang
Mingmin Chi
Fei Ma

Citation Format

Lan, Z., Xu, M., Ren, Y., Wu, R., Zhou, J., Cheng, X. et al. (2026). Chinese Labor Law Large Language Model Benchmark. https://arxiv.org/abs/2601.09972

Journal Information
Publication Year
2026
Language
en
Source Database
arXiv
Access
Open Access ✓