arXiv Open Access 2025

DOCUEVAL: An LLM-based AI Engineering Tool for Building Customisable Document Evaluation Workflows

Hao Zhang Qinghua Lu Liming Zhu

Abstract

Foundation models, such as large language models (LLMs), have the potential to streamline evaluation workflows and improve their performance. However, practical adoption faces challenges in customisability, accuracy, and scalability. In this paper, we present DOCUEVAL, an AI engineering tool for building customisable DOCUment EVALuation workflows. DOCUEVAL supports advanced document processing and customisable workflow design, which allow users to define theory-grounded reviewer roles, specify evaluation criteria, experiment with different reasoning strategies, and choose the assessment style. To ensure traceability, DOCUEVAL provides comprehensive logging of every run, along with source attribution and configuration management, allowing systematic comparison of results across alternative setups. By integrating these capabilities, DOCUEVAL directly addresses core software engineering challenges, including how to determine whether evaluators are "good enough" for deployment and how to empirically compare different evaluation strategies. We demonstrate the usefulness of DOCUEVAL through a real-world academic peer review case, showing how DOCUEVAL enables both the engineering of evaluators and scalable, reliable document evaluation.
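
The abstract names four configurable parts of a workflow (reviewer role, evaluation criteria, reasoning strategy, assessment style) and says every run is logged so that alternative setups can be compared. The Python sketch below illustrates that general idea only. It is not DOCUEVAL's API: `EvalConfig`, `log_run`, `mean_score_by_config`, and every field name are hypothetical, and the scores are assumed to come from some evaluator the caller supplies.

```python
import hashlib
import json
from dataclasses import asdict, dataclass


# Hypothetical illustration: none of these names come from DOCUEVAL itself.
@dataclass(frozen=True)
class EvalConfig:
    reviewer_role: str        # e.g. a persona the evaluator should adopt
    criteria: tuple           # names of the evaluation criteria
    reasoning_strategy: str   # e.g. "direct" or "chain-of-thought"
    assessment_style: str     # e.g. "numeric" or "narrative"

    def fingerprint(self) -> str:
        """Return a short, stable hash that identifies this exact setup."""
        payload = json.dumps(asdict(self), sort_keys=True)
        return hashlib.sha256(payload.encode("utf-8")).hexdigest()[:12]


def log_run(log: list, config: EvalConfig, document_id: str, scores: dict) -> dict:
    """Append one run record that keeps the full configuration with the scores."""
    record = {
        "config_id": config.fingerprint(),
        "config": asdict(config),
        "document_id": document_id,
        "scores": dict(scores),
    }
    log.append(record)
    return record


def mean_score_by_config(log: list, criterion: str) -> dict:
    """Average one criterion per configuration so setups can be compared."""
    grouped = {}
    for record in log:
        if criterion in record["scores"]:
            grouped.setdefault(record["config_id"], []).append(
                record["scores"][criterion]
            )
    return {cid: sum(vals) / len(vals) for cid, vals in grouped.items()}
```

Storing the whole configuration next to each result, keyed by a hash of its contents, is one simple way to make runs traceable: two runs share a `config_id` only if every setting matches, so a difference in averages can be attributed to a specific change in setup.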

Topics & Keywords

cs.IR cs.AI

Citation Format

Zhang, H., Lu, Q., & Zhu, L. (2025). DOCUEVAL: An LLM-based AI Engineering Tool for Building Customisable Document Evaluation Workflows. https://arxiv.org/abs/2511.05496

Journal Information

Publication Year: 2025
Language: en
Source Database: arXiv
Access: Open Access