EvalQAG: A Framework for Automatic Complex QA Generation and a Benchmark QA Dataset for Policy Documents
Abstract
Accelerating research in renewable energy policy is critical for addressing climate change and enabling informed decision-making. Question answering (QA) over public policy documents presents unique challenges due to their legal structure, conditional dependencies, and domain-specific vocabulary. In this paper, we introduce EvalQAG, a framework for generating high-quality QA pairs from renewable energy policy documents. EvalQAG combines structured prompts, retrieval-augmented inputs, and multi-stage evaluation using large language models (LLMs) to support accurate and diverse QA generation. Using this framework, we construct REPolicyQA, a domain-specific QA dataset comprising approximately 160,000 QA pairs from over 1,000 U.S. renewable energy policy documents. The dataset covers five policy-relevant question types: Yes/No, Yes/No with Conditions, Factual, Legal Obligation, and Descriptive, which capture a wide range of reasoning patterns grounded in regulatory texts. We evaluate multiple QA models and uncover significant performance gaps, particularly in legal reasoning and conditional inference, highlighting major shortcomings in current systems. Our results establish EvalQAG as a generalizable QA generation pipeline for policy texts and position REPolicyQA as a new benchmark for advancing QA research in policy and regulatory domains. We believe this work can foster impactful research in the renewable energy sector, particularly by enabling more robust and explainable QA systems for legal and condition-heavy regulatory documents.
Authors (5)
Kirtan Brijeshbhai Soni
Krish Rupapara
Arpit Rana
Ghanshyam Verma
Paul Buitelaar
Quick Access
- Publication Year: 2026
- Language: en
- Source Database: CrossRef
- DOI: 10.1609/aaai.v40i46.41277
- Access: Restricted