DOAJ Open Access 2025

ChemLit-QA: a human evaluated dataset for chemistry RAG tasks

Geemi P Wellawatte, Huixuan Guo, Magdalena Lederbauer, Anna Borisova, Matthew Hart, +2 others

Abstract

Retrieval-Augmented Generation (RAG) is a widely used strategy for letting large language models (LLMs) extrapolate beyond their inherent pre-trained knowledge. Hence, RAG is crucial when working in data-sparse fields such as chemistry. RAG systems are commonly evaluated using specialized datasets. However, existing datasets, typically in the form of scientific Question-Answer-Context (QAC) triplets or QA pairs, are often limited in size due to the labor-intensive nature of manual curation, or require further quality assessment when generated through automated processes. This highlights a critical need for large, high-quality datasets tailored to scientific applications. We introduce ChemLit-QA, a comprehensive, expert-validated, open-source dataset comprising over 1,000 entries specifically designed for chemistry. Our approach involves the initial generation and filtering of a QAC dataset using an automated framework based on GPT-4 Turbo, followed by rigorous evaluation by chemistry experts. Additionally, we provide two supplementary datasets: ChemLit-QA-neg, focused on negative data, and ChemLit-QA-multi, focused on multi-hop reasoning tasks for LLMs. These complement the main dataset for hallucination detection and more reasoning-intensive tasks.
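
The two-stage curation described in the abstract (automated filtering, then expert review) can be pictured with a minimal sketch. Everything named here is an assumption made for illustration: the `QACTriplet` class, the `expert_approved` flag, the `keep_triplet` length check, and the 50-character threshold are hypothetical and are not taken from the ChemLit-QA dataset schema or the authors' GPT-4 Turbo framework.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class QACTriplet:
    """One Question-Answer-Context entry (hypothetical field names)."""
    question: str
    answer: str
    context: str
    expert_approved: bool = False  # assumed flag standing in for human review


def keep_triplet(t: QACTriplet, min_context_chars: int = 50) -> bool:
    """Illustrative automated filter: reject empty fields and very short contexts.

    This is a stand-in heuristic, not the filtering used by the authors.
    """
    return (
        bool(t.question.strip())
        and bool(t.answer.strip())
        and len(t.context.strip()) >= min_context_chars
    )


def curate(triplets, min_context_chars: int = 50):
    """Apply the automated filter first, then keep only expert-approved entries."""
    auto_passed = [t for t in triplets if keep_triplet(t, min_context_chars)]
    return [t for t in auto_passed if t.expert_approved]
```

In this sketch the automated check runs before the expert flag is consulted, mirroring the order given in the abstract: cheap filtering first, costly human evaluation only on what survives.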

Authors (7)

Geemi P Wellawatte

Huixuan Guo

Magdalena Lederbauer

Anna Borisova

Matthew Hart

Marta Brucka

Philippe Schwaller

Citation Format

Wellawatte, G.P., Guo, H., Lederbauer, M., Borisova, A., Hart, M., Brucka, M. et al. (2025). ChemLit-QA: a human evaluated dataset for chemistry RAG tasks. https://doi.org/10.1088/2632-2153/adc2d6

Journal Information
Publication Year: 2025
Source Database: DOAJ
DOI: 10.1088/2632-2153/adc2d6
Access: Open Access