DOAJ Open Access 2025

ChemLit-QA: a human evaluated dataset for chemistry RAG tasks

Geemi P Wellawatte, Huixuan Guo, Magdalena Lederbauer, Anna Borisova, Matthew Hart, +2 others

Abstract

Retrieval-Augmented Generation (RAG) is a widely used strategy in Large-Language Models (LLMs) to extrapolate beyond the inherent pre-trained knowledge. Hence, RAG is crucial when working in data-sparse fields such as Chemistry. The evaluation of RAG systems is commonly conducted using specialized datasets. However, existing datasets, typically in the form of scientific Question-Answer-Context (QAC) triplets or QA pairs, are often limited in size due to the labor-intensive nature of manual curation or require further quality assessment when generated through automated processes. This highlights a critical need for large, high-quality datasets tailored to scientific applications. We introduce ChemLit-QA, a comprehensive, expert-validated, open-source dataset comprising over 1,000 entries specifically designed for chemistry. Our approach involves the initial generation and filtering of a QAC dataset using an automated framework based on GPT-4 Turbo, followed by rigorous evaluation by chemistry experts. Additionally, we provide two supplementary datasets: ChemLit-QA-neg focused on negative data, and ChemLit-QA-multi focused on multihop reasoning tasks for LLMs, which complement the main dataset on hallucination detection and more reasoning-intensive tasks.

Topics & Keywords

Computer engineering. Computer hardware; Electronic computers. Computer science

Authors (7)

Geemi P Wellawatte

Huixuan Guo

Magdalena Lederbauer

Anna Borisova

Matthew Hart

Marta Brucka

Philippe Schwaller

Citation (APA)

Wellawatte, G. P., Guo, H., Lederbauer, M., Borisova, A., Hart, M., Brucka, M., & Schwaller, P. (2025). ChemLit-QA: a human evaluated dataset for chemistry RAG tasks. https://doi.org/10.1088/2632-2153/adc2d6

Journal Information

Publication Year: 2025
Source Database: DOAJ
DOI: 10.1088/2632-2153/adc2d6
Access: Open Access