arXiv Open Access 2025

Stronger Baselines for Retrieval-Augmented Generation with Long-Context Language Models

Alex Laitenberger Christopher D. Manning Nelson F. Liu

Abstract

With the rise of long-context language models (LMs) capable of processing tens of thousands of tokens in a single context window, do multi-stage retrieval-augmented generation (RAG) pipelines still offer measurable benefits over simpler, single-stage approaches? To assess this question, we conduct a controlled evaluation for QA tasks under systematically scaled token budgets, comparing two recent multi-stage pipelines, ReadAgent and RAPTOR, against three baselines, including DOS RAG (Document's Original Structure RAG), a simple retrieve-then-read method that preserves original passage order. Despite its straightforward design, DOS RAG consistently matches or outperforms more intricate methods on multiple long-context QA benchmarks. We trace this strength to a combination of maintaining source fidelity and document structure, prioritizing recall within effective context windows, and favoring simplicity over added pipeline complexity. We recommend establishing DOS RAG as a simple yet strong baseline for future RAG evaluations, paired with state-of-the-art embedding and language models, and benchmarked under matched token budgets, to ensure that added pipeline complexity is justified by clear performance gains as models continue to improve.
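The abstract's key idea, DOS RAG, is a retrieve-then-read pipeline that ranks passages by relevance but presents the retained ones in their original document order. A minimal sketch of that reordering step is below; the word-overlap scorer is only a stand-in for a real embedding model, and all function names are illustrative rather than from the paper's code.

```python
# Sketch of DOS RAG's core step: retrieve top-k passages by relevance,
# then restore original document order before building the LM context.

def _words(text):
    # Toy tokenizer: lowercase words with trailing punctuation stripped.
    return {w.strip(".,;:!?") for w in text.lower().split()}

def score(query, passage):
    # Stand-in relevance scorer: fraction of query words found in the passage.
    # A real system would use a state-of-the-art embedding model here.
    q = _words(query)
    return len(q & _words(passage)) / (len(q) or 1)

def dos_rag_context(passages, query, k=3):
    # 1) Rank passage indices by relevance to the query.
    ranked = sorted(range(len(passages)),
                    key=lambda i: score(query, passages[i]),
                    reverse=True)
    # 2) Keep the top-k, but re-sort by original position so the LM
    #    sees the passages in the document's original structure.
    kept = sorted(ranked[:k])
    return "\n\n".join(passages[i] for i in kept)
```

The reordering in step 2 is what distinguishes this sketch from a plain retrieve-then-read baseline, which would concatenate passages in relevance order instead.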

Authors (3)

Alex Laitenberger
Christopher D. Manning
Nelson F. Liu

Citation Format

Laitenberger, A., Manning, C.D., Liu, N.F. (2025). Stronger Baselines for Retrieval-Augmented Generation with Long-Context Language Models. https://arxiv.org/abs/2506.03989

Journal Information

Year Published
2025
Language
en
Source Database
arXiv
Access
Open Access ✓