arXiv Open Access 2024

OpenScholar: Synthesizing Scientific Literature with Retrieval-augmented LMs

Akari Asai, Jacqueline He, Rulin Shao, Weijia Shi, Amanpreet Singh, +20 others

Abstract

Scientific progress depends on researchers' ability to synthesize the growing body of literature. Can large language models (LMs) assist scientists in this task? We introduce OpenScholar, a specialized retrieval-augmented LM that answers scientific queries by identifying relevant passages from 45 million open-access papers and synthesizing citation-backed responses. To evaluate OpenScholar, we develop ScholarQABench, the first large-scale multi-domain benchmark for literature search, comprising 2,967 expert-written queries and 208 long-form answers across computer science, physics, neuroscience, and biomedicine. On ScholarQABench, OpenScholar-8B outperforms GPT-4o by 5% and PaperQA2 by 7% in correctness, despite being a smaller, open model. While GPT-4o hallucinates citations 78 to 90% of the time, OpenScholar achieves citation accuracy on par with human experts. OpenScholar's datastore, retriever, and self-feedback inference loop also improve off-the-shelf LMs: for instance, OpenScholar-GPT4o improves GPT-4o's correctness by 12%. In human evaluations, experts preferred OpenScholar-8B and OpenScholar-GPT4o responses over expert-written ones 51% and 70% of the time, respectively, compared to GPT-4o's 32%. We open-source all of our code, models, datastore, and data, and release a public demo.

Authors (25)

Akari Asai
Jacqueline He
Rulin Shao
Weijia Shi
Amanpreet Singh
Joseph Chee Chang
Kyle Lo
Luca Soldaini
Sergey Feldman
Mike D'arcy
David Wadden
Matt Latzke
Minyang Tian
Pan Ji
Shengyan Liu
Hao Tong
Bohao Wu
Yanyu Xiong
Luke Zettlemoyer
Graham Neubig
Dan Weld
Doug Downey
Wen-tau Yih
Pang Wei Koh
Hannaneh Hajishirzi

Citation Format

Asai, A., He, J., Shao, R., Shi, W., Singh, A., Chang, J.C. et al. (2024). OpenScholar: Synthesizing Scientific Literature with Retrieval-augmented LMs. https://arxiv.org/abs/2411.14199

Journal Information
Publication Year: 2024
Language: en
Source Database: arXiv
Access: Open Access ✓