arXiv Open Access 2024

Toward Reliable Ad-hoc Scientific Information Extraction: A Case Study on Two Materials Datasets

Satanu Ghosh, Neal R. Brodnik, Carolina Frey, Collin Holgate, Tresa M. Pollock, and 2 others

Abstract

We explore the ability of GPT-4 to perform ad-hoc, schema-based information extraction from scientific literature. Specifically, we assess whether it can, with a basic prompting approach, replicate two existing materials science datasets, given the manuscripts from which they were originally manually extracted. We employ materials scientists to perform a detailed manual error analysis to assess where the model struggles to faithfully extract the desired information, and draw on their insights to suggest research directions to address this broadly important task.

Authors (7)

Satanu Ghosh

Neal R. Brodnik

Carolina Frey

Collin Holgate

Tresa M. Pollock

Samantha Daly

Samuel Carton

Citation Format

Ghosh, S., Brodnik, N.R., Frey, C., Holgate, C., Pollock, T.M., Daly, S. et al. (2024). Toward Reliable Ad-hoc Scientific Information Extraction: A Case Study on Two Materials Datasets. https://arxiv.org/abs/2406.05348

Journal Information
Publication Year: 2024
Language: en
Source Database: arXiv
Access: Open Access