arXiv Open Access 2026

LLM Novice Uplift on Dual-Use, In Silico Biology Tasks

Chen Bo Calvin Zhang, Christina Q. Knight, Nicholas Kruus, Jason Hausenloy, Pedro Medeiros, +14 others
Abstract

Large language models (LLMs) perform increasingly well on biology benchmarks, but it remains unclear whether they uplift novice users -- i.e., enable humans to perform better than with internet-only resources. This uncertainty is central to understanding both scientific acceleration and dual-use risk. We conducted a multi-model, multi-benchmark human uplift study comparing novices with LLM access versus internet-only access across eight biosecurity-relevant task sets. Participants worked on complex problems with ample time (up to 13 hours for the most involved tasks). We found that LLM access provided substantial uplift: novices with LLMs were 4.16 times more accurate than controls (95% CI [2.63, 6.87]). On four benchmarks with available expert baselines (internet-only), novices with LLMs outperformed experts on three of them. Perhaps surprisingly, standalone LLMs often exceeded LLM-assisted novices, indicating that users were not eliciting the strongest available contributions from the LLMs. Most participants (89.6%) reported little difficulty obtaining dual-use-relevant information despite safeguards. Overall, LLMs substantially uplift novices on biological tasks previously reserved for trained practitioners, underscoring the need for sustained, interactive uplift evaluations alongside traditional benchmarks.

Authors (19)

Chen Bo Calvin Zhang
Christina Q. Knight
Nicholas Kruus
Jason Hausenloy
Pedro Medeiros
Nathaniel Li
Aiden Kim
Yury Orlovskiy
Coleman Breen
Bryce Cai
Jasper Götting
Andrew Bo Liu
Samira Nedungadi
Paula Rodriguez
Yannis Yiming He
Mohamed Shaaban
Zifan Wang
Seth Donoughe
Julian Michael

Citation Format

Zhang, C.B.C., Knight, C.Q., Kruus, N., Hausenloy, J., Medeiros, P., Li, N. et al. (2026). LLM Novice Uplift on Dual-Use, In Silico Biology Tasks. https://arxiv.org/abs/2602.23329

Journal Information
Publication Year: 2026
Language: en
Source Database: arXiv
Access: Open Access