Semantic Scholar Open Access 2023 40 citations

AstroLLaMA: Towards Specialized Foundation Models in Astronomy

Tuan Dung Nguyen, Yuan-Sen Ting, I. Ciucă, Charlie O'Neill, Ze-Chang Sun, +19 others

Abstract

Large language models excel in many human-language tasks but often falter in highly specialized domains like scholarly astronomy. To bridge this gap, we introduce AstroLLaMA, a 7-billion-parameter model fine-tuned from LLaMA-2 using over 300,000 astronomy abstracts from arXiv. Optimized for traditional causal language modeling, AstroLLaMA achieves a 30% lower perplexity than LLaMA-2, showing marked domain adaptation. Our model produces more insightful and scientifically relevant text completions and embeddings than state-of-the-art foundation models despite having significantly fewer parameters. AstroLLaMA serves as a robust, domain-specific model with broad fine-tuning potential. Its public release aims to spur astronomy-focused research, including automatic paper summarization and conversational agent development.

Authors (24)

Tuan Dung Nguyen
Yuan-Sen Ting
I. Ciucă
Charlie O'Neill
Ze-Chang Sun
Maja Jabłońska
S. Kruk
Ernest Perkowski
Jack William Miller
Jason Li
J. Peek
Kartheik G. Iyer
Tomasz Różański
P. Khetarpal
Sharaf Zaman
D. Brodrick
Sergio J. Rodríguez Méndez
Thang Bui
Alyssa Goodman
A. Accomazzi
J. P. Naiman
Jesse Cranney
K. Schawinski
UniverseTBD

Citation Format

Nguyen, T.D., Ting, Y., Ciucă, I., O'Neill, C., Sun, Z., Jabłońska, M. et al. (2023). AstroLLaMA: Towards Specialized Foundation Models in Astronomy. https://doi.org/10.48550/arXiv.2309.06126

Quick Access

View at Source: doi.org/10.48550/arXiv.2309.06126
Journal Information

Publication Year: 2023
Language: en
Total Citations: 40
Database Source: Semantic Scholar
DOI: 10.48550/arXiv.2309.06126
Access: Open Access