AstroLLaMA: Towards Specialized Foundation Models in Astronomy
Abstract
Large language models excel at many human-language tasks but often falter in highly specialized domains such as scholarly astronomy. To bridge this gap, we introduce AstroLLaMA, a 7-billion-parameter model fine-tuned from LLaMA-2 on more than 300,000 astronomy abstracts from arXiv. Optimized for traditional causal language modeling, AstroLLaMA achieves 30% lower perplexity than LLaMA-2, indicating marked domain adaptation. Despite having significantly fewer parameters, our model generates more insightful and scientifically relevant text completions and embeddings than state-of-the-art foundation models. AstroLLaMA serves as a robust, domain-specific model with broad fine-tuning potential. Its public release aims to spur astronomy-focused research, including automatic paper summarization and conversational agent development.
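The abstract compares models by perplexity, the standard metric for causal language modeling. As a minimal sketch (not the authors' evaluation code), perplexity is the exponential of the mean negative log-likelihood per token; the log-probability values below are hypothetical, chosen only to illustrate how a lower perplexity reflects better domain fit:

```python
import math

def perplexity(token_logprobs):
    """Perplexity = exp of the mean negative log-likelihood per token."""
    n = len(token_logprobs)
    nll = -sum(token_logprobs) / n  # average negative log-likelihood
    return math.exp(nll)

# Hypothetical per-token log-probabilities on the same astronomy text:
base_model = [-2.0, -2.5, -1.5, -2.0]     # e.g. a general-purpose LM
adapted_model = [-1.4, -1.8, -1.0, -1.4]  # e.g. a domain-adapted LM

print(perplexity(base_model))     # higher: text is "more surprising"
print(perplexity(adapted_model))  # lower: better fit to the domain
```

A model that assigns higher probability to in-domain tokens yields a lower perplexity, which is the sense in which AstroLLaMA's 30% reduction signals domain adaptation.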
Topics & Keywords
Authors (24)
Tuan Dung Nguyen
Yuan-Sen Ting
I. Ciucă
Charlie O'Neill
Ze-Chang Sun
Maja Jabłońska
S. Kruk
Ernest Perkowski
Jack William Miller
Jason Li
J. Peek
Kartheik G. Iyer
Tomasz Różański
P. Khetarpal
Sharaf Zaman
D. Brodrick
Sergio J. Rodríguez Méndez
Thang Bui
Alyssa Goodman
A. Accomazzi
J. P. Naiman
Jesse Cranney
K. Schawinski
UniverseTBD
Quick Access
- Publication Year
- 2023
- Language
- en
- Total Citations
- 40
- Database Source
- Semantic Scholar
- DOI
- 10.48550/arXiv.2309.06126
- Access
- Open Access ✓