Semantic Scholar · Open Access · 2023 · 962 citations

Baichuan 2: Open Large-scale Language Models

Ai Ming Yang, Bin Xiao, Bingning Wang, Borong Zhang, Ce Bian, +50 others

Abstract

Large language models (LLMs) have demonstrated remarkable performance on a variety of natural language tasks based on just a few examples of natural language instructions, reducing the need for extensive feature engineering. However, most powerful LLMs are closed-source or limited in their capability for languages other than English. In this technical report, we present Baichuan 2, a series of large-scale multilingual language models containing 7 billion and 13 billion parameters, trained from scratch on 2.6 trillion tokens. Baichuan 2 matches or outperforms other open-source models of similar size on public benchmarks like MMLU, CMMLU, GSM8K, and HumanEval. Furthermore, Baichuan 2 excels in vertical domains such as medicine and law. We will release all pre-training model checkpoints to benefit the research community in better understanding the training dynamics of Baichuan 2.


Authors (55)

Ai Ming Yang, Bin Xiao, Bingning Wang, Borong Zhang, Ce Bian, Chao Yin, Chenxu Lv, Da Pan, Dian Wang, Dong Yan, Fan Yang, Fei Deng, Feng Wang, Feng Liu, Guangwei Ai, Guosheng Dong, Hai Zhao, Hang Xu, Hao-Lun Sun, Hongda Zhang, Hui Liu, Jiaming Ji, Jian Xie, Juntao Dai, Kuncheng Fang, Lei Su, Liang Song, Lifeng Liu, Liyun Ru, Luyao Ma, Mang Wang, Mickel Liu, Mingan Lin, Nuolan Nie, Pei Guo, Ruiyang Sun, Zhang Tao, Tianpeng Li, Tianyu Li, Wei Cheng, Weipeng Chen, Xiangrong Zeng, Xiaochuan Wang, Xiaoxi Chen, Xin Men, Xing Yu, Xuehai Pan, Yan-Bin Shen, Yiding Wang, Yiyun Li, Youxin Jiang, Yuchen Gao, Yupeng Zhang, Zenan Zhou, Zhiying Wu

Citation Format

Yang, A.M., Xiao, B., Wang, B., Zhang, B., Bian, C., Yin, C. et al. (2023). Baichuan 2: Open Large-scale Language Models. https://doi.org/10.48550/arXiv.2309.10305

Quick Access

View at source: doi.org/10.48550/arXiv.2309.10305
Publication Information
Year Published
2023
Language
en
Total Citations
962×
Source Database
Semantic Scholar
DOI
10.48550/arXiv.2309.10305
Access
Open Access ✓