arXiv Open Access 2024

ChuXin: 1.6B Technical Report

Xiaomin Zhuang Yufan Jiang Qiaozhi He Zhihua Wu

Abstract

In this report, we present ChuXin, an entirely open-source language model with 1.6 billion parameters. Unlike the majority of works that open-source only the model weights and architecture, we have made everything needed to train the model available, including the training data, the training process, and the evaluation code. Our goal is to empower and strengthen the open research community, fostering transparency and enabling a new wave of innovation in the field of language modeling. Furthermore, we extend the context length to 1M tokens through lightweight continual pretraining and demonstrate strong needle-in-a-haystack retrieval performance. The weights for both models are available on Hugging Face for download and use.
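The needle-in-a-haystack evaluation mentioned above tests whether a long-context model can retrieve a single planted fact from an otherwise irrelevant context. The following is a minimal sketch of how such a prompt is typically constructed; the filler text, needle sentence, and parameter names are illustrative assumptions, not the authors' actual evaluation code.

```python
# Sketch of a needle-in-a-haystack prompt builder. The filler sentence,
# needle, and depth convention are illustrative assumptions; the paper's
# evaluation harness is not reproduced here.

FILLER = "The grass is green. The sky is blue. The sun is bright. "
NEEDLE = "The secret passcode is 7481."

def build_haystack_prompt(context_len_chars: int, depth: float) -> str:
    """Place NEEDLE at fractional `depth` (0.0 = start, 1.0 = end)
    within `context_len_chars` characters of filler text, then append
    the retrieval question."""
    # Tile the filler until it covers the requested context length.
    haystack = (FILLER * (context_len_chars // len(FILLER) + 1))[:context_len_chars]
    pos = int(len(haystack) * depth)
    return (
        haystack[:pos] + " " + NEEDLE + " " + haystack[pos:]
        + "\n\nQuestion: What is the secret passcode?"
    )

prompt = build_haystack_prompt(context_len_chars=2000, depth=0.5)
```

In a full evaluation, this prompt is swept over a grid of context lengths and needle depths, and the model's answer is scored for whether it recovers the needle at each (length, depth) cell.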


Citation

Zhuang, X., Jiang, Y., He, Q., & Wu, Z. (2024). ChuXin: 1.6B Technical Report. https://arxiv.org/abs/2405.04828

Journal Information

Year: 2024
Language: English
Source: arXiv
Access: Open Access