arXiv Open Access 2024

TinyLLaVA Factory: A Modularized Codebase for Small-scale Large Multimodal Models

Junlong Jia, Ying Hu, Xi Weng, Yiming Shi, Miao Li, +6 others

Abstract

We present TinyLLaVA Factory, an open-source modular codebase for small-scale large multimodal models (LMMs) that focuses on simple code implementation, easy extension with new features, and reproducible training results. Following the factory pattern from software engineering, TinyLLaVA Factory modularizes the entire system into interchangeable components. Each component integrates a suite of cutting-edge models and methods while leaving room for further extensions. In addition to letting users customize their own LMMs, TinyLLaVA Factory provides popular training recipes so that users can pretrain and finetune their models with less coding effort. Empirical experiments validate the effectiveness of our codebase. The goal of TinyLLaVA Factory is to help researchers and practitioners explore the wide landscape of designing and training small-scale LMMs with affordable computational resources.
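The factory pattern mentioned in the abstract can be illustrated with a minimal, generic sketch. This is not TinyLLaVA Factory's actual API: the registry layout, the `register` and `build` helpers, and the `MLPConnector` and `LinearConnector` classes are all hypothetical names chosen for illustration. The sketch only shows how a name-to-class registry lets components be swapped by changing a string rather than the calling code.

```python
from typing import Callable, Dict

# Hypothetical registry (not the project's real one): maps a component
# kind and a component name to the class that constructs it.
_REGISTRY: Dict[str, Dict[str, Callable[..., object]]] = {
    "llm": {},
    "vision_tower": {},
    "connector": {},
}


def register(kind: str, name: str):
    """Decorator that records a class under (kind, name)."""
    def decorator(cls):
        _REGISTRY[kind][name] = cls
        return cls
    return decorator


def build(kind: str, name: str, **kwargs):
    """Factory function: look up a registered class and instantiate it."""
    try:
        cls = _REGISTRY[kind][name]
    except KeyError:
        raise ValueError(f"unknown {kind}: {name!r}") from None
    return cls(**kwargs)


# Two illustrative, interchangeable components of the same kind.
@register("connector", "mlp")
class MLPConnector:
    def __init__(self, hidden_size: int = 64):
        self.hidden_size = hidden_size


@register("connector", "linear")
class LinearConnector:
    def __init__(self, hidden_size: int = 64):
        self.hidden_size = hidden_size


# Swapping the component only requires changing the name string.
connector = build("connector", "mlp", hidden_size=128)
print(type(connector).__name__, connector.hidden_size)
```

Under this assumed design, adding a new component means writing one class and decorating it, and existing call sites stay unchanged.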

Authors (11)

Junlong Jia
Ying Hu
Xi Weng
Yiming Shi
Miao Li
Xingjian Zhang
Baichuan Zhou
Ziyu Liu
Jie Luo
Lei Huang
Ji Wu

Citation Format

Jia, J., Hu, Y., Weng, X., Shi, Y., Li, M., Zhang, X. et al. (2024). TinyLLaVA Factory: A Modularized Codebase for Small-scale Large Multimodal Models. https://arxiv.org/abs/2405.11788

Journal Information
Publication Year: 2024
Language: en
Source Database: arXiv
Access: Open Access