arXiv Open Access 2024

ChemDFM-X: Towards Large Multimodal Model for Chemistry

Zihan Zhao, Bo Chen, Jingpiao Li, Lu Chen, Liyang Wen, +8 others

Abstract

Rapid developments in AI tools are expected to offer unprecedented assistance to research in the natural sciences, including chemistry. However, neither existing unimodal, task-specific specialist models nor emerging general large multimodal models (LMMs) can cover the wide range of chemical data modalities and task categories. To address the real demands of chemists, a cross-modal Chemical General Intelligence (CGI) system that serves as a practical and useful research assistant and exploits the potential of LMMs is greatly needed. In this work, we introduce the first Cross-modal Dialogue Foundation Model for Chemistry (ChemDFM-X). Diverse multimodal data are generated from an initial modality by approximate calculations and task-specific model predictions. This strategy creates a sufficient chemical training corpus while significantly reducing cost, resulting in an instruction-tuning dataset containing 7.6M data samples. After instruction fine-tuning, ChemDFM-X is evaluated in extensive experiments on different chemical tasks with various data modalities. The results demonstrate the capacity of ChemDFM-X for multimodal and inter-modal knowledge comprehension. ChemDFM-X marks a significant milestone toward aligning all modalities in chemistry, a step closer to CGI.

Authors (13)

Zihan Zhao
Bo Chen
Jingpiao Li
Lu Chen
Liyang Wen
Pengyu Wang
Zichen Zhu
Danyang Zhang
Ziping Wan
Yansi Li
Zhongyang Dai
Xin Chen
Kai Yu

Citation Format

Zhao, Z., Chen, B., Li, J., Chen, L., Wen, L., Wang, P. et al. (2024). ChemDFM-X: Towards Large Multimodal Model for Chemistry. https://arxiv.org/abs/2409.13194

Journal Information
Publication Year: 2024
Language: en
Database Source: arXiv
Access: Open Access