arXiv Open Access 2024

Text-Augmented Multimodal LLMs for Chemical Reaction Condition Recommendation

Yu Zhang, Ruijie Yu, Kaipeng Zeng, Ding Li, Feng Zhu, +3 others

Abstract

Identifying reaction conditions that are broadly applicable across diverse substrates is a longstanding challenge in chemical and pharmaceutical research. While many methods are available to generate conditions with acceptable performance, a universal approach for reliably discovering effective conditions during reaction exploration is rare. Consequently, current reaction optimization processes are often labor-intensive, time-consuming, and costly, relying heavily on trial-and-error experimentation. Large language models (LLMs) can now tackle chemistry-related problems such as molecule design and chemical reasoning tasks. Here, we report the design, implementation, and application of Chemma-RC, a text-augmented multimodal LLM that identifies effective conditions through task-specific dialogue and condition generation. Chemma-RC learns a unified representation of chemical reactions by aligning multiple modalities (text corpus, reaction SMILES, and reaction graphs) within a shared embedding module. Performance benchmarking on datasets showed high precision in identifying optimal conditions, with up to 17% improvement over current state-of-the-art methods. A palladium-catalysed imidazole C-H arylation reaction was investigated experimentally to evaluate the functionality of Chemma-RC in practice. Our findings suggest that Chemma-RC holds significant potential to accelerate high-throughput condition screening in chemical synthesis.

Authors (8)

Yu Zhang

Ruijie Yu

Kaipeng Zeng

Ding Li

Feng Zhu

Xiaokang Yang

Yaohui Jin

Yanyan Xu

Citation Format

Zhang, Y., Yu, R., Zeng, K., Li, D., Zhu, F., Yang, X. et al. (2024). Text-Augmented Multimodal LLMs for Chemical Reaction Condition Recommendation. https://arxiv.org/abs/2407.15141

Journal Information
Publication year: 2024
Language: en
Source database: arXiv
Access: Open Access