arXiv Open Access 2024

GLDesigner: Leveraging Multi-Modal LLMs as Designer for Enhanced Aesthetic Text Glyph Layouts

Junwen He, Yifan Wang, Lijun Wang, Huchuan Lu, Jun-Yan He, +5 others

Abstract

Text logo design heavily relies on the creativity and expertise of professional designers, in which arranging element layouts is one of the most important procedures. However, this specific task has received limited attention, often overshadowed by broader layout generation tasks such as document or poster design. In this paper, we propose a Vision-Language Model (VLM)-based framework that generates content-aware text logo layouts by integrating multi-modal inputs with user-defined constraints, enabling more flexible and robust layout generation for real-world applications. We introduce two model techniques that reduce the computational cost of processing multiple glyph images simultaneously, without compromising performance. To support instruction tuning of our model, we construct two extensive text logo datasets that are five times larger than existing public datasets. In addition to geometric annotations (e.g., text masks and character recognition), our datasets include detailed layout descriptions in natural language, enabling the model to reason more effectively when handling complex designs and custom user inputs. Experimental results demonstrate the effectiveness of our proposed framework and datasets, outperforming existing methods on various benchmarks that assess geometric aesthetics and human preferences.

Authors (10)

Junwen He
Yifan Wang
Lijun Wang
Huchuan Lu
Jun-Yan He
Chenyang Li
Hanyuan Chen
Jin-Peng Lan
Bin Luo
Yifeng Geng

Citation Format

He, J., Wang, Y., Wang, L., Lu, H., He, J., Li, C. et al. (2024). GLDesigner: Leveraging Multi-Modal LLMs as Designer for Enhanced Aesthetic Text Glyph Layouts. https://arxiv.org/abs/2411.11435

Journal Information
Publication Year
2024
Language
en
Source Database
arXiv
Access
Open Access ✓