arXiv Open Access 2024

Training-Free Style Consistent Image Synthesis with Condition and Mask Guidance in E-Commerce

Guandong Li

Abstract

Generating style-consistent images is a common task in e-commerce, and current methods are largely based on diffusion models, which have achieved excellent results. This paper introduces the concept of the QKV (query/key/value) level, which refers to modifying the attention maps (self-attention and cross-attention) when integrating the UNet with image conditions. The goal is a training-free method, guided by preset conditions, that does not disrupt the main composition of the product in e-commerce images. The method uses shared KV to enhance similarity in cross-attention and derives mask guidance from the attention map to direct the generation of style-consistent images. It has shown promising results in practical applications.
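
The two mechanisms named in the abstract, shared KV and attention-derived mask guidance, can be illustrated with a minimal NumPy sketch. This is not the paper's implementation: the abstract does not specify how keys and values are shared or how the mask is computed, so the sketch assumes that the target's queries attend over the concatenation of its own keys/values and those of a reference image, and that the mask is obtained by min-max normalizing an attention column and thresholding it. The function names (`shared_kv_attention`, `mask_from_attention`, `blend_with_mask`) and the `threshold` parameter are hypothetical.

```python
import numpy as np


def softmax(x, axis=-1):
    """Numerically stable softmax along the given axis."""
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)


def shared_kv_attention(q_tgt, k_tgt, v_tgt, k_ref, v_ref):
    """Attention in which target queries also see reference keys/values.

    ASSUMPTION: "shared KV" is modeled as concatenating the reference
    image's K/V to the target's K/V along the token axis.
    Shapes: q_tgt, k_tgt, v_tgt are (n_tgt, d); k_ref, v_ref are (n_ref, d).
    """
    k = np.concatenate([k_tgt, k_ref], axis=0)   # (n_tgt + n_ref, d)
    v = np.concatenate([v_tgt, v_ref], axis=0)   # (n_tgt + n_ref, d)
    scores = q_tgt @ k.T / np.sqrt(q_tgt.shape[-1])
    attn = softmax(scores, axis=-1)              # each row sums to 1
    return attn @ v, attn


def mask_from_attention(attn_column, threshold=0.5):
    """Binary mask from one attention column (one score per spatial token).

    ASSUMPTION: min-max normalization followed by a fixed threshold.
    """
    lo, hi = attn_column.min(), attn_column.max()
    norm = (attn_column - lo) / (hi - lo + 1e-8)
    return (norm >= threshold).astype(np.float32)


def blend_with_mask(stylized, original, mask):
    """Keep original features where mask is 1, stylized features elsewhere."""
    m = mask[:, None]
    return m * original + (1.0 - m) * stylized
```

In this sketch, the mask would mark the tokens that belong to the product, so that `blend_with_mask` leaves the product's composition untouched while the surrounding tokens take on the reference style.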

Topics & Keywords

cs.CV

Authors (1)

Guandong Li

Citation Format

Li, G. (2024). Training-Free Style Consistent Image Synthesis with Condition and Mask Guidance in E-Commerce. https://arxiv.org/abs/2409.04750

Journal Information

Publication Year: 2024
Language: en
Database Source: arXiv
Access: Open Access ✓