Semantic Scholar, Open Access, 2019, 204 citations

Semantics Disentangling for Text-To-Image Generation

Guojun Yin, Bin Liu, Lu Sheng, Nenghai Yu, Xiaogang Wang, +1 more

Abstract

Synthesizing photo-realistic images from text descriptions is a challenging problem. Previous studies have shown remarkable progress on the visual quality of the generated images. In this paper, we consider semantics from the input text descriptions in helping render photo-realistic images. However, diverse linguistic expressions pose challenges in extracting consistent semantics even when they depict the same thing. To this end, we propose a novel photo-realistic text-to-image generation model that implicitly disentangles semantics to achieve both high-level semantic consistency and low-level semantic diversity. Specifically, we design (1) a Siamese mechanism in the discriminator to learn consistent high-level semantics, and (2) a visual-semantic embedding strategy based on semantic-conditioned batch normalization to find diverse low-level semantics. Extensive experiments and ablation studies on the CUB and MS-COCO datasets demonstrate the superiority of the proposed method in comparison to state-of-the-art methods.
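
The abstract names semantic-conditioned batch normalization without giving its formula. As a rough illustration only, the sketch below shows a generic conditional batch normalization in which a text embedding adds offsets to the usual scale and shift. It is not taken from the paper: the function names, the linear projections `w_gamma` and `w_beta`, and the flat (batch, channels) feature layout are all assumptions made for this example.

```python
import math


def linear(vec, weights):
    """Project a D-dim vector with a D x C weight matrix (no bias). Hypothetical helper."""
    return [sum(v * row[c] for v, row in zip(vec, weights))
            for c in range(len(weights[0]))]


def conditional_batch_norm(x, cond, w_gamma, w_beta, gamma, beta, eps=1e-5):
    """Batch-normalize x (N x C), then modulate each sample with its text embedding.

    cond is N x D (one embedding per sample). The embedding is projected to
    per-channel offsets that are added to the learned scale and shift.
    This is an assumed formulation, not the authors' exact one.
    """
    n, channels = len(x), len(x[0])
    # Per-channel statistics computed over the batch.
    mean = [sum(row[c] for row in x) / n for c in range(channels)]
    var = [sum((row[c] - mean[c]) ** 2 for row in x) / n for c in range(channels)]
    out = []
    for row, text in zip(x, cond):
        d_gamma = linear(text, w_gamma)  # text-dependent scale offset
        d_beta = linear(text, w_beta)    # text-dependent shift offset
        out.append([
            (gamma[c] + d_gamma[c]) * (row[c] - mean[c]) / math.sqrt(var[c] + eps)
            + beta[c] + d_beta[c]
            for c in range(channels)
        ])
    return out
```

With all projection weights set to zero the function reduces to ordinary batch normalization, so any difference between two samples' modulation comes only from their text embeddings.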

Topics & Keywords

Computer Science

Authors (6)

Guojun Yin

Bin Liu

Lu Sheng

Nenghai Yu

Xiaogang Wang

Jing Shao

Citation Format

Yin, G., Liu, B., Sheng, L., Yu, N., Wang, X., & Shao, J. (2019). Semantics Disentangling for Text-To-Image Generation. https://doi.org/10.1109/CVPR.2019.00243

Journal Information

Publication Year: 2019
Language: en
Total Citations: 204
Database Source: Semantic Scholar
DOI: 10.1109/CVPR.2019.00243
Access: Open Access ✓