Semantic Scholar, Open Access, 2023, 42 citations

AesCLIP: Multi-Attribute Contrastive Learning for Image Aesthetics Assessment

Xiangfei Sheng, Leida Li, Pengfei Chen, Jinjian Wu, W. Dong, +4 others

Abstract

Image aesthetics assessment (IAA) aims to predict the aesthetic quality of images. Recently, large pre-trained vision-language models such as CLIP have shown impressive performance on various visual tasks. For IAA, a straightforward approach is to fine-tune the CLIP image encoder on aesthetic images. However, this achieves only limited success because it does not consider the uniqueness of multimodal data in the aesthetics domain. People usually assess image aesthetics according to fine-grained visual attributes, e.g., color, light and composition. Yet how to learn aesthetics-aware attributes from the CLIP-based semantic space has not been addressed before. With this motivation, this paper presents a CLIP-based multi-attribute contrastive learning framework for IAA, dubbed AesCLIP. Specifically, AesCLIP consists of two major components: aesthetic attribute-based comment classification and attribute-aware learning. The former classifies aesthetic comments into different attribute categories. The latter then learns an aesthetic attribute-aware representation by contrastive learning, aiming to mitigate the domain shift from the general visual domain to the aesthetics domain. Extensive experiments using the pre-trained AesCLIP on four popular IAA databases demonstrate its advantage over state-of-the-art methods. The source code will be made public at https://github.com/OPPOMKLab/AesCLIP.
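The abstract does not specify the exact loss used for attribute-aware learning. As a rough illustration of the general idea only, the sketch below implements a CLIP-style symmetric contrastive (InfoNCE) loss between image embeddings and embeddings of comments that belong to one attribute category. Everything in it is an assumption made for illustration: the function names, the toy 2-D embeddings, and the temperature of 0.07 are hypothetical and are not taken from the AesCLIP code. It uses only the Python standard library so that it runs without dependencies.

```python
import math


def l2_normalize(vector):
    """Scale a vector to unit length so dot products become cosine similarities."""
    norm = math.sqrt(sum(x * x for x in vector))
    return [x / norm for x in vector]


def cosine_logits(image_embs, text_embs, temperature):
    """Pairwise cosine similarities divided by the temperature."""
    images = [l2_normalize(v) for v in image_embs]
    texts = [l2_normalize(v) for v in text_embs]
    return [
        [sum(a * b for a, b in zip(img, txt)) / temperature for txt in texts]
        for img in images
    ]


def diagonal_cross_entropy(logits):
    """Mean cross-entropy where the correct class for row k is column k."""
    total = 0.0
    for k, row in enumerate(logits):
        row_max = max(row)  # subtract the max for numerical stability
        log_sum_exp = row_max + math.log(sum(math.exp(x - row_max) for x in row))
        total += log_sum_exp - row[k]
    return total / len(logits)


def symmetric_contrastive_loss(image_embs, text_embs, temperature=0.07):
    """Average of the image-to-text and text-to-image losses.

    Assumption: image_embs[k] and text_embs[k] form a matched pair, e.g. an
    image and a comment about one attribute (such as color) of that image.
    """
    logits = cosine_logits(image_embs, text_embs, temperature)
    transposed = [list(column) for column in zip(*logits)]
    return 0.5 * (diagonal_cross_entropy(logits) + diagonal_cross_entropy(transposed))


# Toy data (hypothetical): two images and two attribute-comment embeddings.
images = [[1.0, 0.0], [0.0, 1.0]]
matched_comments = [[0.9, 0.1], [0.1, 0.9]]
swapped_comments = [matched_comments[1], matched_comments[0]]

loss_matched = symmetric_contrastive_loss(images, matched_comments)
loss_swapped = symmetric_contrastive_loss(images, swapped_comments)
print(loss_matched < loss_swapped)
```

In this toy setup the loss is small when each image is paired with its own comment and large when the pairing is swapped, which is the signal a contrastive objective uses to pull matched image and comment embeddings together.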

Authors (9)

Xiangfei Sheng
Leida Li
Pengfei Chen
Jinjian Wu
W. Dong
Yuzhe Yang
Liwu Xu
Yaqian Li
Guangming Shi

Citation Format

Sheng, X., Li, L., Chen, P., Wu, J., Dong, W., Yang, Y. et al. (2023). AesCLIP: Multi-Attribute Contrastive Learning for Image Aesthetics Assessment. https://doi.org/10.1145/3581783.3611969

Journal Information
Publication Year: 2023
Language: en
Total Citations: 42
Source Database: Semantic Scholar
DOI: 10.1145/3581783.3611969
Access: Open Access