Enhancing Image Aesthetics with Dual-Conditioned Diffusion Models Guided by Multimodal Perception
Xinyu Nan, Ning Wang, Yuyao Zhai
et al.
Image aesthetic enhancement aims to perceive aesthetic deficiencies in images and perform corresponding editing operations, which is highly challenging and requires the model to possess creativity and aesthetic perception capabilities. Although recent advancements in image editing models have significantly enhanced their controllability and flexibility, they struggle with enhancing image aesthetics. The primary challenges are twofold: first, following editing instructions with aesthetic perception is difficult, and second, there is a scarcity of "perfectly-paired" images that have consistent content but distinct aesthetic qualities. In this paper, we propose Dual-supervised Image Aesthetic Enhancement (DIAE), a diffusion-based generative model with multimodal aesthetic perception. First, DIAE incorporates Multimodal Aesthetic Perception (MAP) to convert the ambiguous aesthetic instruction into explicit guidance by (i) employing detailed, standardized aesthetic instructions across multiple aesthetic attributes, and (ii) utilizing multimodal control signals derived from text-image pairs that maintain consistency within the same aesthetic attribute. Second, to mitigate the lack of "perfectly-paired" images, we collect an "imperfectly-paired" dataset called IIAEData, consisting of images with varying aesthetic qualities while sharing identical semantics. To better leverage the weak matching characteristics of IIAEData during training, a dual-branch supervision framework is also introduced for weakly supervised image aesthetic enhancement. Experimental results demonstrate that DIAE outperforms the baselines and obtains superior image aesthetic scores and image content consistency scores.
Behind the counter, behind the discourse: The paradox of pharmacist influence in Arabic women's health online
Samar J. Melhem, Hamzeh Almomani, Rimal Mousa
et al.
Background: Social media is now a major arena for Arabic women's health discourse in the MENA region, yet it is unclear how pharmacists' expertise influences both the accuracy and visibility of information across platforms. Objective: To compare pharmacists' visibility and accuracy with other author groups and to assess how platform, sentiment, and follower dynamics shape the gap between information quality and reach. Methods: We conducted a cross-sectional content analysis of 682 public Arabic-language posts on women's self-medication and over-the-counter care from Instagram, YouTube, TikTok, Threads, Facebook, and X (January 2024–March 2025). Two independent coders rated accuracy on a four-point scale and classified sentiment (κ > 0.80). Engagement was summarized using the Virtual Presence Index (VPI), an equally weighted composite of standardized likes, comments, and shares/reposts. Proportional-odds ordinal logistic regression modeled predictors of higher accuracy; a non-circular binary logistic model examined determinants of high engagement (above-median VPI) with platform, author type, sentiment, topic, and linear plus quadratic log₁₀(follower count) as covariates. Results: Pharmacists authored 49.6 % of posts; physicians and other health professionals contributed 37.1 %. Overall, 71.8 % of posts were rated accurate, rising to 94.1 % for pharmacist-authored content. Platform was the strongest predictor of accuracy: compared with Instagram, Facebook, YouTube, Threads, and X had higher odds of higher accuracy, with TikTok showing a smaller but significant advantage. Pharmacist authorship independently predicted higher accuracy, whereas follower count did not. For engagement, platform dominated. With X as the reference, all other platforms had lower adjusted odds of high VPI. Positive sentiment increased the likelihood of high VPI, and follower count showed a U-shaped association, with mid-sized accounts disadvantaged. 
After adjustment, author-type differences in visibility were modest: pharmacists' posts were more accurate but did not enjoy consistent visibility advantages, especially on highly visual, fast-scroll platforms. Conclusion: In Arabic women's health discourse online, who speaks matters less for reach than where and how they speak. Pharmacists deliver the most accurate content but often remain “invisible experts” in environments that reward aesthetics and emotion over credentials. The VPI helps quantify this quality–reach gap and can guide platform-specific, culturally attuned strategies to make evidence-based voices more discoverable.
Pharmacy and materia medica
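The Virtual Presence Index described in the abstract above (an equally weighted composite of standardized likes, comments, and shares/reposts, dichotomized at the median) can be sketched as follows. This is a minimal illustration, not the authors' code: the abstract does not say how the counts were standardized, so z-scores with the population standard deviation are an assumption, and the function names and sample counts are hypothetical.

```python
from statistics import mean, median, pstdev


def zscores(values):
    """Standardize a list of counts (assumption: z-scores, population SD)."""
    mu = mean(values)
    sd = pstdev(values)
    if sd == 0:
        # A constant column carries no information; treat every post as average.
        return [0.0 for _ in values]
    return [(v - mu) / sd for v in values]


def virtual_presence_index(likes, comments, shares):
    """Equally weighted mean of the three standardized engagement counts."""
    columns = [zscores(likes), zscores(comments), zscores(shares)]
    return [mean(per_post) for per_post in zip(*columns)]


def high_engagement(vpi):
    """Flag posts whose VPI lies strictly above the median VPI."""
    cutoff = median(vpi)
    return [v > cutoff for v in vpi]


# Hypothetical engagement counts for four posts.
likes = [10, 20, 30, 40]
comments = [1, 2, 3, 4]
shares = [0, 5, 10, 15]

vpi = virtual_presence_index(likes, comments, shares)
flags = high_engagement(vpi)
print(vpi, flags)
```

The resulting flags would serve as the binary outcome of the engagement model; because the cutoff is the sample median, roughly half of the posts are labeled high-engagement by construction.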
Anti-Aesthetics: Protecting Facial Privacy against Customized Text-to-Image Synthesis
Songping Wang, Yueming Lyu, Shiqi Liu
et al.
The rise of customized diffusion models has spurred a boom in personalized visual content creation, but also poses risks of malicious misuse, severely threatening personal privacy and copyright protection. Some studies show that the aesthetic properties of images are highly positively correlated with human perception of image quality. Inspired by this, we approach the problem from a novel and intriguing aesthetic perspective to degrade the generation quality of maliciously customized models, thereby achieving better protection of facial identity. Specifically, we propose a Hierarchical Anti-Aesthetic (HAA) framework to fully explore aesthetic cues, which consists of two key branches: 1) Global Anti-Aesthetics: By establishing a global anti-aesthetic reward mechanism and a global anti-aesthetic loss, it can degrade the overall aesthetics of the generated content; 2) Local Anti-Aesthetics: A local anti-aesthetic reward mechanism and a local anti-aesthetic loss are designed to guide adversarial perturbations to disrupt local facial identity. By seamlessly integrating both branches, our HAA effectively achieves the goal of anti-aesthetics from a global to a local level during customized generation. Extensive experiments show that HAA substantially outperforms existing SOTA methods in identity removal, providing a powerful tool for protecting facial privacy and copyright.
LAPIS: A novel dataset for personalized image aesthetic assessment
Anne-Sofie Maerten, Li-Wei Chen, Stefanie De Winter
et al.
We present the Leuven Art Personalized Image Set (LAPIS), a novel dataset for personalized image aesthetic assessment (PIAA). It is the first dataset with images of artworks that is suitable for PIAA. LAPIS consists of 11,723 images and was meticulously curated in collaboration with art historians. Each image has an aesthetics score and a set of image attributes known to relate to aesthetic appreciation. Besides rich image attributes, LAPIS offers rich personal attributes of each annotator. We implemented two existing state-of-the-art PIAA models and assessed their performance on LAPIS. We assess the contribution of personal attributes and image attributes through ablation studies and find that performance deteriorates when certain personal and image attributes are removed. An analysis of failure cases reveals that both existing models make similar incorrect predictions, highlighting the need for improvements in artistic image aesthetic assessment. The LAPIS project page can be found at: https://github.com/Anne-SofieMaerten/LAPIS
Diffusion-based Facial Aesthetics Enhancement with 3D Structure Guidance
Lisha Li, Jingwen Hou, Weide Liu
et al.
Facial Aesthetics Enhancement (FAE) aims to improve facial attractiveness by adjusting the structure and appearance of a facial image while preserving its identity as much as possible. Most existing methods adopted deep feature-based or score-based guidance for generation models to conduct FAE. Although these methods achieved promising results, they potentially produced excessively beautified results with lower identity consistency or insufficiently improved facial attractiveness. To enhance facial aesthetics with less loss of identity, we propose the Nearest Neighbor Structure Guidance based on Diffusion (NNSG-Diffusion), a diffusion-based FAE method that beautifies a 2D facial image with 3D structure guidance. Specifically, we propose to extract FAE guidance from a nearest neighbor reference face. To allow for less change of facial structures in the FAE process, a 3D face model is recovered by referring to both the matched 2D reference face and the 2D input face, so that the depth and contour guidance can be extracted from the 3D face model. Then the depth and contour clues can provide effective guidance to Stable Diffusion with ControlNet for FAE. Extensive experiments demonstrate that our method is superior to previous relevant methods in enhancing facial aesthetics while preserving facial identity.
Modeling Aesthetic Preferences in 3D Shapes: A Large-Scale Paired Comparison Study Across Object Categories
Kapil Dev
Human aesthetic preferences for 3D shapes are central to industrial design, virtual reality, and consumer product development. However, most computational models of 3D aesthetics lack empirical grounding in large-scale human judgments, limiting their practical relevance. We present a large-scale study of human preferences. We collected 22,301 pairwise comparisons across five object categories (chairs, tables, mugs, lamps, and dining chairs) via Amazon Mechanical Turk. Building on a previously published dataset~\cite{dev2020learning}, we introduce new non-linear modeling and cross-category analysis to uncover the geometric drivers of aesthetic preference. We apply the Bradley-Terry model to infer latent aesthetic scores and use Random Forests with SHAP analysis to identify and interpret the most influential geometric features (e.g., symmetry, curvature, compactness). Our cross-category analysis reveals both universal principles and domain-specific trends in aesthetic preferences. We focus on human interpretable geometric features to ensure model transparency and actionable design insights, rather than relying on black-box deep learning approaches. Our findings bridge computational aesthetics and cognitive science, providing practical guidance for designers and a publicly available dataset to support reproducibility. This work advances the understanding of 3D shape aesthetics through a human-centric, data-driven framework.
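To make the Bradley-Terry step in the abstract above concrete, here is a minimal sketch that infers latent scores from a matrix of pairwise wins. The abstract does not state how the model was fitted; the minorization-maximization (MM) update used here is a standard choice and is an assumption, as are the function name and the toy win counts.

```python
def bradley_terry(wins, max_iter=500, tol=1e-10):
    """Fit Bradley-Terry strengths with the MM update.

    wins[i][j] is the number of times item i was preferred over item j.
    Returns strengths normalized to sum to 1 (larger means more preferred).
    """
    n = len(wins)
    p = [1.0 / n] * n
    for _ in range(max_iter):
        new_p = []
        for i in range(n):
            total_wins = sum(wins[i][j] for j in range(n) if j != i)
            # Each pair contributes (comparisons between i and j) / (p_i + p_j).
            denom = sum(
                (wins[i][j] + wins[j][i]) / (p[i] + p[j])
                for j in range(n)
                if j != i
            )
            new_p.append(total_wins / denom if denom > 0 else p[i])
        scale = sum(new_p)
        new_p = [x / scale for x in new_p]
        converged = max(abs(a - b) for a, b in zip(p, new_p)) < tol
        p = new_p
        if converged:
            break
    return p


# Hypothetical counts: three shapes, each pair compared 10 times.
wins = [
    [0, 7, 9],
    [3, 0, 6],
    [1, 4, 0],
]
scores = bradley_terry(wins)
print(scores)
```

Under the model, the probability that shape i is preferred over shape j is scores[i] / (scores[i] + scores[j]), so the fitted strengths can be used directly as regression targets for a downstream feature model. The sketch assumes every item wins at least one comparison; items with zero wins are driven to a strength of zero.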
Bridging Cognitive Gap: Hierarchical Description Learning for Artistic Image Aesthetics Assessment
Henglin Liu, Nisha Huang, Chang Liu
et al.
The aesthetic quality assessment task is crucial for developing a human-aligned quantitative evaluation system for AIGC. However, its inherently complex nature, spanning visual perception, cognition, and emotion, poses fundamental challenges. Although aesthetic descriptions offer a viable representation of this complexity, two critical challenges persist: (1) data scarcity and imbalance: existing datasets focus excessively on visual perception and neglect deeper dimensions because manual annotation is expensive; and (2) model fragmentation: current visual networks isolate aesthetic attributes with multi-branch encoders, while multimodal methods represented by contrastive learning struggle to effectively process long-form textual descriptions. To resolve challenge (1), we first present the Refined Aesthetic Description (RAD) dataset, a large-scale (70k), multi-dimensional structured dataset that is generated via an iterative pipeline without heavy annotation costs and is easy to scale. To address challenge (2), we propose ArtQuant, an aesthetics assessment framework for artistic images which not only couples isolated aesthetic dimensions through joint description generation, but also better models long-text semantics with the help of LLM decoders. Besides, theoretical analysis confirms this symbiosis: RAD's semantic adequacy (data) and generation paradigm (model) collectively minimize prediction entropy, providing mathematical grounding for the framework. Our approach achieves state-of-the-art performance on several datasets while requiring only 33% of conventional training epochs, narrowing the cognitive gap between artistic images and aesthetic judgment. We will release both code and dataset to support future research.
A Relational (Re)Turn: Revisit Interactive Art through Interaction and Aesthetics
Aven-Le Zhou
This paper revisits the concept of interaction in interactive art, tracing its evolution from sociocultural origins to its narrowing within human-computer paradigms. It critiques this reduction and proposes a relational (re)turn by reclaiming interaction as intersubjective and relational. Through a synthesis of aesthetic theories and case studies from Ars Electronica, the paper introduces Techno Relational Aesthetics, a new conceptual lens that emphasizes technologically mediated relationality. This approach expands interactive art beyond audience-artwork interaction and opens up the possibility of broader relational practices.
Salar Mameni (2023). Terracene: A Crude Aesthetics
Jasmine Sanau
Motion pictures, Philosophy (General)
Prolegomena to a Post-Aesthetics of Artificial Imaginations
Philippe Boisnard
The acceleration of the use of generative artificial intelligences (AI), since 2015 and the turning point operated by DeepDream, tends to obscure a real analysis of what could be defined as artificial imagination. AIs are either reduced to simple instruments or thought of according to a form of techno-theologism. Our research tends to suspend any form of judgment in order to phenomenally grasp the emergence of these AIs. By taking up the question of Hegel's aesthetics and of art as the free production of the mind, but by moving it towards the question of generative AIs and therefore of a post-aesthetics, this article will show the phenomenal specificity of images generated by AI.
APDDv2: Aesthetics of Paintings and Drawings Dataset with Artist Labeled Scores and Comments
Xin Jin, Qianqian Qiao, Yi Lu
et al.
Datasets play a pivotal role in training visual models, facilitating the development of abstract understandings of visual features through diverse image samples and multidimensional attributes. However, in the realm of aesthetic evaluation of artistic images, datasets remain relatively scarce. Existing painting datasets are often characterized by limited scoring dimensions and insufficient annotations, thereby constraining the advancement and application of automatic aesthetic evaluation methods in the domain of painting. To bridge this gap, we introduce the Aesthetics Paintings and Drawings Dataset (APDD), the first comprehensive collection of paintings encompassing 24 distinct artistic categories and 10 aesthetic attributes. Building upon the initial release of APDDv1, our ongoing research has identified opportunities for enhancement in data scale and annotation precision. Consequently, APDDv2 boasts an expanded image corpus and improved annotation quality, featuring detailed language comments to better cater to the needs of both researchers and practitioners seeking high-quality painting datasets. Furthermore, we present an updated version of the Art Assessment Network for Specific Painting Styles, denoted as ArtCLIP. Experimental validation demonstrates the superior performance of this revised model in the realm of aesthetic evaluation, surpassing its predecessor in accuracy and efficacy. The dataset and model are available at https://github.com/BestiVictory/APDDv2.git.
Reflecting on beauty: the aesthetics of mathematical discovery
Filip D. Jevtić, Jovana Kostić, Katarina Maksimović
Mathematical research is often motivated by the desire to reach a beautiful result or to prove it in an elegant way. A mathematician's work is thus strongly influenced by their aesthetic judgments. However, the criteria these judgments are based on remain unclear. In this article, we focus on the concept of mathematical beauty, as one of the central aesthetic concepts in mathematics. We argue that beauty in mathematics reveals connections between apparently unrelated problems or areas and allows a better and wider insight into mathematical reality as a whole. We also explain the close relationship between beauty and other important notions such as depth, elegance, simplicity, and fruitfulness.
Text2QR: Harmonizing Aesthetic Customization and Scanning Robustness for Text-Guided QR Code Generation
Guangyang Wu, Xiaohong Liu, Jun Jia
et al.
In the digital era, QR codes serve as a linchpin connecting virtual and physical realms. Their pervasive integration across various applications highlights the demand for aesthetically pleasing codes without compromised scannability. However, prevailing methods grapple with the intrinsic challenge of balancing customization and scannability. Notably, stable-diffusion models have ushered in an epoch of high-quality, customizable content generation. This paper introduces Text2QR, a pioneering approach leveraging these advancements to address a fundamental challenge: concurrently achieving user-defined aesthetics and scanning robustness. To ensure stable generation of aesthetic QR codes, we introduce the QR Aesthetic Blueprint (QAB) module, generating a blueprint image exerting control over the entire generation process. Subsequently, the Scannability Enhancing Latent Refinement (SELR) process refines the output iteratively in the latent space, enhancing scanning robustness. This approach harnesses the potent generation capabilities of stable-diffusion models, navigating the trade-off between image aesthetics and QR code scannability. Our experiments demonstrate the seamless fusion of visual appeal with the practical utility of aesthetic QR codes, markedly outperforming prior methods. Code is available at https://github.com/mulns/Text2QR
Full-thickness skin graft versus split-thickness skin graft for radial forearm free flap donor site closure: protocol for a systematic review and meta-analysis
Jasper J.E. Moors, Zhibin Xu, Kunpeng Xie
et al.
Background: The radial forearm free flap (RFFF) serves as a workhorse for a variety of reconstructions. Although there are a variety of surgical techniques for donor site closure after RFFF raising, the most common techniques are closure using a split-thickness skin graft (STSG) or a full-thickness skin graft (FTSG). The closure can result in wound complications and functional and aesthetic compromise of the forearm and hand. The aim of the planned systematic review and meta-analysis is to compare the wound-related, function-related and aesthetics-related outcomes associated with full-thickness skin grafts (FTSG) and split-thickness skin grafts (STSG) in radial forearm free flap (RFFF) donor site closure. Methods: A systematic review and meta-analysis will be conducted. The Preferred Reporting Items for Systematic Reviews and Meta-Analyses (PRISMA) guidelines will be followed. Electronic databases and platforms (PubMed, Embase, Scopus, Web of Science, Cochrane Central Register of Controlled Trials (CENTRAL), China National Knowledge Infrastructure (CNKI)) and clinical trial registries (ClinicalTrials.gov, the German Clinical Trials Register, the ISRCTN registry, the International Clinical Trials Registry Platform) will be searched using predefined search terms until 15 January 2024. A rerun of the search will be carried out within 12 months before publication of the review. Eligible studies should report on the occurrence of donor site complications after raising an RFFF and closure of the defect. Included closure techniques are techniques that use full-thickness skin grafts and split-thickness skin grafts. Excluded techniques for closure are primary wound closure without the use of skin graft. Outcomes are considered wound-, functional-, and aesthetics-related. Studies that will be included are randomized controlled trials (RCTs) and prospective and retrospective comparative cohort studies. 
Case-control studies, studies without a control group, animal studies and cadaveric studies will be excluded. Screening will be performed in a blinded fashion by two reviewers per study. A third reviewer resolves discrepancies. The risk of bias in the original studies will be assessed using the ROBINS-I and RoB 2 tools. Data synthesis will be done using Review Manager (RevMan) 5.4.1. If appropriate, a meta-analysis will be conducted. Between-study variability will be assessed using the I² index. If necessary, R will be used. The quality of evidence for outcomes will eventually be assessed using the Grading of Recommendations Assessment, Development and Evaluation (GRADE) approach. Discussion: This study's findings may help us understand both closure techniques' complication rates and may have important implications for developing future guidelines for RFFF donor site management. If available data is limited and several questions remain unanswered, additional comparative studies will be needed. Systematic review registration: The protocol was developed in line with the PRISMA-P extension for protocols and was registered with the International Prospective Register of Systematic Reviews (PROSPERO) on 17 September 2023 (registration number CRD42023351903).
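For readers unfamiliar with the heterogeneity index named in the protocol above, the following sketch computes it from per-study effect estimates and their variances using Cochran's Q with inverse-variance weights. The protocol itself plans to use RevMan (and R if necessary); this Python function, its name, and the example numbers are illustrative assumptions only.

```python
def i_squared(effects, variances):
    """Heterogeneity index in percent, from Cochran's Q.

    effects: per-study effect estimates (e.g. log odds ratios).
    variances: the corresponding within-study variances.
    Uses I2 = max(0, (Q - df) / Q) * 100 with df = k - 1.
    """
    weights = [1.0 / v for v in variances]
    pooled = sum(w * y for w, y in zip(weights, effects)) / sum(weights)
    q = sum(w * (y - pooled) ** 2 for w, y in zip(weights, effects))
    df = len(effects) - 1
    if q <= 0:
        # Identical effects: no observable between-study variability.
        return 0.0
    return max(0.0, (q - df) / q) * 100.0


# Hypothetical two-study example: Q = 5 with df = 1.
example = i_squared([0.0, 1.0], [0.1, 0.1])
print(example)
```

The index expresses the share of total variation across studies that is attributed to heterogeneity rather than chance; it is truncated at zero when Q falls below its degrees of freedom.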
The Influence of Post-Internet on the Aesthetics of Vaporwave
Vygintas Orlovas
While the cultural phenomenon known as vaporwave is commonly traced back to 2009, its defining characteristics remain a subject of ongoing debate within both popular culture and academic circles. Various perspectives categorize it as a microgenre of electronic music, a meme, an art movement, a critique of capitalism, or even a manifestation of pure aesthetics. As such, vaporwave remains a complex and multifaceted topic for research.
In this paper, I explore the influence of post-internet culture on the formation of vaporwave and its aesthetics by analyzing the methods and strategies used to create what is recognized as vaporwave, rather than attempting to label or define it precisely. As a further step in this inquiry, I document an attempt to apply these methods and strategies, resulting in the publication of four music albums. This practice-based approach to analyzing vaporwave through creation and publication helps to better understand some core qualities and aesthetics of this art movement.
Visual arts, History of the arts
Impressions: Understanding Visual Semiotics and Aesthetic Impact
Julia Kruk, Caleb Ziems, Diyi Yang
Is aesthetic impact different from beauty? Is visual salience a reflection of its capacity for effective communication? We present Impressions, a novel dataset through which to investigate the semiotics of images, and how specific visual features and design choices can elicit specific emotions, thoughts and beliefs. We posit that the impactfulness of an image extends beyond formal definitions of aesthetics, to its success as a communicative act, where style contributes as much to meaning formation as the subject matter. However, prior image captioning datasets are not designed to empower state-of-the-art architectures to model potential human impressions or interpretations of images. To fill this gap, we design an annotation task heavily inspired by image analysis techniques in the Visual Arts to collect 1,440 image-caption pairs and 4,320 unique annotations exploring impact, pragmatic image description, impressions, and aesthetic design choices. We show that existing multimodal image captioning and conditional generation models struggle to simulate plausible human responses to images. However, this dataset significantly improves their ability to model impressions and aesthetic evaluations of images through fine-tuning and few-shot adaptation.
Multi-task convolutional neural network for image aesthetic assessment
Derya Soydaner, Johan Wagemans
As people's aesthetic preferences for images are far from understood, image aesthetic assessment is a challenging artificial intelligence task. The range of factors underlying this task is almost unlimited, but we know that some aesthetic attributes affect those preferences. In this study, we present a multi-task convolutional neural network that takes into account these attributes. The proposed neural network jointly learns the attributes along with the overall aesthetic scores of images. This multi-task learning framework allows for effective generalization through the utilization of shared representations. Our experiments demonstrate that the proposed method outperforms the state-of-the-art approaches in predicting overall aesthetic scores for images in one benchmark of image aesthetics. We achieve near-human performance in terms of overall aesthetic scores when considering the Spearman's rank correlations. Moreover, our model pioneers the application of multi-tasking in another benchmark, serving as a new baseline for future research. Notably, our approach achieves this performance while using fewer parameters compared to existing multi-task neural networks in the literature, and consequently makes our method more efficient in terms of computational complexity.
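The evaluation metric mentioned in the abstract above, Spearman's rank correlation between predicted and ground-truth aesthetic scores, can be computed as the Pearson correlation of the two rank vectors. The sketch below is a dependency-free illustration with hypothetical function names and scores; it is not taken from the paper.

```python
def average_ranks(values):
    """Rank values from 1..n, giving tied values the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with values[order[i]].
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def spearman(x, y):
    """Spearman's rho: Pearson correlation of the rank vectors.

    Raises ZeroDivisionError if either input is constant.
    """
    rx, ry = average_ranks(x), average_ranks(y)
    mx, my = sum(rx) / len(rx), sum(ry) / len(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    var_x = sum((a - mx) ** 2 for a in rx)
    var_y = sum((b - my) ** 2 for b in ry)
    return cov / (var_x * var_y) ** 0.5


# Hypothetical predicted vs. human mean scores for five images.
predicted = [5.1, 6.3, 4.2, 7.0, 5.8]
human = [5.0, 6.0, 4.5, 6.8, 6.1]
rho = spearman(predicted, human)
print(rho)
```

Because only ranks matter, the metric rewards a model for ordering images correctly even when its predicted scores are on a different scale from the human ratings.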
Health professionals’ and beauty therapists’ perspectives on female genital cosmetic surgery: an interview study
Maggie Kirkman, Amy Dobson, Karalyn McDonald
et al.
Background: Female genital cosmetic surgery (FGCS) changes the structure and appearance of healthy external genitalia. We aimed to identify discourses that help explain and rationalise FGCS and to derive from them possibilities for informing clinical education. Methods: We interviewed 16 health professionals and 5 non-health professionals who deal with women’s bodies using a study-specific semi-structured interview guide. We analysed transcripts using a three-step iterative process: identifying themes relevant to indications for FGCS, identifying the discourses within which they were positioned, and categorising and theorising discourses. Results: We identified discourses that we categorised within four themes: Diversity and the Normal Vulva (diversity was both acknowledged and rejected); Indications for FGCS (Functional, Psychological, Appearance); Ethical Perspectives; and Reasons Women Seek FGCS (Pubic Depilation, Media Representation, Pornography, Advertising Regulations, Social Pressure, Genital Unfamiliarity). Conclusions: Vulvar aesthetics constitute a social construct to which medical practice and opinion contribute and by which they are influenced; education and reform need to occur on all fronts. Resources that not only establish genital diversity but also challenge limited vulvar aesthetics could be developed in consultation with women, healthcare practitioners, mental health specialists, and others with knowledge of social constructs of women’s bodies.
Gynecology and obstetrics, Public aspects of medicine
Derek Jarman’s Tempest, William Shakespeare’s Salò
Tomas Elliott
This article re-evaluates Derek Jarman’s adaptation of William Shakespeare’s The Tempest (1979) based on archival research into the cinematic and historical intertexts that influenced the film. Specifically, it focuses on the impact of Pier Paolo Pasolini on Jarman’s aesthetics, particularly the Italian filmmaker’s last work: Salò, or the 120 Days of Sodom (1975). The article explores how Jarman used Pasolini’s work as a filter through which to frame his adaptation of Shakespeare’s play. In so doing, he produced a decidedly Pasolinian twist on The Tempest, which he explicitly referred to in his notes as “Shakespeare’s Salò.” Bridging the gap between the Renaissance and Jarman’s contemporary moment, Jarman’s film offers a meditation on ideas of captivity and captivation in The Tempest, which extends from the play and film’s literal representations of imprisonment to their exploration of the affective power of performance and spectacle.
History of scholarship and learning. The humanities
Usability and Aesthetics: Better Together for Automated Repair of Web Pages
Thanh Le-Cong, Xuan Bach D. Le, Quyet-Thang Huynh
et al.
With the recent explosive growth of mobile devices such as smartphones or tablets, guaranteeing consistent web appearance across all environments has become a significant problem. This happens simply because it is hard to keep track of the web appearance on the different sizes and types of devices that render the web pages. Therefore, fixing the inconsistent appearance of web pages can be difficult, and the cost incurred can be huge, e.g., poor user experience and the resulting financial loss. Recently, automated web repair techniques have been proposed to automatically resolve inconsistent web page appearance, focusing on improving usability. However, generated patches tend to disrupt the webpage's layout, rendering the repaired webpage aesthetically unpleasing, e.g., distorted images or misalignment of components. In this paper, we propose an automated repair approach for web pages based on meta-heuristic algorithms that can assure both usability and aesthetics. The key novelty that empowers our approach is a novel fitness function that allows us to optimistically evolve buggy web pages to find the best solution that optimizes both usability and aesthetics at the same time. Empirical evaluations show that our approach is able to successfully resolve mobile-friendly problems in 94% of the evaluation subjects, significantly outperforming state-of-the-art baseline techniques in terms of both usability and aesthetics.