Results for "Visual arts"

Showing 20 of ~3377934 results · from DOAJ, arXiv, Semantic Scholar, CrossRef

arXiv Open Access 2026
Benchmarking Visual Feature Representations for LiDAR-Inertial-Visual Odometry Under Challenging Conditions

Eunseon Choi, Junwoo Hong, Daehan Lee et al.

Accurate localization in autonomous driving is critical for successful missions including environmental mapping and survivor searches. In visually challenging environments, including low-light conditions, overexposure, illumination changes, and high parallax, the performance of conventional visual odometry methods degrades significantly, undermining robust robotic navigation. Researchers have recently proposed LiDAR-inertial-visual odometry (LIVO) frameworks that integrate LiDAR, IMU, and camera sensors to address these challenges. This paper extends the FAST-LIVO2-based framework by introducing a hybrid approach that integrates direct photometric methods with descriptor-based feature matching. For the descriptor-based feature matching, this work proposes four pairings: ORB with the Hamming distance, SuperPoint with SuperGlue, SuperPoint with LightGlue, and XFeat with the mutual nearest neighbor. The proposed configurations are benchmarked by accuracy, computational cost, and feature tracking stability, enabling a quantitative comparison of the adaptability and applicability of visual descriptors. The experimental results reveal that the proposed hybrid approach outperforms the conventional sparse-direct method. Although the sparse-direct method often fails to converge in regions where photometric inconsistency arises due to illumination changes, the proposed approach still maintains robust performance under the same conditions. Furthermore, the hybrid approach with learning-based descriptors enables robust and reliable visual state estimation across challenging environments.
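
The mutual nearest-neighbor criterion named in the abstract can be sketched as follows. This is a minimal illustration and not the paper's implementation: the function names `hamming` and `mutual_nearest_neighbors` are hypothetical, descriptors are modeled as Python integers standing in for binary (ORB-style) descriptors, and the abstract itself pairs mutual nearest neighbor with XFeat rather than with the Hamming distance.

```python
def hamming(a: int, b: int) -> int:
    """Number of differing bits between two binary descriptors."""
    return bin(a ^ b).count("1")


def mutual_nearest_neighbors(desc_a, desc_b, dist=hamming):
    """Return index pairs (i, j) where j is the nearest neighbor of i in
    desc_b AND i is the nearest neighbor of j in desc_a."""
    if not desc_a or not desc_b:
        return []
    # Nearest neighbor in B for every descriptor in A (ties go to the lowest index).
    nearest_ab = [
        min(range(len(desc_b)), key=lambda j: dist(a, desc_b[j])) for a in desc_a
    ]
    # Nearest neighbor in A for every descriptor in B.
    nearest_ba = [
        min(range(len(desc_a)), key=lambda i: dist(desc_a[i], b)) for b in desc_b
    ]
    # Keep only pairs that agree in both directions.
    return [(i, j) for i, j in enumerate(nearest_ab) if nearest_ba[j] == i]


# Toy 8-bit descriptors, purely illustrative.
frame_a = [0b11110000, 0b00001111, 0b10101010]
frame_b = [0b00001110, 0b11110001, 0b11111111]
matches = mutual_nearest_neighbors(frame_a, frame_b)
```

The two-way check discards one-sided matches (here the third descriptor of `frame_a`, whose best candidate already prefers another descriptor), which is the usual motivation for the criterion; the brute-force search is quadratic and only meant for small examples.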

arXiv Open Access 2025
fCrit: A Visual Explanation System for Furniture Design Creative Support

Vuong Nguyen, Gabriel Vigliensoni

We introduce fCrit, a dialogue-based AI system designed to critique furniture design with a focus on explainability. Grounded in reflective learning and formal analysis, fCrit employs a multi-agent architecture informed by a structured design knowledge base. We argue that explainability in the arts should not only make AI reasoning transparent but also adapt to the ways users think and talk about their designs. We demonstrate how fCrit supports this process by tailoring explanations to users' design language and cognitive framing. This work contributes to Human-Centered Explainable AI (HCXAI) in creative practice, advancing domain-specific methods for situated, dialogic, and visually grounded AI support.

en cs.HC, cs.AI
DOAJ Open Access 2024
Un nuevo contrato cronosocial

Javier Bassas Vila, Raquel Friera

This text presents the Instituto del Tiempo Suspendido (Institute of Suspended Time), an artistic and philosophical project that proposes alternatives to the politics of time in our societies. To do so, it relies mainly on two strategies. On the one hand, it uses contemporary art as an undisciplined discipline in which spaces and times not subject to current neoliberal politics can be assembled. On the other hand, it draws on philosophical and political reflection to contest the relationship between power and time through concepts such as chrononormativity, chronodiversity, temporal regime, and indeterminate time. At the end of the text, a link is provided for carrying out a “temporal trial” (juicio temporal).

Fine Arts, Visual arts
arXiv Open Access 2024
"Confrontation or Acceptance": Understanding Novice Visual Artists' Perception towards AI-assisted Art Creation

Shuning Zhang, Shixuan Li

The rise of Generative Artificial Intelligence (G-AI) has transformed the creative arts landscape by producing novel artwork, while at the same time raising ethical concerns. While previous studies have addressed these concerns from technical and societal viewpoints, there is a lack of discussion from an HCI perspective, especially considering the community's perception and the visual artists as human factors. Our study investigates G-AI's impact on visual artists and their relationship with G-AI to inform HCI research. We conducted semi-structured interviews with 20 novice visual artists from a university art college with G-AI courses and practices. Our findings reveal (1) the misconception and the evolving adoption of visual artists, (2) the miscellaneous opinions of the society on visual artists' creative work, and (3) the co-existence of confrontation and collaboration between visual artists and G-AI. We explore future HCI research opportunities to address these issues.

en cs.HC
arXiv Open Access 2024
Are Colors Quanta of Light for Human Vision? A Quantum Cognition Study of Visual Perception

Jonito Aerts Arguëlles

We show that colors are light quanta for human visual perception in a similar way as photons are light quanta for physical measurements of light waves. Our result relies on the identification in the quantum measurement process itself of the warping mechanism which is characteristic of human perception. This warping mechanism makes stimuli classified into the same category perceived as more similar, while stimuli classified into different categories are perceived as more different. In the quantum measurement process, the warping takes place between the pure states, which play the role played for human perception by the stimuli, and the density states after decoherence, which play the role played for human perception by the percepts. We use the natural metric for pure states, namely the normalized Fubini-Study metric, to measure distances between pure states, and the natural metric for density states, namely the normalized trace-class metric, to measure distances between density states. We then show that when pure states lie within a well-defined region surrounding an eigenstate, the quantum measurement, namely the process of decoherence, contracts the distance between these pure states, while the reverse happens for pure states lying in a well-defined region between two eigenstates, for which the quantum measurement causes a dilation. We elaborate as an example the situation of a two-dimensional quantum measurement described by the Bloch model and apply it to the situation of two colors 'Light' and 'Dark'. We argue that this analogy of warping, on the one hand in human perception and on the other hand in the quantum measurement process, makes colors quanta of light for human vision.
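
The contraction and dilation described in the abstract can be checked numerically in a two-level (Bloch) toy model. This is a rough sketch under assumptions that are not taken from the paper: states are restricted to one meridian of the Bloch sphere with real amplitudes, the Fubini-Study angle is normalized by its maximum π/2, the trace distance is taken as half the trace norm (so its maximum is 1), and decoherence is modeled as simply dropping the off-diagonal terms in the eigenbasis. All function names are hypothetical.

```python
import math


def pure_state(theta):
    """Real qubit amplitudes for polar angle theta on the Bloch sphere."""
    return (math.cos(theta / 2), math.sin(theta / 2))


def fubini_study_normalized(theta1, theta2):
    """Fubini-Study angle arccos|<psi|phi>|, divided by its maximum pi/2."""
    a, b = pure_state(theta1), pure_state(theta2)
    overlap = abs(a[0] * b[0] + a[1] * b[1])
    return math.acos(min(1.0, overlap)) / (math.pi / 2)


def decohered_trace_distance(theta1, theta2):
    """Trace distance between the diagonal density states left after decoherence."""
    p1 = pure_state(theta1)[0] ** 2  # probability of the first eigenstate
    p2 = pure_state(theta2)[0] ** 2
    # For diagonal 2x2 states, half the trace norm of the difference is |p1 - p2|.
    return abs(p1 - p2)


# Two states close to an eigenstate (theta near 0): expect contraction.
near_eigenstate = (0.1, 0.3)
# Two states midway between the eigenstates (theta near pi/2): expect dilation.
between_eigenstates = (math.pi / 2 - 0.1, math.pi / 2 + 0.1)

for label, (t1, t2) in [("near eigenstate", near_eigenstate),
                        ("between eigenstates", between_eigenstates)]:
    before = fubini_study_normalized(t1, t2)
    after = decohered_trace_distance(t1, t2)
    print(f"{label}: before={before:.4f} after={after:.4f}")
```

Under these assumptions both pairs start at the same pure-state distance, but the pair near the eigenstate ends up closer after decoherence and the pair between the eigenstates ends up farther apart, which is the qualitative warping the abstract describes.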

en q-bio.NC, cs.AI
arXiv Open Access 2024
LogicVista: Multimodal LLM Logical Reasoning Benchmark in Visual Contexts

Yijia Xiao, Edward Sun, Tianyu Liu et al.

We propose LogicVista, an evaluation benchmark that assesses the integrated logical reasoning capabilities of multimodal large language models (MLLMs) in visual contexts. Recent advancements in MLLMs have demonstrated various fascinating abilities, from crafting poetry based on an image to performing mathematical reasoning. However, there is still a lack of systematic evaluation of MLLMs' proficiency in logical reasoning tasks, which are essential for activities like navigation and puzzle-solving. Thus we evaluate general logical cognition abilities across 5 logical reasoning tasks encompassing 9 different capabilities, using a sample of 448 multiple-choice questions. Each question is annotated with the correct answer and the human-written reasoning behind the selection, enabling both open-ended and multiple-choice evaluation. A total of 8 MLLMs are comprehensively evaluated using LogicVista. Code and data are available at https://github.com/Yijia-Xiao/LogicVista.

en cs.AI, cs.CL
arXiv Open Access 2024
Epsilon-VAE: Denoising as Visual Decoding

Long Zhao, Sanghyun Woo, Ziyu Wan et al.

In generative modeling, tokenization simplifies complex data into compact, structured representations, creating a more efficient, learnable space. For high-dimensional visual data, it reduces redundancy and emphasizes key features for high-quality generation. Current visual tokenization methods rely on a traditional autoencoder framework, where the encoder compresses data into latent representations, and the decoder reconstructs the original input. In this work, we offer a new perspective by proposing denoising as decoding, shifting from single-step reconstruction to iterative refinement. Specifically, we replace the decoder with a diffusion process that iteratively refines noise to recover the original image, guided by the latents provided by the encoder. We evaluate our approach by assessing both reconstruction (rFID) and generation quality (FID), comparing it to state-of-the-art autoencoding approaches. By adopting iterative reconstruction through diffusion, our autoencoder, namely Epsilon-VAE, achieves high reconstruction quality, which in turn enhances downstream generation quality by 22% at the same compression rates or provides 2.3x inference speedup through increasing compression rates. We hope this work offers new insights into integrating iterative generation and autoencoding for improved compression and generation.

en cs.CV, cs.AI
arXiv Open Access 2024
Connections Beyond Data: Exploring Homophily With Visualizations

Poorna Talkad Sukumar, Maurizio Porfiri, Oded Nov

Homophily refers to the tendency of individuals to associate with others who are similar to them in characteristics such as race, ethnicity, age, gender, or interests. In this paper, we investigate if individuals exhibit racial homophily when viewing visualizations, using mass shooting data in the United States as the example topic. We conducted a crowdsourced experiment (N=450) where each participant was shown a visualization displaying the counts of mass shooting victims, highlighting the counts for one of three racial groups (White, Black, or Hispanic). Participants were assigned to view visualizations highlighting their own race or a different race to assess the influence of racial concordance on changes in affect (emotion) and attitude towards gun control. While we did not find evidence of homophily, the results showed a significant negative shift in affect across all visualization conditions. Notably, political ideology significantly impacted changes in affect, with more liberal views correlating with a more negative affect change. Our findings underscore the complexity of reactions to mass shooting visualizations and suggest that future research should consider various methodological improvements to better assess homophily effects.

arXiv Open Access 2024
Diffusion-Based Visual Art Creation: A Survey and New Perspectives

Bingyuan Wang, Qifeng Chen, Zeyu Wang

The integration of generative AI in visual art has revolutionized not only how visual content is created but also how AI interacts with and reflects the underlying domain knowledge. This survey explores the emerging realm of diffusion-based visual art creation, examining its development from both artistic and technical perspectives. We structure the survey into three phases: data feature and framework identification, detailed analyses using a structured coding process, and open-ended prospective outlooks. Our findings reveal how artistic requirements are transformed into technical challenges and highlight the design and application of diffusion-based methods within visual art creation. We also provide insights into future directions from technical and synergistic perspectives, suggesting that the confluence of generative AI and art has shifted the creative paradigm and opened up new possibilities. By summarizing the development and trends of this emerging interdisciplinary area, we aim to shed light on the mechanisms through which AI systems emulate and, possibly, enhance human capacities in artistic perception and creativity.

en cs.AI, cs.CV
arXiv Open Access 2024
Replication in Visual Diffusion Models: A Survey and Outlook

Wenhao Wang, Yifan Sun, Zongxin Yang et al.

Visual diffusion models have revolutionized the field of creative AI, producing high-quality and diverse content. However, they inevitably memorize training images or videos, subsequently replicating their concepts, content, or styles during inference. This phenomenon raises significant concerns about privacy, security, and copyright within generated outputs. In this survey, we provide the first comprehensive review of replication in visual diffusion models, marking a novel contribution to the field by systematically categorizing the existing studies into unveiling, understanding, and mitigating this phenomenon. Specifically, unveiling mainly refers to the methods used to detect replication instances. Understanding involves analyzing the underlying mechanisms and factors that contribute to this phenomenon. Mitigation focuses on developing strategies to reduce or eliminate replication. Beyond these aspects, we also review papers focusing on its real-world influence. For instance, in the context of healthcare, replication is critically worrying due to privacy concerns related to patient data. Finally, the paper concludes with a discussion of the ongoing challenges, such as the difficulty in detecting and benchmarking replication, and outlines future directions including the development of more robust mitigation techniques. By synthesizing insights from diverse studies, this paper aims to equip researchers and practitioners with a deeper understanding at the intersection between AI technology and social good. We release this project at https://github.com/WangWenhao0716/Awesome-Diffusion-Replication.

en cs.CV, cs.AI
DOAJ Open Access 2023
Paveikslų ciklas Šv. Augustino gyvenimas

Rima Valinčiūtė-Varnė

The article analyzes the previously unstudied cycle of paintings Šv. Augustino gyvenimas (The Life of St. Augustine). The cycle is attributed, and a hypothesis about its commissioner is put forward. As the research revealed, the cycle once comprised ten paintings: four entered the Kaunas Archdiocese Museum in 2000, one was found in a private collection, the whereabouts of four are unknown, and one is irretrievably lost. The inscriptions accompanying the images made it possible to identify the cycle's graphic prototypes, the scenes depicted, and their written sources, and to examine the iconography systematically. Because the cycle adorned the Great Hall (Didžioji aula) of the Samogitian Priest Seminary in Kaunas in the first half of the 20th century, the article identifies for the first time the exact location of this representative hall, which existed from 1895 to 1957, and the changes it underwent, as recorded together with the painting cycle in archival photographs.

Visual arts, History of the arts
arXiv Open Access 2023
What is in a Text-to-Image Prompt: The Potential of Stable Diffusion in Visual Arts Education

Nassim Dehouche, Kullathida Dehouche

Text-to-Image artificial intelligence (AI) recently saw a major breakthrough with the release of Dall-E and its open-source counterpart, Stable Diffusion. These programs allow anyone to create original visual art pieces by simply providing descriptions in natural language (prompts). Using a sample of 72,980 Stable Diffusion prompts, we propose a formalization of this new medium of art creation and assess its potential for teaching the history of art, aesthetics, and technique. Our findings indicate that Text-to-Image AI has the potential to revolutionize the way art is taught, offering new, cost-effective possibilities for experimentation and expression. However, it also raises important questions about the ownership of artistic works. As more and more art is created using these programs, it will be crucial to establish new legal and economic models to protect the rights of artists.

en cs.HC
arXiv Open Access 2023
CausalVLR: A Toolbox and Benchmark for Visual-Linguistic Causal Reasoning

Yang Liu, Weixing Chen, Guanbin Li et al.

We present CausalVLR (Causal Visual-Linguistic Reasoning), an open-source toolbox containing a rich set of state-of-the-art causal relation discovery and causal inference methods for various visual-linguistic reasoning tasks, such as VQA, image/video captioning, medical report generation, model generalization and robustness, etc. These methods have been included in the toolbox with PyTorch implementations on NVIDIA computing systems. It not only includes training and inference code, but also provides model weights. We believe this toolbox is by far the most complete visual-linguistic causal reasoning toolbox. We hope that the toolbox and benchmark can serve the growing research community by providing a flexible toolkit to re-implement existing methods and develop their own new causal reasoning methods. Code and models are available at https://github.com/HCPLab-SYSU/CausalVLR. The project is under active development by HCP-Lab's contributors and we will keep this document updated.

en cs.CV
S2 Open Access 2022
The role of arts-based curricula in professional identity formation: results of a qualitative analysis of learner’s written reflections

James Aluri, J. Ker, Bonnie K. Marr et al.

Background: Professional identity formation is an important aspect of medical education that can be difficult to translate into formal curricula. The role of arts and humanities programs in fostering professional identity formation remains understudied. Analyzing learners’ written reflections, we explore the relationship between an arts-based course and themes of professional identity formation. Materials and methods: Two cohorts of learners participated in a 5-day online course featuring visual arts-based group activities. Both cohorts responded to a prompt with written reflections at the beginning and end of the course. Using a thematic analysis method, we qualitatively analyzed one set of reflections from each cohort. Results: Themes included the nature of the good life; fulfilling, purposeful work; entering the physician role; exploration of emotional experience; and personal growth. Reflections written at the end of the course engaged significantly with art, including literature, poetry, lyrics, and film. One student disclosed a mental illness in their reflection. Conclusions: Our qualitative analysis of reflections written during a visual arts-based course found several themes related to professional identity formation. Such arts-based courses can also enrich learners’ reflections and provide a space for learners to be vulnerable. Practice points: Arts-based courses can support learners’ professional identity formation. Reflection themes related to professional identity formation included entering the physician role, fulfilling clinical work, and personal growth. At the end of the course, learners’ reflections included significant engagement with art. Reflective writing in small, arts-based learning communities can provide space for learners to be vulnerable.

24 citations en Medicine
S2 Open Access 2021
The Value of Active Arts Engagement on Health and Well-Being of Older Adults: A Nation-Wide Participatory Study

B. Groot, L. de Kock, Yosheng Liu et al.

An emerging body of research indicates that active arts engagement can enhance older adults’ health and experienced well-being, but scientific evidence is still fragmented. There is a research gap in understanding arts engagement grounded in a multidimensional conceptualization of the value of health and well-being from older participants’ perspectives. This Dutch nation-wide study aimed to explore the broader value of arts engagement on older people’s perceived health and well-being in 18 participatory arts-based projects (dance, music, singing, theater, visual arts, video, and spoken word) for community-dwelling older adults and those living in long term care facilities. In this study, we followed a participatory design with narrative- and arts-based inquiry. We gathered micro-narratives from older people and their (in)formal caregivers (n = 470). The findings demonstrate that arts engagement, according to participants, resulted in (1) positive feelings, (2) personal and artistic growth, and (3) increased meaningful social interactions. This study concludes that art-based practices promote older people’s experienced well-being and increase the quality of life of older people. This study emphasizes the intrinsic value of arts engagement and has implications for research and evaluation of arts engagement.

45 citations en Medicine
DOAJ Open Access 2022
The Cult and Images of Saint Rose of Lima in Lithuania

Rūta Janonienė

Isabel Flores de Oliva (1586–1617), generally known as Saint Rose of Mary or Saint Rose of Lima, became the first and, until today, the most venerated Catholic saint of Latin America. Almost immediately after her death, this Dominican of the Third Order, who died young and won fame for her highly ascetic life, various virtues and immovable faith, developed a cult following in Peru. Soon enough the cult had spread across South America and Europe. The topic of Saint Rose of Lima has been addressed in a great many publications that appeared in various (primarily Spanish-speaking) countries, but in Lithuania, the devotion to and images of this saint have not received specific research attention. By referring to the published sources and manuscripts, as well as the surviving ecclesiastical artworks, the author of this paper aims to discuss in more detail how information about the life and personality of Saint Rose of Lima was disseminated in the Grand Duchy of Lithuania, what information about Latin America, its culture and people was conveyed in these sources, whether it was reflected in the iconography of the saint, and if so, how.

Visual arts, History of the arts
arXiv Open Access 2022
Sound Adversarial Audio-Visual Navigation

Yinfeng Yu, Wenbing Huang, Fuchun Sun et al.

The audio-visual navigation task requires an agent to find a sound source in a realistic, unmapped 3D environment by utilizing egocentric audio-visual observations. Existing audio-visual navigation works assume a clean environment that solely contains the target sound, which, however, would not be suitable in most real-world applications due to unexpected sound noise or intentional interference. In this work, we design an acoustically complex environment in which, besides the target sound, there exists a sound attacker playing a zero-sum game with the agent. More specifically, the attacker can move and change the volume and category of the sound to hinder the agent in finding the sounding object, while the agent tries to dodge the attack and navigate to the goal under the intervention. Under certain constraints on the attacker, we can improve the robustness of the agent towards unexpected sound attacks in audio-visual navigation. For better convergence, we develop a joint training mechanism by employing the property of a centralized critic with decentralized actors. Experiments on two real-world 3D scan datasets, Replica and Matterport3D, verify the effectiveness and the robustness of the agent trained under our designed environment when transferred to the clean environment or the one containing sound attackers with random policy. Project: https://yyf17.github.io/SAAVN.

en cs.SD, cs.CV
arXiv Open Access 2022
Visual-based Positioning and Pose Estimation

Somnuk Phon-Amnuaisuk, Ken T. Murata, La-Or Kovavisaruch et al.

Recent advances in deep learning and computer vision offer an excellent opportunity to investigate high-level visual analysis tasks such as human localization and human pose estimation. Although the performance of human localization and human pose estimation has significantly improved in recent reports, they are not perfect, and erroneous localization and pose estimation can be expected among video frames. Studies on the integration of these techniques into a generic pipeline that is robust to noise introduced from those errors are still lacking. This paper fills that gap. We explored and developed two working pipelines that suited the visual-based positioning and pose estimation tasks. Analyses of the proposed pipelines were conducted on a badminton game. We showed that the concept of tracking by detection could work well, and errors in position and pose could be effectively handled by a linear interpolation technique using information from nearby frames. The results showed that the Visual-based Positioning and Pose Estimation could deliver position and pose estimations with good spatial and temporal resolutions.
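
The idea of repairing bad detections by linear interpolation from nearby frames can be sketched as below. This is only an illustration of the general technique, assuming a track stored as a list of `(x, y)` positions with `None` marking frames where detection failed; the name `interpolate_track` and this data layout are hypothetical and not taken from the paper.

```python
def interpolate_track(track):
    """Fill None entries by linear interpolation between the nearest
    valid frames on either side. Gaps at the start or end of the track
    have only one neighbor and are left as None."""
    filled = list(track)
    known = [i for i, point in enumerate(track) if point is not None]
    for left, right in zip(known, known[1:]):
        (x0, y0), (x1, y1) = track[left], track[right]
        for i in range(left + 1, right):
            t = (i - left) / (right - left)  # fraction of the way across the gap
            filled[i] = (x0 + t * (x1 - x0), y0 + t * (y1 - y0))
    return filled


# Frames 1 and 2 are missing between two good detections; frame 4 has no right neighbor.
track = [(0.0, 0.0), None, None, (3.0, 6.0), None]
repaired = interpolate_track(track)
```

The same function could be applied per joint for pose keypoints; how the paper decides that a detection is erroneous in the first place is not covered by this sketch.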

en cs.CV
arXiv Open Access 2022
1st Place Solution to NeurIPS 2022 Challenge on Visual Domain Adaptation

Daehan Kim, Minseok Seo, YoungJin Jeon et al.

The Visual Domain Adaptation (VisDA) 2022 Challenge calls for an unsupervised domain adaptive model in semantic segmentation tasks for industrial waste sorting. In this paper, we introduce the SIA_Adapt method, which incorporates several methods for domain adaptive models. The core of our method is the transferable representation from large-scale pre-training. In this process, we choose a network architecture that differs from the state-of-the-art for domain adaptation. After that, self-training using pseudo-labels helps to make the initial adaptation model more adaptable to the target domain. Finally, the model soup scheme helped to improve the generalization performance in the target domain. Our method SIA_Adapt achieves 1st place in the VisDA2022 challenge. The code is available at https://github.com/DaehanKim-Korea/VisDA2022_Winner_Solution.
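
A model soup combines several fine-tuned models by averaging their weights rather than their predictions. The sketch below shows the simplest (uniform) variant, as an assumption: the abstract does not say which soup variant SIA_Adapt uses, and the function name `uniform_soup` and the representation of parameters as plain dictionaries of float lists are hypothetical stand-ins for real framework state dictionaries.

```python
def uniform_soup(state_dicts):
    """Average parameters key by key across models that share the same
    architecture (same keys, same parameter lengths)."""
    if not state_dicts:
        raise ValueError("need at least one model")
    n = len(state_dicts)
    soup = {}
    for key in state_dicts[0]:
        # Group the k-th value of this parameter from every model together.
        columns = zip(*(sd[key] for sd in state_dicts))
        soup[key] = [sum(values) / n for values in columns]
    return soup


# Two toy "models" with one weight vector and one bias each.
model_a = {"w": [1.0, 2.0], "b": [0.0]}
model_b = {"w": [3.0, 4.0], "b": [1.0]}
soup = uniform_soup([model_a, model_b])
```

Because only one averaged model is kept, inference cost stays the same as for a single model, which is the usual argument for a soup over a prediction ensemble.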

en cs.CV, cs.AI

Page 9 of 168897