Expression
P. Ekman
Expression is a term that has been applied to music in an extraordinarily wide range of senses; philosophers and musicologists alike remain sharply divided on what musical expression is and how it works. This essay gives a historical account of the ways in which music in and of itself—that is, music independent of words—has been regarded as expressive over the past three hundred years by surveying influences spanning Enlightenment theories of rhetoric and the imitation of emotions, Romantic aesthetics of subjectivity and self-expression, and more contemporary versions of contour, arousal, persona, and intransitive theories of expression. This review reveals the lack of a single satisfactory explanatory framework for musical expression, even while confirming musical expression’s enduring importance as “the soul of music.”
1105 citations
en
Psychology
Reliability and validity testing of the Michigan Hand Outcomes Questionnaire.
K. Chung, Matthew S. Pillsbury, M. Walters
et al.
Old wine in new bottles or novel challenges: a critical analysis of empirical studies of user experience
Javier A. Bargas-Avila, Kasper Hornbæk
691 citations
en
Computer Science
Spatio-temporal Evolution Characteristics and Multi-scenario Simulation of Carbon Sink Spatial Patterns in Towns of Southern Jiangsu
Lingyun FAN, Yuxuan TANG, Yongbing TIAN
Objective: Collaboratively promoting carbon reduction, pollution reduction, green expansion, and growth, while maintaining national ecological security, has become a key focus area in national strategic planning in recent years. However, rapid urbanization has compressed carbon sink spaces such as forest land and grassland, leading to a significant decline in environmental quality and soil carbon sink capacity. Existing research on carbon sink spaces is limited and mostly concentrated on regional scales with superior ecological environments and rich vegetation cover; research on rapidly urbanizing areas with poor carbon sink backgrounds is relatively scarce. This study therefore analyzes the spatio-temporal evolution characteristics of carbon sinks in highly urbanized areas with weak carbon sink backgrounds and conducts multi-scenario simulation analysis, in order to provide a basis for optimizing the spatial layout of the country and formulating differentiated carbon sink enhancement strategies, thus contributing to maintaining regional ecological security and achieving high-quality development. Methods: This study focuses on the southern Jiangsu region, where urbanization is predominant and carbon sink spaces face intense competition with construction spaces. At the township scale, the carbon sink space is analyzed and classified using specific criteria. The PLUS (patch-generating land use simulation) model is used to analyze the spatio-temporal evolution characteristics of carbon sink space from 2000 to 2020, and differentiated strategies are proposed based on simulation results for various future development scenarios. Results: This study focuses on the town carbon sink space in rapidly urbanizing areas, revealing that the evolution of carbon sink space in such areas is the result of the combined effects of natural factors, policy interventions, and town development stages.
It has important theoretical and practical value for optimizing the national spatial pattern and achieving carbon neutrality goals, providing scientific support for the green transformation of new urbanization in developed areas. The research indicates four results. 1) From 2000 to 2020, the loss of carbon sink spaces in the southern Jiangsu region was not uniform but highly concentrated in high-value carbon sink areas. 2) The structure of carbon sink spaces in the southern Jiangsu region at the town scale did not completely disintegrate due to urbanization; instead, it demonstrated remarkable stability. 3) Simulation results show that different intensities of carbon sink protection measures can promote the expansion of high-quality carbon sink spaces. However, a "carbon sink enhancement scenario" is not necessarily optimal. The pursuit of a "high carbon sink coefficient" alone should be avoided, and the risk of ecological function simplification needs to be guarded against. 4) Towns in the southern Jiangsu region can be categorized into three types: those with high carbon sink capacity, high carbon sink potential, and high construction intensity. Most towns have maintained their original carbon sink spatial structure characteristics under the three simulated scenarios, and in the future they can focus on exploring the potential of existing space to protect and optimize carbon sink space. For sensitive town types (those with easily fluctuating carbon sink quality, those prone to carbon sink function degradation, and those with clearly degraded carbon sink functions), more targeted strategies should be implemented based on the specific risk types. Conclusion: Through multi-scenario simulation, the evolution patterns of future urban carbon sink spaces can be analyzed and predicted, offering references for the protection and optimization of urban carbon sink spaces in rapidly urbanizing areas.
This study can scientifically analyze the dynamic evolution laws of regional carbon sink space and explore optimization paths. It has significant theoretical and practical value for optimizing territorial spatial patterns and achieving carbon neutrality goals, thus providing scientific support for the green transformation of new urbanization. The method can be widely applied to similar studies on town ecological space planning related to carbon sink enhancement, and can help other cities, especially those undergoing rapid urbanization, achieve coordinated and sustainable development of the ecological environment and the economy.
Aesthetics of cities. City planning and beautifying, Architectural drawing and design
The author is dead, but what if they never lived? A reception experiment on Czech AI- and human-authored poetry
Anna Marklová, Ondřej Vinš, Martina Vokáčová
et al.
Large language models are increasingly capable of producing creative texts, yet most studies on AI-generated poetry focus on English, a language that dominates training data. In this paper, we examine the perception of AI- and human-written Czech poetry. We ask whether Czech native speakers are able to identify it and how they judge it aesthetically. Participants performed at chance level when guessing authorship (45.8% correct on average), indicating that Czech AI-generated poems were largely indistinguishable from human-written ones. Aesthetic evaluations revealed a strong authorship bias: when participants believed a poem was AI-generated, they rated it less favorably, even though AI poems were in fact rated equally or more favorably than human ones on average. The logistic regression model showed that the more participants liked a poem, the less likely they were to assign its authorship correctly. Familiarity with poetry or literary background had no effect on recognition accuracy. Our findings show that AI can convincingly produce poetry even in a morphologically complex, low-resource (with respect to the training data of AI models) Slavic language such as Czech. The results suggest that readers' beliefs about authorship and the aesthetic evaluation of the poem are interconnected.
Imprinto: Enhancing Infrared Inkjet Watermarking for Human and Machine Perception
Martin Feick, Xuxin Tang, Raul Garcia-Martin
et al.
Hybrid paper interfaces leverage augmented reality to combine the desired tangibility of paper documents with the affordances of interactive digital media. Typically, virtual content can be embedded through direct links (e.g., QR codes); however, this impacts the aesthetics of the paper print and limits the available visual content space. To address this problem, we present Imprinto, an infrared inkjet watermarking technique that allows for invisible content embeddings only by using off-the-shelf IR inks and a camera. Imprinto was established through a psychophysical experiment, studying how much IR ink can be used while remaining invisible to users regardless of background color. We demonstrate that we can detect invisible IR content through our machine learning pipeline, and we developed an authoring tool that optimizes the amount of IR ink on the color regions of an input document for machine and human detectability. Finally, we demonstrate several applications, including augmenting paper documents and objects.
A Survey on Ordinal Regression: Applications, Advances and Prospects
Jinhong Wang, Jintai Chen, Jian Liu
et al.
Ordinal regression refers to classifying object instances into ordinal categories. Ordinal regression is crucial for applications in various areas like facial age estimation, image aesthetics assessment, and even cancer staging, due to its capability to utilize ordered information effectively. More importantly, it also enhances model interpretation by considering category order, aiding the understanding of data trends and causal relationships. Despite significant recent progress, challenges remain, and further investigation of ordinal regression techniques and applications is essential to guide future research. In this survey, we present a comprehensive examination of advances and applications of ordinal regression. By introducing a systematic taxonomy, we classify the pertinent techniques and applications into three well-defined categories based on different strategies and objectives: Continuous Space Discretization, Distribution Ordering Learning, and Ambiguous Instance Delving. This categorization enables a structured exploration of diverse insights in ordinal regression problems, providing a framework for a more comprehensive understanding and evaluation of this field and its related applications. To the best of our knowledge, this is the first systematic survey of ordinal regression, which lays a foundation for future research in this fundamental and generic domain.
Track, Inpaint, Resplat: Subject-driven 3D and 4D Generation with Progressive Texture Infilling
Shuhong Zheng, Ashkan Mirzaei, Igor Gilitschenski
Current 3D/4D generation methods are usually optimized for photorealism, efficiency, and aesthetics. However, they often fail to preserve the semantic identity of the subject across different viewpoints. Adapting generation methods with one or a few images of a specific subject (also known as Personalization or Subject-driven generation) allows generating visual content that aligns with the identity of the subject. However, personalized 3D/4D generation is still largely underexplored. In this work, we introduce TIRE (Track, Inpaint, REsplat), a novel method for subject-driven 3D/4D generation. It takes an initial 3D asset produced by an existing 3D generative model as input and uses video tracking to identify the regions that need to be modified. Then, we adopt a subject-driven 2D inpainting model for progressively infilling the identified regions. Finally, we resplat the modified 2D multi-view observations back to 3D while still maintaining consistency. Extensive experiments demonstrate that our approach significantly improves identity preservation in 3D/4D generation compared to state-of-the-art methods. Our project website is available at https://zsh2000.github.io/track-inpaint-resplat.github.io/.
Structured Captions Improve Prompt Adherence in Text-to-Image Models (Re-LAION-Caption 19M)
Nicholas Merchant, Haitz Sáez de Ocáriz Borde, Andrei Cristian Popescu
et al.
We argue that generative text-to-image models often struggle with prompt adherence due to the noisy and unstructured nature of large-scale datasets like LAION-5B. This forces users to rely heavily on prompt engineering to elicit desirable outputs. In this work, we propose that enforcing a consistent caption structure during training can significantly improve model controllability and alignment. We introduce Re-LAION-Caption 19M, a high-quality subset of Re-LAION-5B, comprising 19 million 1024x1024 images with captions generated by a Mistral 7B Instruct-based LLaVA-Next model. Each caption follows a four-part template: subject, setting, aesthetics, and camera details. We fine-tune PixArt-Σ and Stable Diffusion 2 using both structured and randomly shuffled captions, and show that structured versions consistently yield higher text-image alignment scores using visual question answering (VQA) models. The dataset is publicly available at https://huggingface.co/datasets/supermodelresearch/Re-LAION-Caption19M.
Homenagem à professora Staël de Alvarenga Pereira Costa
Heraldo Ferreira Borges, Gisela Barcellos de Souza, Maria Cristina Villefort Teixeira
et al.
Aesthetics of cities. City planning and beautifying, Urban groups. The city. Urban sociology
Strange Case of the Missing Assistant
Roger Emmerson
This essay has its origin in issues related to the alleged failure in 2018 of the Royal Incorporation of Architects in Scotland (RIAS), a chapter of the Royal Institute of British Architects (RIBA), to comply with the requirements of its Royal Charter. This engendered a wider inquiry into the operations and legislation affecting architecture, enacted by and on the RIBA since its founding in 1834, in the promulgation and regulation of the interrelated processes of architectural education, registration and practice in the United Kingdom (UK). Pivotal moments in the late nineteenth-century debate on the "professional or artist-architect", the enactment of the Registration Acts of the 1930s, the 1958 Oxford Conference on Architectural Education, the Monopolies legislation of the 1970s, the 1997 Registration Act, and the 2003 European Union Directive amended in 2015 are all put to the test by the contemporaneous structures and procedures of the office and the architect's work. Texts on and by architects relating to education, registration and practice, as well as the various reports made by the RIBA, the UK Architects Registration Council (ARCUK) and subsequently the Architects Registration Board (ARB), are referenced and their impact on the architectural profession is assessed. The essay will seek to demonstrate that decisions made and directions chosen at those pivotal points and the lack of understanding of the links between them have left endemic structural flaws in education, registration and practice in the UK unresolved.
Aesthetics of cities. City planning and beautifying, Anthropology
Towards Holistic Language-video Representation: the language model-enhanced MSR-Video to Text Dataset
Yuchen Yang, Yingxuan Duan
A more robust and holistic language-video representation is the key to pushing video understanding forward. Despite improvements in training strategies, the quality of language-video datasets has received less attention. The current plain and simple text descriptions and the visual-only focus of language-video tasks result in limited capacity in real-world natural language video retrieval tasks, where queries are much more complex. This paper introduces a method to automatically enhance video-language datasets, making them more modality- and context-aware for more sophisticated representation learning needs, hence helping all downstream tasks. Our multifaceted video captioning method captures entities, actions, speech transcripts, aesthetics, and emotional cues, providing detailed and correlating information from the text side to the video side for training. We also develop an agent-like strategy using language models to generate high-quality, factual textual descriptions, reducing human intervention and enabling scalability. The method's effectiveness in improving language-video representation is evaluated through text-video retrieval using the MSR-VTT dataset and several multi-modal retrieval models.
Machine Apophenia: The Kaleidoscopic Generation of Architectural Images
Alexey Tikhonov, Dmitry Sinyavin
This study investigates the application of generative artificial intelligence in architectural design. We present a novel methodology that combines multiple neural networks to create an unsupervised and unmoderated stream of unique architectural images. Our approach is grounded in a conceptual framework called machine apophenia. We hypothesize that neural networks, trained on diverse human-generated data, internalize aesthetic preferences and tend to produce coherent designs even from random inputs. The methodology involves an iterative process of image generation, description, and refinement, resulting in captioned architectural postcards automatically shared on several social media platforms. Evaluation and ablation studies show improvement in both technical and aesthetic metrics of the resulting images at each step.
MAGID: An Automated Pipeline for Generating Synthetic Multi-modal Datasets
Hossein Aboutalebi, Hwanjun Song, Yusheng Xie
et al.
Development of multimodal interactive systems is hindered by the lack of rich, multimodal (text, images) conversational data, which is needed in large quantities for LLMs. Previous approaches augment textual dialogues with retrieved images, posing privacy, diversity, and quality constraints. In this work, we introduce Multimodal Augmented Generative Images Dialogues (MAGID), a framework to augment text-only dialogues with diverse and high-quality images. Subsequently, a diffusion model is applied to craft corresponding images, ensuring alignment with the identified text. Finally, MAGID incorporates an innovative feedback loop between an image description generation module (textual LLM) and image quality modules (addressing aesthetics, image-text matching, and safety), that work in tandem to generate high-quality and multi-modal dialogues. We compare MAGID to other SOTA baselines on three dialogue datasets, using automated and human evaluation. Our results show that MAGID is comparable to or better than baselines, with significant improvements in human evaluation, especially against retrieval baselines where the image database is small.
Learning Multi-dimensional Human Preference for Text-to-Image Generation
Sixian Zhang, Bohan Wang, Junqiang Wu
et al.
Current metrics for text-to-image models typically rely on statistical metrics which inadequately represent the real preferences of humans. Although recent work attempts to learn these preferences via human-annotated images, it reduces the rich tapestry of human preference to a single overall score. However, preference results vary when humans evaluate images with respect to different aspects. Therefore, to learn multi-dimensional human preferences, we propose the Multi-dimensional Preference Score (MPS), the first multi-dimensional preference scoring model for the evaluation of text-to-image models. The MPS introduces a preference condition module on top of the CLIP model to learn these diverse preferences. It is trained on our Multi-dimensional Human Preference (MHP) Dataset, which comprises 918,315 human preference choices across four dimensions (i.e., aesthetics, semantic alignment, detail quality and overall assessment) on 607,541 images. The images are generated by a wide range of recent text-to-image models. The MPS outperforms existing scoring methods across 3 datasets in 4 dimensions, making it a promising metric for evaluating and improving text-to-image generation.
DiffChat: Learning to Chat with Text-to-Image Synthesis Models for Interactive Image Creation
Jiapeng Wang, Chengyu Wang, Tingfeng Cao
et al.
We present DiffChat, a novel method to align Large Language Models (LLMs) to "chat" with prompt-as-input Text-to-Image Synthesis (TIS) models (e.g., Stable Diffusion) for interactive image creation. Given a raw prompt/image and a user-specified instruction, DiffChat can effectively make appropriate modifications and generate the target prompt, which can be leveraged to create a high-quality target image. To achieve this, we first collect an instruction-following prompt engineering dataset named InstructPE for the supervised training of DiffChat. Next, we propose a reinforcement learning framework with feedback from three core criteria for image creation, i.e., aesthetics, user preference, and content integrity. It involves an action-space dynamic modification technique to obtain more relevant positive samples and harder negative samples during off-policy sampling. Content integrity is also introduced into the value estimation function to further improve the produced images. Our method outperforms baseline models and strong competitors in both automatic and human evaluations, which demonstrates its effectiveness.
A Translation Analysis of Kahlil Gibran’s “The Broken Wings” to “Sayap-Sayap Patah” by Sapardi Djoko Damono and M. Ruslan Shiddieq
Tira Nur Fitria
This research analyzes the translation between two translators in translating Kahlil Gibran’s work entitled “The Broken Wings” into “Sayap-Sayap Patah”. This research is descriptive qualitative. The analysis shows that the two translators have different styles of translating. The first translator, Sapardi Djoko Damono, chose a more formal and direct style in his translation, while the second translator, M. Ruslan Shiddieq, tended to use a more expressive style and prioritized artistic impressions. Sapardi Djoko Damono maintained fidelity to the original text by translating it literally, while M. Ruslan Shiddieq carried out free interpretation and created more metaphorical and creative sentences. These differences in approach result in translations that have different nuances and expressions, reflecting the translator's style and preferences. Despite their differences, both translators, Sapardi Djoko Damono and M. Ruslan Shiddieq, effectively convey the essence of Kahlil Gibran's "The Broken Wings" in Indonesian, albeit through distinct stylistic lenses. To be a proficient translator of literary works, one must possess a mastery of both the source and target languages, a deep understanding of literature, a keen sense of aesthetics, and a strong sense of literature. Literature with its emotional depth and linguistic beauty resonates deeply with readers. Skilled translators like Sapardi Djoko Damono and M. Ruslan Shiddieq bring forth its lyrical and profound qualities in their translations. Overall, the translation of literary works requires not only linguistic proficiency but also creative skill and cultural sensitivity. Each translator brings their unique style and approach, shaping the reader's experience of the translated work. As readers, we can appreciate and explore the diverse interpretations offered by different translations, enriching our understanding and enjoyment of literature in translation.
Style2Fab: Functionality-Aware Segmentation for Fabricating Personalized 3D Models with Generative AI
Faraz Faruqi, Ahmed Katary, Tarik Hasic
et al.
With recent advances in Generative AI, it is becoming easier to automatically manipulate 3D models. However, current methods tend to apply edits to models globally, which risks compromising the intended functionality of the 3D model when fabricated in the physical world. For example, modifying functional segments in 3D models, such as the base of a vase, could break the original functionality of the model, thus causing the vase to fall over. We introduce a method for automatically segmenting 3D models into functional and aesthetic elements. This method allows users to selectively modify aesthetic segments of 3D models, without affecting the functional segments. To develop this method we first create a taxonomy of functionality in 3D models by qualitatively analyzing 1000 models sourced from a popular 3D printing repository, Thingiverse. With this taxonomy, we develop a semi-automatic classification method to decompose 3D models into functional and aesthetic elements. We propose a system called Style2Fab that allows users to selectively stylize 3D models without compromising their functionality. We evaluate the effectiveness of our classification method compared to human-annotated data, and demonstrate the utility of Style2Fab with a user study to show that functionality-aware segmentation helps preserve model functionality.
Compliant actuators that mimic biological muscle performance with applications in a highly biomimetic robotic arm
Haosen Yang, Guowu Wei, Lei Ren
et al.
This paper endeavours to bridge the existing gap in muscular actuator design for ligament-skeletal-inspired robots, thereby fostering the evolution of these robotic systems. We introduce two novel compliant actuators, namely the Internal Torsion Spring Compliant Actuator (ICA) and the External Spring Compliant Actuator (ECA), and present a comparative analysis against the previously conceived Magnet Integrated Soft Actuator (MISA) through computational and experimental results. These actuators, employing a motor-tendon system, emulate biological muscle-like forms, enhancing artificial muscle technology. A robotic arm application inspired by the skeletal ligament system is presented. Experiments demonstrate satisfactory power in tasks like lifting dumbbells (peak power: 36 W), playing table tennis (end-effector speed: 3.2 m/s), and door opening, without compromising biomimetic aesthetics. Compared to other linear stiffness serial elastic actuators (SEAs), ECA and ICA exhibit high power-to-volume (361 × 10^3 W/m^3) and power-to-mass (111.6 W/kg) ratios respectively, endorsing the biomimetic design's promise in robotic development.
Art and the science of generative AI: A deeper dive
Ziv Epstein, Aaron Hertzmann, Laura Herman
et al.
A new class of tools, colloquially called generative AI, can produce high-quality artistic media for visual arts, concept art, music, fiction, literature, video, and animation. The generative capabilities of these tools are likely to fundamentally alter the creative processes by which creators formulate ideas and put them into production. As creativity is reimagined, so too may be many sectors of society. Understanding the impact of generative AI - and making policy decisions around it - requires new interdisciplinary scientific inquiry into culture, economics, law, algorithms, and the interaction of technology and creativity. We argue that generative AI is not the harbinger of art's demise, but rather is a new medium with its own distinct affordances. In this vein, we consider the impacts of this new medium on creators across four themes: aesthetics and culture, legal questions of ownership and credit, the future of creative work, and impacts on the contemporary media ecosystem. Across these themes, we highlight key research questions and directions to inform policy and beneficial uses of the technology.