Results for "Dancing"
Showing 20 of ~199,828 results · from CrossRef, DOAJ, arXiv, Semantic Scholar
Ademola Gbenga Eluwole
Africa's global image continues to be shaped by deficit narratives that foreground insecurity, poverty and crisis, while comparable or greater violence and structural fragilities in Western societies rarely redefine their global brand. This asymmetry reflects inequalities in narrative power and media ownership rather than objective reality (Hall 1997; Nwobodo 2025). In response, African cultural producers are increasingly turning to popular culture to tell alternative stories. This article examines how Nigerian popular music, especially Afrobeats and related genres, functions as a tool for rebranding Africa by producing cultural counter-narratives and generating soft power. Drawing on theories of representation, postcolonial self-narration, nation branding and soft power, the article advances a conceptual argument supported by illustrative case examples, including Burna Boy's Ye and African Giant, Wizkid and Tems' Essence, and Rema's Calm Down. The article engages recent studies on Afrobeats' global popularity, its representation of African values, and its challenge to stereotypes (Akombo 2019; Ashibel 2023; Obasi & Msughter 2023; Serang 2024; Yusif 2024). It argues that Nigerian music serves as a form of "sonic diplomacy," offering images of African modernity, creativity, and resilience. These images challenge the narrow, crisis-driven view of Africa often portrayed in the media (Adichie 2009). However, the article also points out significant issues, such as the influence of class, the portrayal of gender, and the risk of oversimplifying Africa's complexity. In conclusion, it stresses the need for a stronger connection between the music industry, cultural policies, and media initiatives led by Africans. It suggests that while Nigerian music alone may not be enough, it can play an important role in reshaping Africa's global image and strengthening the continent's cultural influence.
Sangjune Park, Inhyeok Choi, Donghyeon Soon et al.
Dance is a form of human motion characterized by emotional expression and communication, playing a role in various fields such as music, virtual reality, and content creation. Existing methods for dance generation often fail to adequately capture the inherently sequential, rhythmical, and music-synchronized characteristics of dance. In this paper, we propose MambaDance, a new dance generation approach that leverages a Mamba-based diffusion model. Mamba, well suited to handling long and autoregressive sequences, is integrated into our two-stage diffusion architecture, substituting for the off-the-shelf Transformer. Additionally, considering the critical role of musical beats in dance choreography, we propose a Gaussian-based beat representation to explicitly guide the decoding of dance sequences. Experiments on the AIST++ and FineDance datasets across sequence lengths show that, compared with previous methods, our approach generates plausible dance movements that reflect these essential characteristics consistently, from short to long dances. Additional qualitative results and demo videos are available at https://vision3d-lab.github.io/mambadance.
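For intuition, here is a minimal sketch of how a Gaussian-based beat representation could be constructed: each beat timestamp contributes a Gaussian bump to a per-frame activation curve that can then condition the decoder. The function and parameter names (beat_times, fps, sigma) are illustrative assumptions, not the paper's actual implementation.

import numpy as np

def gaussian_beat_curve(beat_times, num_frames, fps=30.0, sigma=0.1):
    """Turn discrete beat timestamps into a smooth per-frame activation curve.

    Each beat contributes a Gaussian bump centred on its timestamp, so the
    decoder receives a soft, continuous signal rather than sparse impulses.
    """
    frame_times = np.arange(num_frames) / fps          # time of each motion frame
    curve = np.zeros(num_frames)
    for t in beat_times:                               # superimpose one Gaussian per beat
        curve += np.exp(-0.5 * ((frame_times - t) / sigma) ** 2)
    return np.clip(curve, 0.0, 1.0)                    # keep the guidance signal bounded

# Example: beats every 0.5 s for a 4-second clip at 30 fps
beats = np.arange(0.0, 4.0, 0.5)
curve = gaussian_beat_curve(beats, num_frames=120)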
Jaekwon Im, Natalia Polouliakh, Taketo Akama
Dance-to-music generation aims to generate music that is aligned with dance movements. Existing approaches typically rely on body motion features extracted from a single human dancer and limited dance-to-music datasets, which restrict their performance and applicability to real-world scenarios involving multiple dancers and non-human dancers. In this paper, we propose PF-D2M, a universal diffusion-based dance-to-music generation model that incorporates visual features extracted from dance videos. PF-D2M is trained with a progressive training strategy that effectively addresses data scarcity and generalization challenges. Both objective and subjective evaluations show that PF-D2M achieves state-of-the-art performance in dance-music alignment and music quality.
Anna Püschel
This article gives insight into my artistic research project Stimming a Space, which explores “stimming”—auto‐regulative behaviour—as a means to make and hold space for neurodivergent individuals within the art world. The umbrella term “neurodiversity” describes developmental conditions such as autism, ADHD, or dyspraxia. Neurodivergent individuals stim extensively due to frequently occurring sensory issues. I argue that parallel to movements of “queering” public spaces that result in increasing safety for all gender identities, “cripping” spaces through adjusting them to neurodivergent needs can be beneficial to everyone in a competitive capitalist environment such as the art world: from education to art spaces and academia that host an increasing number of artistic researchers. Diversity in the art world is not a luxury but a need. Despite recent motions for inclusion, disabled artists still encounter “ableism,” othering, and exclusion. Lack of diversity perpetuates stereotypes and mental obstacles. From an “emic” perspective, the research project Stimming a Space approaches neurodiversity as a disability affecting the entire body instead of solely focusing on symptoms such as speech impairment or executive dysfunction. As a counterweight to much literature that problematises stimming as “disruptive behaviour,” this autoethnographic research approaches it as a performative tool and claims that exploring the entire “bodymind” and embracing stimming as a radical act of self‐care can enrich current research on neurodiversity. Opening up the art world is not a mere act of solidarity—lived inclusion makes it more accessible and safer for everyone.
Philippe Colantoni, Rafique Ahmed, Prashant Ghimire et al.
The accuracy and efficiency of human body pose estimation depend on the quality of the data to be processed and on the particularities of these data. To demonstrate how dance videos can challenge pose estimation techniques, we first proposed a new 3D human body pose estimation pipeline that combines up-to-date techniques and methods that had not yet been used in dance analysis. Second, we performed extensive experiments on dance video archives and used visual analytic tools to evaluate the impact of several data parameters on human body pose estimation. Our results are publicly available for research at https://www.couleur.org/articles/arXiv-1-2025/
Hyunyoung Han, Jongwon Jang, Kitaeg Shim et al.
We propose AfforDance, an augmented reality (AR)-based dance learning system that generates personalized learning content and enhances learning through visual affordances. Our system converts user-selected dance videos into interactive learning experiences by integrating 3D reference avatars, audio synchronization, and adaptive visual cues that guide movement execution. This work contributes to personalized dance education by offering an adaptable, user-centered learning interface.
Sebeesh Jacob
The cultural renaissance in 20th-century India has fostered an aesthetic integration of contemplative mysticism with popular religious practices, influencing various artistic and theological movements. This paper examines Christian artist Joy Elamkunnapuzha’s use of Indian classical and mythical elements in his religious artworks, particularly in two North Indian churches. These intercultural icons, which incorporate symbols from Hindu traditions like mandalas and mudras, have been central to the worship practices of local Catholic communities for over three decades. Through ritual ethnography, the study reveals how these visual representations mediate ritual affectivity and communal imagination, impacting identity formation and spiritual engagement in a multi-religious context. Respondents—including worshippers, ministers, and religious students—attest to the transformative impact of these images, as they negotiate between Christian metaphors and Hindu aesthetic traditions. The research is grounded in practical theology, liturgical theology, and ritual studies, contributing to the works of Indian Christian cultural activists like Jyoti Sahi. By exploring the creative dynamics of visual approach, visual appeal, and visual affinity within worship spaces, the study elucidates the complex processes of meaning making through symbolic mediation in interreligious environments.
Zhizhen Zhou, Yejing Huo, Guoheng Huang et al.
Music-conditioned dance generation is a novel and challenging motion generation task: given a piece of music and seed motions, the goal is to generate natural dance movements for the subsequent music. Transformer-based methods face challenges in time-series prediction tasks related to human movements and music due to their difficulty in capturing nonlinear relationships and temporal aspects. This can lead to issues like joint deformation, role deviation, floating, and inconsistencies in the dance movements generated in response to the music. In this paper, we propose a Quaternion-Enhanced Attention Network (QEAN) for visual dance synthesis from a quaternion perspective, which consists of a Spin Position Embedding (SPE) module and a Quaternion Rotary Attention (QRA) module. First, SPE embeds position information into self-attention in a rotational manner, leading to better learning of the features of movement and audio sequences and an improved understanding of the connection between music and dance. Second, QRA represents and fuses 3D motion features and audio features as a series of quaternions, enabling the model to better learn the temporal coordination of music and dance under the complex temporal cycle conditions of dance generation. Finally, we conducted experiments on the AIST++ dataset, and the results show that our approach achieves better and more robust performance in generating accurate, high-quality dance movements. Our source code and the dataset are available at https://github.com/MarasyZZ/QEAN and https://google.github.io/aistplusplus_dataset, respectively.
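As a rough illustration of embedding position "in a rotational manner", the sketch below applies a generic rotary position embedding (in the spirit of RoPE) to query and key features before the attention dot product; this is a stand-in under assumed shapes, not the paper's SPE or QRA modules.

import torch

def rotary_embed(x, base=10000.0):
    """Apply a rotary (rotation-based) position embedding to a tensor of
    shape (seq_len, dim), with dim even. Pairs of channels are rotated by
    an angle that grows with the position index, so relative offsets show
    up as phase differences inside the attention dot product."""
    seq_len, dim = x.shape
    half = dim // 2
    freqs = base ** (-torch.arange(half, dtype=torch.float32) / half)    # per-pair frequencies
    angles = torch.arange(seq_len, dtype=torch.float32)[:, None] * freqs # (seq_len, half)
    cos, sin = angles.cos(), angles.sin()
    x1, x2 = x[:, :half], x[:, half:]
    return torch.cat([x1 * cos - x2 * sin, x1 * sin + x2 * cos], dim=-1)

q = torch.randn(64, 32)   # e.g. 64 motion frames, 32-dim query features
k = torch.randn(64, 32)
attn_logits = rotary_embed(q) @ rotary_embed(k).T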
Qiaochu Huang, Xu He, Boshi Tang et al.
Dance generation, as a branch of human motion generation, has attracted increasing attention. Recently, a few works have attempted to enhance dance expressiveness, which includes genre matching, beat alignment, and dance dynamics, from certain aspects. However, the enhancement is quite limited, as they lack comprehensive consideration of the aforementioned three factors. In this paper, we propose ExpressiveBailando, a novel dance generation method designed to generate expressive dances while concurrently taking all three factors into account. Specifically, we mitigate the issue of speed homogenization by incorporating frequency information into the VQ-VAE, thus improving dance dynamics. Additionally, we integrate music style information by extracting genre- and beat-related features with a pre-trained music model, hence achieving improvements in the other two factors. Extensive experimental results demonstrate that our proposed method can generate dances with high expressiveness and outperforms existing methods both qualitatively and quantitatively.
Sifei Li, Weiming Dong, Yuxin Zhang et al.
The seamless integration of music with dance movements is essential for communicating the artistic intent of a dance piece. This alignment also significantly improves the immersive quality of gaming experiences and animation productions. Although there has been remarkable advancement in creating high-fidelity music from textual descriptions, current methodologies mainly focus on modulating overall characteristics such as genre and emotional tone. They often overlook the nuanced management of temporal rhythm, which is indispensable in crafting music for dance, since it intricately aligns the musical beats with the dancers' movements. Recognizing this gap, we propose an encoder-based textual inversion technique to augment text-to-music models with visual control, facilitating personalized music generation. Specifically, we develop dual-path rhythm-genre inversion to effectively integrate the rhythm and genre of a dance motion sequence into the textual space of a text-to-music model. Contrary to traditional textual inversion methods, which directly update text embeddings to reconstruct a single target object, our approach utilizes separate rhythm and genre encoders to obtain text embeddings for two pseudo-words, adapting to the varying rhythms and genres. We collect a new dataset called In-the-wild Dance Videos (InDV) and demonstrate that our approach outperforms state-of-the-art methods across multiple evaluation metrics. Furthermore, our method is able to adapt to changes in tempo and effectively integrates with the inherent text-guided generation capability of the pre-trained model. Our source code and demo videos are available at https://github.com/lsfhuihuiff/Dance-to-music_Siggraph_Asia_2024
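To make the idea of two pseudo-word embeddings concrete, the following sketch maps a motion sequence through separate rhythm and genre encoders to two embeddings that could be spliced into a frozen text-to-music model's prompt embeddings. The encoder choices (GRUs) and dimensions are assumptions for illustration, not the paper's architecture.

import torch
import torch.nn as nn

class DualPathInversion(nn.Module):
    """Map a dance-motion sequence to two pseudo-word embeddings, one for
    rhythm and one for genre, that can be spliced into the text embeddings
    of a frozen text-to-music model (a sketch, not the paper's exact nets)."""
    def __init__(self, motion_dim=72, text_dim=768):
        super().__init__()
        self.rhythm_enc = nn.GRU(motion_dim, text_dim, batch_first=True)
        self.genre_enc = nn.GRU(motion_dim, text_dim, batch_first=True)

    def forward(self, motion):                      # motion: (B, T, motion_dim)
        _, h_r = self.rhythm_enc(motion)
        _, h_g = self.genre_enc(motion)
        return h_r[-1], h_g[-1]                     # (B, text_dim) each

# The two embeddings stand in for pseudo-words such as "<rhythm*>" and "<genre*>"
# inside a prompt like "a <genre*> track with a <rhythm*> beat".
model = DualPathInversion()
motion = torch.randn(2, 120, 72)
e_rhythm, e_genre = model(motion)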
Zixuan Wang, Jiayi Li, Xiaoyu Qin et al.
Synthesizing camera movements from music and dance is highly challenging due to the conflicting requirements and complexities of dance cinematography. Unlike human movements, which are always continuous, dance camera movements involve both continuous sequences of variable lengths and sudden drastic changes that simulate the switching of multiple cameras. In previous works, however, every camera frame is treated equally, which causes jittering and requires unavoidable smoothing in post-processing. To solve these problems, we propose to integrate animators' dance cinematography knowledge by formulating this task as a three-stage process: keyframe detection, keyframe synthesis, and tween function prediction. Following this formulation, we design a novel end-to-end dance camera synthesis framework, DanceCamAnimator, which imitates human animation procedures and shows powerful keyframe-based controllability with variable lengths. Extensive experiments on the DCM dataset demonstrate that our method surpasses previous baselines quantitatively and qualitatively. Code will be available at https://github.com/Carmenw1203/DanceCamAnimator-Official.
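As a simple illustration of the third stage, tween function prediction amounts to choosing how camera parameters are interpolated between two keyframes; the sketch below uses a fixed smoothstep easing curve over a 6-DoF camera, whereas the paper predicts the tween behaviour. The parameter layout and names are illustrative.

import numpy as np

def ease_in_out(t):
    """Smoothstep easing curve on [0, 1]; one of many possible tween functions."""
    return 3 * t**2 - 2 * t**3

def tween_camera(key_a, key_b, num_frames, ease=ease_in_out):
    """Interpolate camera parameters between two keyframes with an easing
    curve, yielding continuous in-between frames (a hard camera cut would
    instead simply start a new keyframe pair)."""
    ts = np.linspace(0.0, 1.0, num_frames)
    return np.stack([key_a + ease(t) * (key_b - key_a) for t in ts])

# Example: interpolate a 6-DoF camera (xyz position + Euler angles) over 30 frames
key_a = np.array([0.0, 1.5, 4.0, 0.0, 0.0, 0.0])
key_b = np.array([1.0, 1.6, 3.5, 0.0, 10.0, 0.0])
frames = tween_camera(key_a, key_b, 30)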
Kaixing Yang, Xulong Tang, Haoyu Wu et al.
Dance generation is crucial and challenging, particularly in domains like dance performance and virtual gaming. In the current body of literature, most methodologies focus on Solo Music2Dance. While there are efforts directed towards Group Music2Dance, these often suffer from a lack of coherence, resulting in aesthetically poor dance performances. Thus, we introduce CoheDancers, a novel framework for Music-Driven Interactive Group Dance Generation. CoheDancers aims to enhance the coherence of group dance generation by decomposing it into three key aspects: synchronization, naturalness, and fluidity. Correspondingly, we develop a Cycle-Consistency-based Dance Synchronization strategy to foster music-dance correspondence, an Auto-Regressive Exposure Bias Correction strategy to enhance the fluidity of the generated dances, and an Adversarial Training strategy to augment the naturalness of the group dance output. Collectively, these strategies enable CoheDancers to produce highly coherent group dances of superior quality. Furthermore, to establish better benchmarks for Group Music2Dance, we construct the most diverse and comprehensive open-source dataset to date, I-Dancers, featuring rich dancer interactions, and create comprehensive evaluation metrics. Experimental evaluations on I-Dancers and other extant datasets substantiate that CoheDancers achieves state-of-the-art performance. Code will be released.
Li Siyao, Tianpei Gu, Zhitao Yang et al.
We introduce a novel task within the field of 3D dance generation, termed dance accompaniment, which necessitates the generation of responsive movements from a dance partner, the "follower", synchronized with the lead dancer's movements and the underlying musical rhythm. Unlike existing solo or group dance generation tasks, a duet dance scenario entails a heightened degree of interaction between the two participants, requiring delicate coordination in both pose and position. To support this task, we first build a large-scale and diverse duet interactive dance dataset, DD100, by recording about 117 minutes of professional dancers' performances. To address the challenges inherent in this task, we propose a GPT-based model, Duolando, which autoregressively predicts the subsequent tokenized motion conditioned on the coordinated information of the music, the leader's and the follower's movements. To further enhance the GPT's capabilities of generating stable results on unseen conditions (music and leader motions), we devise an off-policy reinforcement learning strategy that allows the model to explore viable trajectories from out-of-distribution samplings, guided by human-defined rewards. Based on the collected dataset and proposed method, we establish a benchmark with several carefully designed metrics.
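To illustrate the general shape of such a conditional GPT, the sketch below autoregressively scores follower motion tokens given per-step conditioning features (standing in for music and leader-motion information). The vocabulary size, dimensions, and module names are assumptions; this is not Duolando's actual architecture or its reinforcement learning stage.

import torch
import torch.nn as nn

class FollowerGPT(nn.Module):
    """Sketch of an autoregressive decoder that predicts the next follower
    motion token given conditioning features and the follower tokens
    generated so far (illustrative, not Duolando itself)."""
    def __init__(self, vocab=512, dim=256, cond_dim=128):
        super().__init__()
        self.tok = nn.Embedding(vocab, dim)
        self.cond = nn.Linear(cond_dim, dim)
        layer = nn.TransformerEncoderLayer(dim, nhead=4, batch_first=True)
        self.backbone = nn.TransformerEncoder(layer, num_layers=2)
        self.head = nn.Linear(dim, vocab)

    def forward(self, follower_tokens, cond):       # cond: per-step music + leader features
        x = self.tok(follower_tokens) + self.cond(cond)
        T = x.size(1)
        mask = torch.triu(torch.full((T, T), float("-inf")), diagonal=1)  # causal mask
        return self.head(self.backbone(x, mask=mask))                    # next-token logits

model = FollowerGPT()
tokens = torch.randint(0, 512, (1, 16))
cond = torch.randn(1, 16, 128)
logits = model(tokens, cond)                        # (1, 16, 512)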
Ronghui Li, Youliang Zhang, Yachao Zhang et al.
Humans perform a variety of interactive motions, among which duet dance is one of the most challenging. However, existing human motion generative models are still unable to generate high-quality interactive motions, especially for duet dance. On the one hand, this is due to the lack of large-scale, high-quality datasets; on the other hand, it arises from an incomplete representation of interactive motion and the lack of fine-grained optimization of interactions. To address these challenges, we propose InterDance, a large-scale duet dance dataset that significantly enhances motion quality, data scale, and the variety of dance genres. Built upon this dataset, we propose a new motion representation that can accurately and comprehensively describe interactive motion. We further introduce a diffusion-based framework with an interaction refinement guidance strategy to progressively optimize the realism of interactions. Extensive experiments demonstrate the effectiveness of our dataset and algorithm.
Hye-Won Hwang
Bosheng Qin, Wentao Ye, Qifan Yu et al.
The rising demand for creating lifelike avatars in the digital realm has led to an increased need for generating high-quality human videos guided by textual descriptions and poses. We propose Dancing Avatar, designed to fabricate human motion videos driven by poses and textual cues. Our approach employs a pretrained T2I diffusion model to generate each video frame in an autoregressive fashion. The crux of the innovation lies in our use of the T2I diffusion model to produce video frames successively while preserving contextual relevance. We surmount the hurdles of maintaining human character and clothing consistency across varying poses, along with upholding the background's continuity amidst diverse human movements. To ensure consistent human appearance across the entire video, we devise an intra-frame alignment module. This module assimilates text-guided synthesized human character knowledge into the pretrained T2I diffusion model, synergizing insights from ChatGPT. For preserving background continuity, we put forth a background alignment pipeline, combining insights from Segment Anything and image inpainting techniques. Furthermore, we propose an inter-frame alignment module that draws inspiration from an autoregressive pipeline to augment temporal consistency between adjacent frames, where the preceding frame guides the synthesis process of the current frame. Comparisons with state-of-the-art methods demonstrate that Dancing Avatar generates human videos of markedly superior quality in terms of human and background fidelity as well as temporal coherence.
Haipeng Fang, Zhihao Sun, Ziyao Huang et al.
The advancement of generative AI has extended to the realm of Human Dance Generation, demonstrating superior generative capacities. However, current methods still exhibit deficiencies in achieving spatiotemporal consistency, resulting in artifacts like ghosting, flickering, and incoherent motion. In this paper, we present Dance-Your-Latents, a framework that makes latents dance coherently following motion flow to generate consistent dance videos. First, considering that each constituent element moves within a confined space, we introduce spatial-temporal subspace-attention blocks that decompose the global space into a combination of regular subspaces and efficiently model the spatiotemporal consistency within these subspaces. This module enables each patch to attend to adjacent areas, mitigating the excessive dispersion of long-range attention. Furthermore, observing that each body part's movement is guided by pose control, we design motion-flow-guided subspace alignment and restoration, which enables the attention to be computed on irregular subspaces along the motion flow. Experimental results on the TikTok dataset demonstrate that our approach significantly enhances the spatiotemporal consistency of the generated videos.
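As a rough sketch of attention restricted to regular subspaces, the code below splits a spatio-temporal latent grid into non-overlapping windows and runs self-attention only within each window, so every patch attends to its local neighbourhood. The window size, shapes, and use of nn.MultiheadAttention are illustrative assumptions, not the paper's exact block.

import torch
import torch.nn as nn

def subspace_attention(latents, attn, window=4):
    """Sketch of attention over regular spatio-temporal subspaces: the latent
    grid (B, T, H, W, C) is split into non-overlapping window x window patches,
    and self-attention runs over the frames and patches inside each window only
    (an assumption-laden stand-in for the paper's subspace-attention block)."""
    B, T, H, W, C = latents.shape
    x = latents.reshape(B, T, H // window, window, W // window, window, C)
    x = x.permute(0, 2, 4, 1, 3, 5, 6)                     # group by window
    x = x.reshape(-1, T * window * window, C)              # one sequence per window
    out, _ = attn(x, x, x)
    out = out.reshape(B, H // window, W // window, T, window, window, C)
    return out.permute(0, 3, 1, 4, 2, 5, 6).reshape(B, T, H, W, C)

attn = nn.MultiheadAttention(embed_dim=64, num_heads=4, batch_first=True)
latents = torch.randn(2, 8, 16, 16, 64)
out = subspace_attention(latents, attn)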
Tan Wang, Linjie Li, Kevin Lin et al.
Generative AI has made significant strides in computer vision, particularly in text-driven image/video synthesis (T2I/T2V). Despite these notable advancements, human-centric content synthesis, such as realistic dance generation, remains challenging. Current methodologies, primarily tailored for human motion transfer, encounter difficulties when confronted with real-world dance scenarios (e.g., social media dance), which require generalization across a wide spectrum of poses and intricate human details. In this paper, we depart from the traditional paradigm of human motion transfer and emphasize two additional critical attributes for the synthesis of human dance content in social media contexts: (i) Generalizability: the model should be able to generalize beyond generic human viewpoints as well as unseen human subjects, backgrounds, and poses; (ii) Compositionality: it should allow for the seamless composition of seen/unseen subjects, backgrounds, and poses from different sources. To address these challenges, we introduce DisCo, which includes a novel model architecture with disentangled control to improve the compositionality of dance synthesis, and an effective human attribute pre-training for better generalizability to unseen humans. Extensive qualitative and quantitative results demonstrate that DisCo can generate high-quality human dance images and videos with diverse appearances and flexible motions. Code is available at https://disco-dance.github.io/.
Zixiang Zhou, Weiyuan Li, Baoyuan Wang
We propose MDSC (Music-Dance-Style Consistency), the first evaluation metric that assesses to what degree dance moves and music match. Existing metrics can only evaluate motion fidelity and diversity and the degree of rhythmic matching between music and dance; MDSC measures how stylistically correlated the generated dance motion sequences and the conditioning music sequences are. We found that directly measuring the embedding distance between motion and music is not an optimal solution, so we instead tackle this as a clustering problem. Specifically, 1) we pre-train a music encoder and a motion encoder; 2) we learn to map and align the motion and music embeddings in a joint space by jointly minimizing the intra-cluster distance and maximizing the inter-cluster distance; and 3) for evaluation, we encode the dance moves into embeddings and measure the intra-cluster and inter-cluster distances, as well as the ratio between them. We evaluate our metric on the results of several music-conditioned motion generation methods and, combined with a user study, find that the proposed metric is a robust measure of music-dance style correlation.
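The evaluation step can be pictured with a small sketch: given motion and music embeddings already mapped into a joint space and a style label per sample, compare the average within-style distance against the across-style distance and report their ratio. The distance choice and normalisation here are assumptions, not the paper's exact definition.

import numpy as np

def mdsc_style_ratio(motion_emb, music_emb, labels):
    """Sketch of a cluster-distance style score: for each motion embedding,
    measure its average distance to music embeddings of the same style
    (intra-cluster) and of other styles (inter-cluster), then return the
    ratio of the two means (lower = tighter style match)."""
    intra, inter = [], []
    for i, m in enumerate(motion_emb):
        d = np.linalg.norm(music_emb - m, axis=1)   # distance to every music embedding
        same = labels == labels[i]
        intra.append(d[same].mean())
        inter.append(d[~same].mean())
    return np.mean(intra) / np.mean(inter)

motion = np.random.randn(12, 64)
music = np.random.randn(12, 64)
labels = np.repeat(np.arange(3), 4)                  # three styles, four samples each
score = mdsc_style_ratio(motion, music, labels)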
Page 6 of 9992