Results for "Music"

Showing 20 of ~1,058,339 results · from CrossRef, arXiv, DOAJ, Semantic Scholar

S2 Open Access 2018
Bulk tissue cell type deconvolution with multi-subject single-cell expression reference

Xuran Wang, Jihwan Park, K. Suszták et al.

Knowledge of cell type composition in disease-relevant tissues is an important step towards the identification of cellular targets of disease. We present MuSiC, a method that utilizes cell-type-specific gene expression from single-cell RNA sequencing (RNA-seq) data to characterize cell type compositions from bulk RNA-seq data in complex tissues. By appropriately weighting genes that show cross-subject and cross-cell consistency, MuSiC enables the transfer of cell-type-specific gene expression information from one dataset to another. When applied to pancreatic islet and whole-kidney expression data in human, mouse, and rat, MuSiC outperformed existing methods, especially for tissues with closely related cell types. MuSiC enables the characterization of cellular heterogeneity in complex tissues for understanding disease mechanisms. As bulk tissue data are more easily accessible than single-cell RNA-seq data, MuSiC allows the vast amounts of disease-relevant bulk tissue RNA-seq data to be utilized for elucidating cell type contributions in disease. Bulk tissue RNA-seq data reveal transcriptomic profiles but mask the contributions of different cell types. Here, the authors develop a new method for estimating cell type proportions from bulk tissue RNA-seq data guided by a multi-subject single-cell expression reference.
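The deconvolution setup described above can be illustrated with a generic non-negative least-squares sketch: bulk expression is modeled as a signature matrix of cell-type-specific expression times a non-negative proportion vector. This is a minimal, hedged illustration with synthetic numbers, not MuSiC's actual gene-weighting scheme:

```python
import numpy as np

# Toy signature matrix: 200 genes x 3 cell types (all values synthetic).
rng = np.random.default_rng(0)
S = rng.uniform(0.1, 5.0, size=(200, 3))   # reference expression per cell type
p_true = np.array([0.6, 0.3, 0.1])         # ground-truth cell type proportions
bulk = S @ p_true                          # noiseless bulk mixture

def deconvolve(S, bulk, iters=5000):
    """Estimate proportions by projected gradient descent on ||S p - bulk||^2, p >= 0."""
    lr = 1.0 / np.linalg.norm(S.T @ S, 2)  # step size below the Lipschitz bound
    p = np.full(S.shape[1], 1.0 / S.shape[1])
    for _ in range(iters):
        grad = S.T @ (S @ p - bulk)
        p = np.clip(p - lr * grad, 0.0, None)  # project onto the non-negative orthant
    return p / p.sum()                          # normalize to proportions

p_hat = deconvolve(S, bulk)
print(np.round(p_hat, 3))  # ≈ [0.6 0.3 0.1]
```

With noiseless synthetic data the recovered proportions match the ground truth closely; real bulk data would add noise and require the kind of cross-subject gene weighting the paper introduces.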

757 citations en Medicine, Biology
arXiv Open Access 2025
Familiarizing with Music: Discovery Patterns for Different Music Discovery Needs

Marta Moscati, Darius Afchar, Markus Schedl et al.

Humans have a natural tendency to discover and explore. This tendency is reflected in data from streaming platforms as the amount of previously unknown content accessed by users. Additionally, in domains such as music streaming there is evidence that recommending novel content improves users' experience with the platform. Therefore, understanding users' discovery patterns, such as the extent to which and the way in which users access previously unknown content, is a topic of relevance for both the scientific community and the streaming industry, particularly the music industry. Previous works studied how music consumption differs for users of different traits and looked at the diversity, novelty, and consistency over time of users' music preferences. However, very little is known about how users discover and explore previously unknown music, and how this behavior differs for users with varying discovery needs. In this paper we bridge this gap by analyzing data from a survey answered by users of the major music streaming platform Deezer in combination with their streaming data. We first address questions regarding whether users who declare a higher interest in unfamiliar music listen to more diverse music, have more stable music preferences over time, and explore more music within the same time window, compared to those who declare a lower interest. We then investigate which types of music tracks users choose to listen to when they explore unfamiliar music, identifying clear patterns of popularity and genre representativeness that vary for users with different discovery needs. Our findings open up possibilities to infer users' interest in unfamiliar music from streaming data, as well as to develop recommender systems that guide users in exploring music in a more natural way.

en cs.IR, cs.HC
arXiv Open Access 2025
Music Arena: Live Evaluation for Text-to-Music

Yonghyun Kim, Wayne Chi, Anastasios N. Angelopoulos et al.

We present Music Arena, an open platform for scalable human preference evaluation of text-to-music (TTM) models. Soliciting human preferences via listening studies is the gold standard for evaluation in TTM, but these studies are expensive to conduct and difficult to compare, as study protocols may differ across systems. Moreover, human preferences might help researchers align their TTM systems or improve automatic evaluation metrics, but an open and renewable source of preferences does not currently exist. We aim to fill these gaps by offering *live* evaluation for TTM. In Music Arena, real-world users input text prompts of their choosing and compare outputs from two TTM systems, and their preferences are used to compile a leaderboard. While Music Arena follows recent evaluation trends in other AI domains, we also design it with key features tailored to music: an LLM-based routing system to navigate the heterogeneous type signatures of TTM systems, and the collection of *detailed* preferences including listening data and natural language feedback. We also propose a rolling data release policy with user privacy guarantees, providing a renewable source of preference data and increasing platform transparency. Through its standardized evaluation protocol, transparent data access policies, and music-specific features, Music Arena not only addresses key challenges in the TTM ecosystem but also demonstrates how live evaluation can be thoughtfully adapted to unique characteristics of specific AI domains. Music Arena is available at: https://music-arena.org . Preference data is available at: https://huggingface.co/music-arena .

en cs.SD, cs.AI
arXiv Open Access 2025
MuseCPBench: an Empirical Study of Music Editing Methods through Music Context Preservation

Yash Vishe, Eric Xue, Xunyi Jiang et al.

Music editing plays a vital role in modern music production, with applications in film, broadcasting, and game development. Recent advances in music generation models have enabled diverse editing tasks such as timbre transfer, instrument substitution, and genre transformation. However, many existing works overlook evaluating their ability to preserve musical facets that should remain unchanged during editing, a property we define as Music Context Preservation (MCP). While some studies do consider MCP, they adopt inconsistent evaluation protocols and metrics, leading to unreliable and unfair comparisons. To address this gap, we introduce the first MCP evaluation benchmark, MuseCPBench, which covers four categories of musical facets and enables comprehensive comparisons across five representative music editing baselines. Through systematic analysis along musical facets, methods, and models, we identify consistent preservation gaps in current music editing methods and provide insightful explanations. We hope our findings offer practical guidance for developing more effective and reliable music editing strategies with strong MCP capability.

en cs.SD, cs.AI
arXiv Open Access 2025
Music for All: Representational Bias and Cross-Cultural Adaptability of Music Generation Models

Atharva Mehta, Shivam Chauhan, Amirbek Djanibekov et al.

The advent of Music-Language Models has greatly enhanced the automatic music generation capability of AI systems, but they are also limited in their coverage of the musical genres and cultures of the world. We present a study of the datasets and research papers for music generation and quantify the bias and under-representation of genres. We find that only 5.7% of the total hours of existing music datasets come from non-Western genres, which naturally leads to disparate performance of the models across genres. We then investigate the efficacy of Parameter-Efficient Fine-Tuning (PEFT) techniques in mitigating this bias. Our experiments with two popular models -- MusicGen and Mustango, for two underrepresented non-Western music traditions -- Hindustani Classical and Turkish Makam music, highlight the promises as well as the non-triviality of cross-genre adaptation of music through small datasets, implying the need for more equitable baseline music-language models that are designed for cross-cultural transfer learning.

en cs.SD, cs.AI
arXiv Open Access 2025
Live Music Models

Lyria Team, Antoine Caillon, Brian McWilliams et al.

We introduce a new class of generative models for music called live music models that produce a continuous stream of music in real-time with synchronized user control. We release Magenta RealTime, an open-weights live music model that can be steered using text or audio prompts to control acoustic style. On automatic metrics of music quality, Magenta RealTime outperforms other open-weights music generation models, despite using fewer parameters and offering first-of-its-kind live generation capabilities. We also release Lyria RealTime, an API-based model with extended controls, offering access to our most powerful model with wide prompt coverage. These models demonstrate a new paradigm for AI-assisted music creation that emphasizes human-in-the-loop interaction for live music performance.

en cs.SD, cs.HC
arXiv Open Access 2025
SteerMusic: Enhanced Musical Consistency for Zero-shot Text-guided and Personalized Music Editing

Xinlei Niu, Kin Wai Cheuk, Jing Zhang et al.

Music editing is an important step in music production, which has broad applications, including game development and film production. Most existing zero-shot text-guided editing methods rely on pretrained diffusion models by involving forward-backward diffusion processes. However, these methods often struggle to preserve the musical content. Additionally, text instructions alone usually fail to accurately describe the desired music. In this paper, we propose two music editing methods that improve the consistency between the original and edited music by leveraging score distillation. The first method, SteerMusic, is a coarse-grained zero-shot editing approach using delta denoising score. The second method, SteerMusic+, enables fine-grained personalized music editing by manipulating a concept token that represents a user-defined musical style. SteerMusic+ allows for the editing of music into user-defined musical styles that cannot be achieved by the text instructions alone. Experimental results show that our methods outperform existing approaches in preserving both music content consistency and editing fidelity. User studies further validate that our methods achieve superior music editing quality.

en cs.SD, cs.MM
DOAJ Open Access 2025
A Mindfulness-Based App Intervention for Pregnant Women: Qualitative Evaluation of a Prototype Using Multiple Case Studies

Silvia Rizzi, Maria Chiara Pavesi, Alessia Moser et al.

Background: Pregnancy is a complex period characterized by significant transformations. How a woman adapts to these changes can affect her quality of life and psychological well-being. Recently developed digital solutions have assumed a crucial role in supporting the psychological well-being of pregnant women. However, these tools have mainly been developed for women who already present clinically relevant psychological symptoms or mental disorders. Objective: This study aimed to develop a mindfulness-based well-being intervention for all pregnant women that can be delivered electronically and guided by an online assistant with wide reach and dissemination. This paper describes the design and development process of a prototype technology-based mindfulness intervention for pregnant women, including the exploration phase, intervention content development, and iterative software development (including design, development, and formative evaluation of paper and low-fidelity prototypes). Methods: Design and development processes were iterative and performed in close collaboration with key stakeholders (N=15): domain experts, including mindfulness experts (n=2), communication experts (n=2), and psychologists (n=3), and target users, including pregnant women (n=2), mothers with young children (n=2), and midwives (n=4). User-centered and service design methods, such as interviews and usability testing, were included to ensure user involvement in each phase. Domain experts evaluated a paper prototype, while target users evaluated a low-fidelity prototype. Intervention content was developed by psychologists and mindfulness experts based on the Mindfulness-Based Childbirth and Parenting program and adjusted to an electronic format through multiple iterations with stakeholders. Results: An 8-session intervention in a prototype electronic format using text, audio, video, and images was designed. In general, the prototypes were evaluated positively by the users involved. The questionnaires showed that domain experts, for instance, positively evaluated chatbot-related aspects such as empathy and comprehensibility of the terms used, and rated the mindfulness traces present as supportive and functional. The target users found the content interesting and clear. However, both parties regarded the listening as not fully active. In addition, the interviews made it possible to gather useful suggestions for refining the intervention. Domain experts suggested incorporating auditory components alongside textual content or substituting text entirely with auditory or audiovisual formats. Debate surrounded the inclusion of background music in mindfulness exercises, with opinions divided on its potential to either distract or aid engagement. The target users proposed supplementing the app with some face-to-face meetings at crucial moments of the course, such as the beginning and the end. Conclusions: This study illustrates how user-centered and service design can be applied to identify and incorporate essential stakeholder aspects in the design and development process. Combined with evidence-based concepts, this process facilitated the development of a mindfulness intervention designed for the end users, in this case, pregnant women.

DOAJ Open Access 2025
From Waste to Art: A Study on Student Creativity and Creative Expression through Recycled Materials in Art Education

Sara Çebi

The research examines the use of waste materials as a teaching resource in art classes. Third-year students of the Department of Painting, Faculty of Fine Arts and Design, Trabzon University, collected and sorted textile waste randomly discarded into the environment and transformed these materials into artworks using the woodblock printing method in the school's printing workshop. In this study, which adopted exploratory, experimental, and descriptive research methods, 12 artworks were analyzed. The study reveals the effects of instructional resources on classroom atmosphere and student performance, shows that improper management of textile waste contributes to environmental pollution, and aims to draw attention to the effects of textile waste in order to increase environmental aesthetics and awareness in society. Within the scope of the Applied Workshop II course, students transformed textile waste collected from the environment into works of art with the woodblock printing method. Students were informed about the woodblock printing technique and the place of recycling in art, and in light of this information, they transformed fabric waste into artistic compositions. This process contributed to the students' practical application of their theoretical knowledge and to their environmental awareness. The works were designed according to the color element and the principle of balance. Artists use colors to describe and depict the subject. The principle of balance is important for a work to be clear and harmonious. Students were asked to create their compositions according to these elements. Students were encouraged to participate in comments and criticism, and a program was organized to discuss art production together.

Fine Arts, Music
DOAJ Open Access 2025
Daughter and disciple: on gender and male gaze in the Spanish media image of the composer Ann-Elise Hannikainen in the early 1970s

Markus Virtanen

This article explores the media representation of Finnish-born composer Ann-Elise Hannikainen in the Spanish media during the early 1970s, focusing on the gender dynamics and the influence of the male gaze on her public image. Despite the presence of numerous female composers in Spain at the time, Hannikainen’s and Valencia-based Matilde Salvador’s works were among the few by women featured by Spanish orchestras in the 1970s. This study aims to understand how Hannikainen’s gender intersected with various aspects of her identity, such as age, appearance, social class, family background, education, and nationality, in the critiques and other texts related to her orchestral piece Anerfálicas premiered in Valencia in 1973. The methodology employs resistant reading by Judith Fetterley to analyse how gender and the male gaze shaped the discourse around Hannikainen’s work, underscoring the necessity of a feminist perspective in musicology that acknowledges the contributions of women composers and challenges the traditional narratives of music history. Additionally, by contrasting Hannikainen’s media image with that of Salvador, the article reveals that Hannikainen’s gender not only shaped her public image through descriptions of her appearance and familial relations but also affected the depth of authorship and artistic integrity attributed to her work, often overshadowing her professional credentials and accomplishments. This gendered narrative extended to the way influential figures, such as Hannikainen’s teacher Ernesto Halffter, represented Hannikainen.

Music and books on Music, Music
arXiv Open Access 2024
Enriching Music Descriptions with a Finetuned-LLM and Metadata for Text-to-Music Retrieval

SeungHeon Doh, Minhee Lee, Dasaem Jeong et al.

Text-to-Music Retrieval, finding music based on a given natural language query, plays a pivotal role in content discovery within extensive music databases. To address this challenge, prior research has predominantly focused on a joint embedding of music audio and text, utilizing it to retrieve music tracks that exactly match descriptive queries related to musical attributes (e.g., genre, instrument) and contextual elements (e.g., mood, theme). However, users also articulate a need to explore music that shares similarities with their favorite tracks or artists, such as *I need a similar track to Superstition by Stevie Wonder*. To address these concerns, this paper proposes an improved Text-to-Music Retrieval model, denoted as TTMR++, which utilizes rich text descriptions generated with a finetuned large language model and metadata. To accomplish this, we obtained various types of seed text from several existing music tag and caption datasets and a knowledge graph dataset of artists and tracks. The experimental results show the effectiveness of TTMR++ in comparison to state-of-the-art music-text joint embedding models through a comprehensive evaluation involving various musical text queries.

en cs.SD, cs.IR
arXiv Open Access 2024
Music Grounding by Short Video

Zijie Xin, Minquan Wang, Jingyu Liu et al.

Adding proper background music helps complete a short video to be shared. Previous work tackles the task by video-to-music retrieval (V2MR), aiming to find the most suitable music track from a collection to match the content of a given query video. In practice, however, music tracks are typically much longer than the query video, necessitating (manual) trimming of the retrieved music to a shorter segment that matches the video duration. In order to bridge the gap between the practical need for music moment localization and V2MR, we propose a new task termed Music Grounding by Short Video (MGSV). To tackle the new task, we introduce a new benchmark, MGSV-EC, which comprises a diverse set of 53k short videos associated with 35k different music moments from 4k unique music tracks. Furthermore, we develop a new baseline method, MaDe, which performs both video-to-music matching and music moment detection within a unified end-to-end deep network. Extensive experiments on MGSV-EC not only highlight the challenging nature of MGSV but also set MaDe as a strong baseline.

en cs.MM
arXiv Open Access 2024
Flexible Control in Symbolic Music Generation via Musical Metadata

Sangjun Han, Jiwon Ham, Chaeeun Lee et al.

In this work, we present a demonstration of symbolic music generation, focusing on providing short musical motifs that serve as the central theme of the narrative. For generation, we adopt an autoregressive model that takes musical metadata as input and generates 4 bars of multitrack MIDI sequences. During training, we randomly drop tokens from the musical metadata to guarantee flexible control. This provides users with the freedom to select input types while maintaining generative performance, enabling greater flexibility in music composition. We validate the effectiveness of this strategy through experiments in terms of model capacity, musical fidelity, diversity, and controllability. Additionally, we scale up the model and compare it with other music generation models through a subjective test. Our results indicate its superiority in both control and music quality. We provide a URL link https://www.youtube.com/watch?v=-0drPrFJdMQ to our demonstration video.
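The random metadata dropout described in this abstract is a simple training-time trick: condition tokens are removed at random so the model learns to generate from any subset of controls. A minimal sketch (the token names are hypothetical, not the paper's actual vocabulary):

```python
import random

def drop_metadata_tokens(metadata, keep_prob=0.7, rng=None):
    """Keep each metadata token with probability keep_prob, dropping the rest.

    During training this forces the model to cope with partial conditioning,
    so at inference time users may supply any subset of controls.
    """
    rng = rng or random.Random(0)  # fixed seed here for reproducibility
    return [tok for tok in metadata if rng.random() < keep_prob]

# Hypothetical condition tokens for one training example.
meta = ["genre=jazz", "tempo=120", "key=Cmaj", "instr=piano"]
print(drop_metadata_tokens(meta))
```

In a real training loop the dropout would be resampled per example with an unseeded generator; the fixed seed above only makes the sketch deterministic.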

en cs.SD, cs.MM
arXiv Open Access 2024
Frechet Music Distance: A Metric For Generative Symbolic Music Evaluation

Jan Retkowski, Jakub Stępniak, Mateusz Modrzejewski

In this paper we introduce the Frechet Music Distance (FMD), a novel evaluation metric for generative symbolic music models, inspired by the Frechet Inception Distance (FID) in computer vision and Frechet Audio Distance (FAD) in generative audio. FMD calculates the distance between distributions of reference and generated symbolic music embeddings, capturing abstract musical features. We validate FMD across several datasets and models. Results indicate that FMD effectively differentiates model quality, providing a domain-specific metric for evaluating symbolic music generation, and establishing a reproducible standard for future research in symbolic music modeling.
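The Fréchet-style distance underlying FMD (like FID and FAD) compares Gaussian fits to the reference and generated embedding distributions. A minimal sketch assuming diagonal covariances for simplicity, in which case the matrix square root in the general formula reduces to an elementwise square root (the embeddings below are synthetic stand-ins, not real music features):

```python
import numpy as np

def frechet_distance_diag(mu1, var1, mu2, var2):
    """Frechet distance between two Gaussians with diagonal covariances.

    General form: ||mu1 - mu2||^2 + Tr(S1 + S2 - 2 (S1 S2)^{1/2});
    for diagonal S1, S2 the trace term becomes an elementwise sum.
    """
    return float(np.sum((mu1 - mu2) ** 2)
                 + np.sum(var1 + var2 - 2.0 * np.sqrt(var1 * var2)))

# Synthetic "embeddings" standing in for reference vs. generated music.
rng = np.random.default_rng(0)
ref = rng.normal(0.0, 1.0, size=(1000, 8))
gen = rng.normal(0.5, 1.2, size=(1000, 8))

d = frechet_distance_diag(ref.mean(0), ref.var(0), gen.mean(0), gen.var(0))
print(round(d, 3))
```

Identical distributions give a distance of zero; the further the generated embedding statistics drift from the reference, the larger the score, which is what lets the metric rank model quality.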

en cs.SD, cs.AI
arXiv Open Access 2024
Music Auto-Tagging with Robust Music Representation Learned via Domain Adversarial Training

Haesun Joung, Kyogu Lee

Music auto-tagging is crucial for enhancing music discovery and recommendation. Existing models in Music Information Retrieval (MIR) struggle with real-world noise such as environmental and speech sounds in multimedia content. This study proposes a method inspired by speech-related tasks to enhance music auto-tagging performance in noisy settings. The approach integrates Domain Adversarial Training (DAT) into the music domain, enabling robust music representations that withstand noise. Unlike previous research, this approach involves an additional pretraining phase for the domain classifier, to avoid performance degradation in the subsequent phase. Adding various synthesized noisy music data improves the model's generalization across different noise levels. The proposed architecture demonstrates enhanced performance in music auto-tagging by effectively utilizing unlabeled noisy music data. Additional experiments with supplementary unlabeled data further improve the model's performance, underscoring its robust generalization capabilities and broad applicability.

en cs.SD, cs.AI
arXiv Open Access 2024
VMAS: Video-to-Music Generation via Semantic Alignment in Web Music Videos

Yan-Bo Lin, Yu Tian, Linjie Yang et al.

We present a framework for learning to generate background music from video inputs. Unlike existing works that rely on symbolic musical annotations, which are limited in quantity and diversity, our method leverages large-scale web videos accompanied by background music. This enables our model to learn to generate realistic and diverse music. To accomplish this goal, we develop a generative video-music Transformer with a novel semantic video-music alignment scheme. Our model uses a joint autoregressive and contrastive learning objective, which encourages the generation of music aligned with high-level video content. We also introduce a novel video-beat alignment scheme to match the generated music beats with the low-level motions in the video. Lastly, to capture fine-grained visual cues in a video needed for realistic background music generation, we introduce a new temporal video encoder architecture, allowing us to efficiently process videos consisting of many densely sampled frames. We train our framework on our newly curated DISCO-MV dataset, consisting of 2.2M video-music samples, which is orders of magnitude larger than any prior datasets used for video music generation. Our method outperforms existing approaches on the DISCO-MV and MusicCaps datasets according to various music generation evaluation metrics, including human evaluation. Results are available at https://genjib.github.io/project_page/VMAs/index.html

en cs.MM, cs.CV
arXiv Open Access 2024
Bridging Paintings and Music -- Exploring Emotion based Music Generation through Paintings

Tanisha Hisariya, Huan Zhang, Jinhua Liang

Rapid advancements in artificial intelligence have significantly enhanced generative tasks involving music and images, employing both unimodal and multimodal approaches. This research develops a model capable of generating music that resonates with the emotions depicted in visual arts, integrating emotion labeling, image captioning, and language models to transform visual inputs into musical compositions. Addressing the scarcity of aligned art and music data, we curated the Emotion Painting Music Dataset, pairing paintings with corresponding music for effective training and evaluation. Our dual-stage framework converts images to text descriptions of emotional content and then transforms these descriptions into music, facilitating efficient learning with minimal data. Performance is evaluated using metrics such as Fréchet Audio Distance (FAD), Total Harmonic Distortion (THD), Inception Score (IS), and KL divergence, with audio-emotion text similarity confirmed by the pre-trained CLAP model to demonstrate high alignment between generated music and text. This synthesis tool bridges visual art and music, enhancing accessibility for the visually impaired and opening avenues in educational and therapeutic applications by providing enriched multi-sensory experiences.

en cs.SD, cs.CV
DOAJ Open Access 2024
A Robust Indicator Mean-Based Method for Estimating Generalizability Theory Absolute Error and Related Dependability Indices within Structural Equation Modeling Frameworks

Hyeryung Lee, Walter P. Vispoel

In this study, we introduce a novel and robust approach for computing Generalizability Theory (GT) absolute error and related dependability indices using indicator intercepts that represent observed means within structural equation models (SEMs). We demonstrate the applicability of our method using one-, two-, and three-facet designs with self-report measures having varying numbers of scale points. Results for the indicator mean-based method align well with those obtained from the GENOVA and R gtheory packages for conventional GT analyses and improve upon previously suggested methods for deriving absolute error and corresponding dependability indices from SEMs when analyzing three-facet designs. We further extend our approach to derive Monte Carlo confidence intervals for all key indices and to incorporate estimation procedures that correct for scale coarseness effects commonly observed when analyzing binary or ordinal data.

DOAJ Open Access 2024
Video-based diagnosis support system for pianists with Musician’s dystonia

Takanori Oku et al.

Background: Musician's dystonia is a task-specific movement disorder that deteriorates fine motor control of skilled movements in musical performance. Although this disorder threatens professional careers, its diagnosis is challenging for clinicians who have no specialized knowledge of musical performance. Objectives: To support diagnostic evaluation, the present study proposes a novel approach using a machine learning-based algorithm to identify the symptomatic movements of Musician's dystonia. Methods: We propose an algorithm that identifies dystonic movements using an anomaly detection method with an autoencoder trained on the hand kinematics of healthy pianists. A unique feature of the algorithm is that it requires only a video image of the hand, which can be obtained with a commercially available camera. We also measured hand biomechanical functions to assess the contribution of peripheral factors and improve the identification of dystonic symptoms. Results: The proposed algorithm successfully identified Musician's dystonia with an accuracy and specificity of 90% based only on video footage of the hands. In addition, we identified degradation of biomechanical functions involved in controlling multiple fingers, which is not specific to musical performance. By contrast, there were no dystonia-specific malfunctions of hand biomechanics, including the strength and agility of individual digits. Conclusion: These findings demonstrate the effectiveness of the present technique in aiding the accurate diagnosis of Musician's dystonia.

Neurology. Diseases of the nervous system

Page 20 of 52,917