Results for "Music"

Showing 20 of ~1058323 results · from CrossRef, arXiv, DOAJ, Semantic Scholar

arXiv Open Access 2025
WeaveMuse: An Open Agentic System for Multimodal Music Understanding and Generation

Emmanouil Karystinaios

Agentic AI has been standardized in industry as a practical paradigm for coordinating specialized models and tools to solve complex multimodal tasks. In this work, we present WeaveMuse, a multi-agent system for music understanding, symbolic composition, and audio synthesis. Each specialist agent interprets user requests, derives machine-actionable requirements (modalities, formats, constraints), and validates its own outputs, while a manager agent selects and sequences tools, mediates user interaction, and maintains state across turns. The system is extendable and deployable either locally, using quantization and inference strategies to fit diverse hardware budgets, or via the HFApi to preserve free community access to open models. Beyond out-of-the-box use, the system emphasizes controllability and adaptation through constraint schemas, structured decoding, policy-based inference, and parameter-efficient adapters or distilled variants that tailor models to MIR tasks. A central design goal is to facilitate intermodal interaction across text, symbolic notation and visualization, and audio, enabling analysis-synthesis-render loops and addressing cross-format constraints. The framework aims to democratize, implement, and make accessible MIR tools by supporting interchangeable open-source models of various sizes, flexible memory management, and reproducible deployment paths.

en cs.SD, eess.AS
arXiv Open Access 2025
CCMusic: An Open and Diverse Database for Chinese Music Information Retrieval Research

Monan Zhou, Shenyang Xu, Zhaorui Liu et al.

Data are crucial in various computer-related fields, including music information retrieval (MIR), an interdisciplinary area bridging computer science and music. This paper introduces CCMusic, an open and diverse database comprising multiple datasets specifically designed for tasks related to Chinese music, highlighting our focus on this culturally rich domain. The database integrates both published and unpublished datasets, with steps taken such as data cleaning, label refinement, and data structure unification to ensure data consistency and create ready-to-use versions. We conduct benchmark evaluations for all datasets using a unified evaluation framework developed specifically for this purpose. This publicly available framework supports both classification and detection tasks, ensuring standardized and reproducible results across all datasets. The database is hosted on HuggingFace and ModelScope, two open and multifunctional data and model hosting platforms, ensuring ease of accessibility and usability.

en cs.IR, cs.SD
arXiv Open Access 2025
A new XML conversion process for mensural music encoding : CMME_to_MEI (via Verovio)

David Fiala, Laurent Pugin, Marnix van Berchum et al.

The Ricercar Lab - the musicological research team at the Center for advanced Studies in the Renaissance at the University of Tours - has decided to make available in open access, thanks to the support of the French digital infrastructure Biblissima, a large corpus of about 3500 XML files of 15th-c. music. This corpus was produced by the German musicologist Clemens Goldberg, who from 2010 onwards encoded the musical content of 34 major 15th-c. music manuscripts and other complementary files, in order to offer on his foundation's website PDF files of complete collections of works by Du Fay, Binchois, Okeghem, Busnoys and most of their major contemporaries, focusing on their secular output. This corpus was encoded in an XML format named CMME (Computerized Mensural Music Editing), specifically conceived for mensural music by Theodor Dumitrescu in the 2000s, together with editorial and publication tools which have not been updated since then. This article focuses on the development of a set of conversion tools for these CMME files to meet more up-to-date standards of music encoding, namely MEI. A workshop was organised in September 2024 at the Campus Condorcet in Paris, gathering experts with a wide range of knowledge on mensural music notation, XML formats and programming. A converter was developed directly in the open-source rendering library Verovio, allowing the conversion from CMME to MEI mensural. A conversion to MEI CMN was implemented afterwards, making it possible to load these files in common engraving software such as MuseScore with minimal loss of information. With the availability of a direct import of CMME-XML into Verovio, the corpus of existing CMME files gets a new life. Furthermore, since the stand-alone CMME editor still works fine and no alternative is available yet for native MEI, the converter offers a new pipeline for encoding and editing mensural music.

en cs.SD, cs.DB
arXiv Open Access 2025
Estimating Musical Surprisal from Audio in Autoregressive Diffusion Model Noise Spaces

Mathias Rose Bjare, Stefan Lattner, Gerhard Widmer

Recently, the information content (IC) of predictions from a Generative Infinite-Vocabulary Transformer (GIVT) has been used to model musical expectancy and surprisal in audio. We investigate the effectiveness of such modelling using IC calculated with autoregressive diffusion models (ADMs). We empirically show that IC estimates of models based on two different diffusion ordinary differential equations (ODEs) describe diverse data better, in terms of negative log-likelihood, than a GIVT. We evaluate diffusion model IC's effectiveness in capturing surprisal aspects by examining two tasks: (1) capturing monophonic pitch surprisal, and (2) detecting segment boundaries in multi-track audio. In both tasks, the diffusion models match or exceed the performance of a GIVT. We hypothesize that the surprisal estimated at different diffusion process noise levels corresponds to the surprisal of music and audio features present at different audio granularities. Testing our hypothesis, we find that, for appropriate noise levels, the studied musical surprisal tasks' results improve. Code is provided on github.com/SonyCSLParis/audioic.

en cs.SD, cs.AI
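
The information content (IC) named in the abstract above is, in its textbook form, the negative log-probability a predictive model assigns to what actually happened. The following is a minimal sketch of that definition for discrete events only. It is not the paper's method, which estimates IC for continuous audio with autoregressive diffusion models, and the function name and the probabilities are hypothetical.

```python
import math


def information_content(probabilities):
    """Return the surprisal of each observed event in bits: IC = -log2(p).

    `probabilities` holds the probability a model assigned to each event
    that actually occurred. Values must lie in (0, 1].
    """
    for p in probabilities:
        if not 0.0 < p <= 1.0:
            raise ValueError(f"probability out of range: {p}")
    return [-math.log2(p) for p in probabilities]


# Hypothetical probabilities assigned to three observed notes (assumption).
probs = [0.5, 0.25, 0.125]
ic = information_content(probs)
```

Less probable events receive higher values, which is what makes IC usable as a proxy for surprisal.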
DOAJ Open Access 2025
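Rolul muzicii în construcția identitǎților etnice-culturale. Tinariwen și blues-ul tuaregilor din deșertul saharian [The Role of Music in the Construction of Ethnic-Cultural Identities: Tinariwen and the Tuareg Blues of the Saharan Desert]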

Livia Georgeta SUCIU

The strategies of world music have facilitated intercultural exchanges within the global music market and have broadened access to the diversity of musical experiences worldwide, by valorizing the cultural and ethnic specificity of the music. New world music fusions and new cultural identities have been constructed through the specific strategies and narratives of world music. We aim to deconstruct these highly mediated narratives, which have drawn attention to an exotic and little-known region of the world. We focus on the exemplary case of the group Tinariwen, which has contributed, through music, to shaping a new identity for the Tuareg people. We follow several lines of investigation: How is the ethnic-cultural identity of the Tuaregs constructed through the music promoted by Tinariwen, in the context of post-colonialism and globalization? How have Tinariwen become the emblematic figure of resistance through music, of the Tuaregs' struggle for autonomy and freedom in the context of political and cultural repression, the pressure of exile, and prohibitions on making music? We try to understand how Tinariwen have managed, through music, to draw the attention of the entire world to the destiny of a people.

Philosophy (General), Language and Literature
arXiv Open Access 2024
Harnessing High-Level Song Descriptors towards Natural Language-Based Music Recommendation

Elena V. Epure, Gabriel Meseguer-Brocal, Darius Afchar et al.

Recommender systems relying on Language Models (LMs) have gained popularity in assisting users to navigate large catalogs. LMs often exploit item high-level descriptors, i.e. categories or consumption contexts, from training data or user preferences. This has been proven effective in domains like movies or products. However, in the music domain, understanding of how effectively LMs utilize song descriptors for natural language-based music recommendation is relatively limited. In this paper, we assess LMs' effectiveness in recommending songs based on user natural language descriptions and items with descriptors like genres, moods, and listening contexts. We formulate the recommendation task as a dense retrieval problem and assess LMs as they become increasingly familiar with data pertinent to the task and domain. Our findings reveal improved performance as LMs are fine-tuned for general language similarity, information retrieval, and mapping longer descriptions to shorter, high-level descriptors in music.

en cs.IR
arXiv Open Access 2024
Stem-JEPA: A Joint-Embedding Predictive Architecture for Musical Stem Compatibility Estimation

Alain Riou, Stefan Lattner, Gaëtan Hadjeres et al.

This paper explores the automated process of determining stem compatibility by identifying audio recordings of single instruments that blend well with a given musical context. To tackle this challenge, we present Stem-JEPA, a novel Joint-Embedding Predictive Architecture (JEPA) trained on a multi-track dataset using a self-supervised learning approach. Our model comprises two networks: an encoder and a predictor, which are jointly trained to predict the embeddings of compatible stems from the embeddings of a given context, typically a mix of several instruments. Training a model in this manner allows its use in estimating stem compatibility - retrieving, aligning, or generating a stem to match a given mix - or for downstream tasks such as genre or key estimation, as the training paradigm requires the model to learn information related to timbre, harmony, and rhythm. We evaluate our model's performance on a retrieval task on the MUSDB18 dataset, testing its ability to find the missing stem from a mix and through a subjective user study. We also show that the learned embeddings capture temporal alignment information and, finally, evaluate the representations learned by our model on several downstream tasks, highlighting that they effectively capture meaningful musical features.

en cs.SD, cs.LG
arXiv Open Access 2024
Developing a Framework for Sonifying Variational Quantum Algorithms: Implications for Music Composition

Paulo Vitor Itaboraí, Peter Thomas, Arianna Crippa et al.

This chapter examines the Variational Quantum Harmonizer, a software tool and musical interface that focuses on the problem of sonification of the minimization steps of Variational Quantum Algorithms (VQA), used for simulating properties of quantum systems and optimization problems assisted by quantum hardware. Particularly, it details the sonification of Quadratic Unconstrained Binary Optimization (QUBO) problems using VQA. A flexible design enables its future applications both as a sonification tool for auditory displays in scientific investigation, and as a hybrid quantum-digital musical instrument for artistic endeavours. In turn, sonification can help researchers understand complex systems better and can serve for the training of quantum physics and quantum computing. The VQH structure, including its software implementation, control mechanisms, and sonification mappings, is detailed. Moreover, it guides the design of QUBO cost functions in VQH as a music compositional object. The discussion is extended to the implications of applying quantum-assisted simulation in quantum-computer aided composition and live-coding performances. An artistic output is showcased by the piece Hexagonal Chambers (Thomas and Itaboraí, 2023).

en cs.SD, cs.ET
arXiv Open Access 2023
Efficient bandwidth extension of musical signals using a differentiable harmonic plus noise model

Pierre-Amaury Grumiaux, Mathieu Lagrange

The task of bandwidth extension addresses the generation of missing high frequencies of audio signals based on knowledge of the low-frequency part of the sound. This task applies to various problems, such as audio coding or audio restoration. In this article, we focus on efficient bandwidth extension of monophonic and polyphonic musical signals using a differentiable digital signal processing (DDSP) model. Such a model is composed of a neural network part with relatively few parameters trained to infer the parameters of a differentiable digital signal processing model, which efficiently generates the output full-band audio signal. We first address bandwidth extension of monophonic signals, and then propose two methods to explicitly handle polyphonic signals. The benefits of the proposed models are first demonstrated on monophonic and polyphonic synthetic data against a baseline and a deep-learning-based resnet model. The models are next evaluated on recorded monophonic and polyphonic data, for a wide variety of instruments and musical genres. We show that all proposed models surpass a higher complexity deep learning model for an objective metric computed in the frequency domain. A MUSHRA listening test confirms the superiority of the proposed approach in terms of perceptual quality.

en cs.SD, eess.AS
arXiv Open Access 2023
Employing Crowdsourcing for Enriching a Music Knowledge Base in Higher Education

Vassilis Lyberatos, Spyridon Kantarelis, Eirini Kaldeli et al.

This paper describes the methodology followed and the lessons learned from employing crowdsourcing techniques as part of a homework assignment involving higher education students of computer science. Making use of a platform that supports crowdsourcing in the cultural heritage domain, students were solicited to enrich the metadata associated with a selection of music tracks. The results of the campaign were further analyzed and exploited by students through the use of semantic web technologies. In total, 98 students participated in the campaign, contributing more than 6400 annotations concerning 854 tracks. The process also led to the creation of an openly available annotated dataset, which can be useful for machine learning models for music tagging. The campaign's results and the comments gathered through an online survey enable us to draw some useful insights about the benefits and challenges of integrating crowdsourcing into computer science curricula and how this can enhance students' engagement in the learning process.

en cs.HC, cs.AI
arXiv Open Access 2022
Challenges in creative generative models for music: a divergence maximization perspective

Axel Chemla--Romeu-Santos, Philippe Esling

The development of generative Machine Learning (ML) models in creative practices, enabled by the recent improvements in usability and availability of pre-trained models, is raising more and more interest among artists, practitioners and performers. Yet, the introduction of such techniques in artistic domains also revealed multiple limitations that escape current evaluation methods used by scientists. Notably, most models are still unable to generate content that lies outside of the domain defined by the training dataset. In this paper, we propose an alternative prospective framework, starting from a new general formulation of ML objectives, that we derive to delineate possible implications and solutions that already exist in the ML literature (notably for the audio and musical domain). We also discuss existing relations between generative models and computational creativity and how our framework could help address the lack of creativity in existing models.

en stat.ML, cs.LG
arXiv Open Access 2022
Defending a Music Recommender Against Hubness-Based Adversarial Attacks

Katharina Hoedt, Arthur Flexer, Gerhard Widmer

Adversarial attacks can drastically degrade performance of recommenders and other machine learning systems, resulting in an increased demand for defence mechanisms. We present a new line of defence against attacks which exploit a vulnerability of recommenders that operate in high dimensional data spaces (the so-called hubness problem). We use a global data scaling method, namely Mutual Proximity (MP), to defend a real-world music recommender which previously was susceptible to attacks that inflated the number of times a particular song was recommended. We find that using MP as a defence greatly increases robustness of the recommender against a range of attacks, with success rates of attacks around 44% (before defence) dropping to less than 6% (after defence). Additionally, adversarial examples still able to fool the defended system do so at the price of noticeably lower audio quality as shown by a decreased average SNR.

en eess.AS, cs.AI
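
The abstract above names Mutual Proximity (MP) as a global scaling method against hubness. The sketch below shows the empirical variant of MP as it is commonly described: the rescaled distance between x and y is small when many other items are farther from both x and y than they are from each other. It is an illustration under assumptions, not the defence implemented in the paper. The function name, the normalisation by n - 2, and the toy data are all choices made here.

```python
def mutual_proximity_distance(dist):
    """Rescale a symmetric distance matrix with empirical Mutual Proximity.

    For each pair (x, y), count the other items j that are farther than
    dist[x][y] from both x and y. The rescaled distance is one minus that
    count divided by (n - 2), the number of other items (a normalisation
    assumed for this sketch). Requires at least three items.
    """
    n = len(dist)
    if n < 3:
        raise ValueError("need at least three items")
    scaled = [[0.0] * n for _ in range(n)]
    for x in range(n):
        for y in range(n):
            if x == y:
                continue
            d_xy = dist[x][y]
            farther_from_both = sum(
                1
                for j in range(n)
                if j != x and j != y and dist[x][j] > d_xy and dist[y][j] > d_xy
            )
            scaled[x][y] = 1.0 - farther_from_both / (n - 2)
    return scaled


# Toy data (assumption): four items placed on a line, one of them far away.
points = [0.0, 1.0, 2.0, 10.0]
dist = [[abs(a - b) for b in points] for a in points]
mp = mutual_proximity_distance(dist)
```

Because the count depends on both items' views of the rest of the collection, an item cannot become a near neighbour of everything simply by sitting close to the centre of the data, which is the intuition behind using MP to reduce hubs.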
DOAJ Open Access 2022
Research On The Development B&B Industry In Middle-Western Zhejiang Region Based On Internet+

Dongmei Guo

As China’s socio-economic development and people’s living standards have risen, consumer demand for tourism has continued to grow, and concepts such as rural tourism and farmhouse music have repeatedly been proposed. Although there is no authoritative, unified definition of bed and breakfast, the concept broadly covers the farmhouse music and rural tourism development models. In the era of the sharing economy, the use of Internet technology to promote the development of Northeast bed and breakfast has broad prospects and significance. This paper investigates and analyses the development status of “Internet +” B&Bs in middle-western Zhejiang, discusses the problems of B&B development in middle-western Zhejiang under the current Internet economy, and puts forward “Internet +” development strategies for middle-western Zhejiang B&Bs from the perspectives of the B&B operators themselves and the related departments, which has some reference value for hotel operators and the government. Questionnaire results were collected in a timely manner, and the symbiotic relationship was analyzed. It was concluded that the symbiotic organization model of the symbiotic system in the central and western parts of Zhejiang and the main tourist activities of the scenic spot is an intermittent symbiosis model, and that the symbiotic behavior model is asymmetrical mutualistic symbiosis. The main problems of B&Bs around the scenic spot are found to be: disordered development, insufficient synergy, a single marketing method, inconspicuous characteristics, uneven employee quality, and high overall pricing. Along with an analysis of the causes of these problems, suggestions are given on mutual benefit and symbiosis between the B&Bs and the scenic spots around the Scenic Area in central and western Zhejiang.
In the development of homestays, it is necessary to highlight the subdivision of cultural models and, relying on the natural and cultural environment of central and western Zhejiang and the advantages of big data and "Internet +", to carry out three-dimensional marketing, highlight characteristics, optimize service quality, and price reasonably to attract more tourists. The scenic spot management department should pay more attention to traffic management and overall planning guidance, take on a demonstrating and driving role, rely fully on the advantages of "Internet +" in central and western Zhejiang, give full play to the benefits of industry associations, formulate standard and innovative development models, and better enhance the area surrounding the scenic spots.

Engineering (General). Civil engineering (General), Chemical engineering
DOAJ Open Access 2022
GHOSTly flute music: drumlins, moats and the bed of Thwaites Glacier

Richard B. Alley, Nick Holschuh, Byron Parizek et al.

Glacier-bed characteristics that are poorly known and modeled are important in projected sea-level rise from ice-sheet changes under strong warming, especially in the Thwaites Glacier drainage of West Antarctica. Ocean warming may induce ice-shelf thinning or loss, or thinning of ice in estuarine zones, reducing backstress on grounded ice. Models indicate that, in response, more-nearly-plastic beds favor faster ice loss by causing larger flow acceleration, but more-nearly-viscous beds favor localized near-coastal thinning that could speed grounding-zone retreat into interior basins where marine-ice-sheet instability or cliff instability could develop and cause very rapid ice loss. Interpretation of available data indicates that the bed is spatially mosaicked, with both viscous and plastic regions. Flow against bedrock topography removes plastic lubricating tills, exposing bedrock that is eroded on up-glacier sides of obstacles to form moats with exposed bedrock tails extending downglacier adjacent to lee-side soft-till bedforms. Flow against topography also generates high-ice-pressure zones that prevent inflow of lubricating water over distances that scale with the obstacle size. Extending existing observations to sufficiently large regions, and developing models assimilating such data at the appropriate scale, present large, important research challenges that must be met to reliably project future forced sea-level rise.

Meteorology. Climatology

Page 16 of 52917