The Effect of Music on the Human Stress Response
M. Thoma, R. La Marca, R. Brönnimann
et al.
Background: Music listening has been suggested to beneficially impact health via stress-reducing effects. However, the existing literature comprises a limited number of investigations, with discrepancies in reported findings that may result from methodological shortcomings (e.g. small sample size, no valid stressor). The aim of the current study was to address this gap in knowledge and overcome previous shortcomings by thoroughly examining music effects across endocrine, autonomic, cognitive, and emotional domains of the human stress response. Methods: Sixty healthy female volunteers (mean age = 25 years) were exposed to a standardized psychosocial stress test after having been randomly assigned to one of three conditions prior to the stress test: 1) relaxing music (‘Miserere’, Allegri) (RM), 2) sound of rippling water (SW), and 3) rest without acoustic stimulation (R). Salivary cortisol and salivary alpha-amylase (sAA), heart rate (HR), respiratory sinus arrhythmia (RSA), subjective stress perception and anxiety were repeatedly assessed in all subjects. We hypothesized that listening to RM prior to the stress test, compared to SW or R, would result in a decreased stress response across all measured parameters. Results: The three conditions differed significantly in cortisol response to the stressor (p = 0.025), with the highest concentrations in the RM condition and the lowest in the SW condition. After the stressor, sAA baseline values were reached considerably faster in the RM group than in the R group (p = 0.026). HR and psychological measures did not differ significantly between groups. Conclusion: Our findings indicate that music listening impacted the psychobiological stress system. Listening to music prior to a standardized stressor predominantly affected the autonomic nervous system (in terms of a faster recovery), and to a lesser degree the endocrine and psychological stress response. These findings may help to better understand the beneficial effects of music on the human body.
WeaveMuse: An Open Agentic System for Multimodal Music Understanding and Generation
Emmanouil Karystinaios
Agentic AI has been standardized in industry as a practical paradigm for coordinating specialized models and tools to solve complex multimodal tasks. In this work, we present WeaveMuse, a multi-agent system for music understanding, symbolic composition, and audio synthesis. Each specialist agent interprets user requests, derives machine-actionable requirements (modalities, formats, constraints), and validates its own outputs, while a manager agent selects and sequences tools, mediates user interaction, and maintains state across turns. The system is extendable and deployable either locally, using quantization and inference strategies to fit diverse hardware budgets, or via the HFApi to preserve free community access to open models. Beyond out-of-the-box use, the system emphasizes controllability and adaptation through constraint schemas, structured decoding, policy-based inference, and parameter-efficient adapters or distilled variants that tailor models to MIR tasks. A central design goal is to facilitate intermodal interaction across text, symbolic notation and visualization, and audio, enabling analysis-synthesis-render loops and addressing cross-format constraints. The framework aims to democratize, implement, and make accessible MIR tools by supporting interchangeable open-source models of various sizes, flexible memory management, and reproducible deployment paths.
CCMusic: An Open and Diverse Database for Chinese Music Information Retrieval Research
Monan Zhou, Shenyang Xu, Zhaorui Liu
et al.
Data are crucial in various computer-related fields, including music information retrieval (MIR), an interdisciplinary area bridging computer science and music. This paper introduces CCMusic, an open and diverse database comprising multiple datasets specifically designed for tasks related to Chinese music, highlighting our focus on this culturally rich domain. The database integrates both published and unpublished datasets, with steps taken such as data cleaning, label refinement, and data structure unification to ensure data consistency and create ready-to-use versions. We conduct benchmark evaluations for all datasets using a unified evaluation framework developed specifically for this purpose. This publicly available framework supports both classification and detection tasks, ensuring standardized and reproducible results across all datasets. The database is hosted on HuggingFace and ModelScope, two open and multifunctional data and model hosting platforms, ensuring ease of accessibility and usability.
A new XML conversion process for mensural music encoding : CMME_to_MEI (via Verovio)
David Fiala, Laurent Pugin, Marnix van Berchum
et al.
The Ricercar Lab - the musicological research team at the Center for advanced Studies in the Renaissance at the University of Tours - has decided to make available in open access, thanks to the support of the French digital infrastructure Biblissima, a large corpus of about 3500 XML files of 15th-c. music. This corpus was produced by the German musicologist Clemens Goldberg, who since 2010 has encoded the musical content of 34 major 15th-c. music manuscripts and other complementary files, in order to offer on his foundation's website PDF files of complete collections of works by Du Fay, Binchois, Okeghem, Busnoys and most of their major contemporaries, focusing on their secular output. This corpus was encoded in an XML format named CMME (Computerized Mensural Music Editing), specifically conceived for mensural music by Theodor Dumitrescu in the 2000s, together with editorial and publication tools which have not been updated since then. This article focuses on the development of a set of conversion tools for these CMME files to meet more up-to-date standards of music encoding, namely MEI. A workshop was organised in September 2024 at the Campus Condorcet in Paris, gathering experts with a wide range of knowledge on mensural music notation, XML formats and programming. A converter was developed directly in the open-source rendering library Verovio, allowing the conversion from CMME to MEI mensural. A conversion to MEI CMN was implemented afterwards, making it possible to load these files into common engraving software such as MuseScore with minimal loss of information. With the availability of a direct import of CMME-XML into Verovio, the corpus of existing CMME files gets a new life. Furthermore, since the stand-alone CMME editor still works fine and no alternative is available yet for native MEI, the converter offers a new pipeline for encoding and editing mensural music.
Estimating Musical Surprisal from Audio in Autoregressive Diffusion Model Noise Spaces
Mathias Rose Bjare, Stefan Lattner, Gerhard Widmer
Recently, the information content (IC) of predictions from a Generative Infinite-Vocabulary Transformer (GIVT) has been used to model musical expectancy and surprisal in audio. We investigate the effectiveness of such modelling using IC calculated with autoregressive diffusion models (ADMs). We empirically show that IC estimates of models based on two different diffusion ordinary differential equations (ODEs) describe diverse data better, in terms of negative log-likelihood, than a GIVT. We evaluate diffusion model IC's effectiveness in capturing surprisal aspects by examining two tasks: (1) capturing monophonic pitch surprisal, and (2) detecting segment boundaries in multi-track audio. In both tasks, the diffusion models match or exceed the performance of a GIVT. We hypothesize that the surprisal estimated at different diffusion process noise levels corresponds to the surprisal of music and audio features present at different audio granularities. Testing our hypothesis, we find that, for appropriate noise levels, the studied musical surprisal tasks' results improve. Code is provided on github.com/SonyCSLParis/audioic.
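As a rough illustration of the quantity discussed in this abstract, information content is the negative log-probability a model assigns to the event that actually occurred. The sketch below uses a toy discrete predictive distribution, not the GIVT or diffusion models of the paper; the function name `information_content` and the example probabilities are assumptions made purely for illustration.

```python
import math

def information_content(probabilities):
    """Return the surprisal -log2(p), in bits, for each predicted probability."""
    for p in probabilities:
        if not 0.0 < p <= 1.0:
            raise ValueError("probabilities must lie in (0, 1]")
    return [-math.log2(p) for p in probabilities]

# Hypothetical probabilities a model assigned to the events that were observed.
predicted = [0.5, 0.25, 0.125]
ic = information_content(predicted)

# The mean IC over a sequence is its average negative log-likelihood (in bits).
mean_nll = sum(ic) / len(ic)
```

Less probable events receive a higher IC, which is why the mean IC over a sequence can be read as an average negative log-likelihood.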
“Hadī Māši Nāyḍa, Hadī Nahḍa”: música, identidad y heterodoxia en los umbrales de un paradigma
Ahmed Balghzal
This work seeks to determine the role that Moroccan youth music can play in constructing a dissident ideal of identity. Starting from an analysis of the musical discourse of several countercultural groups, we seek to characterize the role that their subversive formulations of notions such as belonging and identity play in creating fissures in the canonical construction of the whole cultural paradigm. We aim to conclude that this marginal, erosive work opens up prospects for change across the entire Moroccan cultural model, which is marked by chronic pathologies and maladjustments that impede its regeneration and metamorphosis.
Music and books on Music, Musical instruction and study
Janáčkova Její pastorkyňa a Suchoňova Krútňava
Markéta Štefková
Literature on music, Music
Depict or Discern? Fingerprinting Musical Taste from Explicit Preferences
Kristina Matrosova, Manuel Moussallam, Thomas Louail
et al.
The notion of personal taste in general, and musical taste in particular, is pervasive in the literature on recommender systems, but also in cultural sociology and psychology. However, definitions and measurement methods strongly differ from one study to another. In this paper, we question two different views on taste that can be retrieved from the literature: either something that is distinctive of an individual, or something that essentially captures the extent and diversity of their preferences. Relying upon a dataset that contains the complete list of musical items liked by individual users of a streaming service, as well as streaming logs, we propose two methods to compute fingerprints of their musical taste. The first one explicitly targets a uniqueness property, aiming at selecting items that uniquely identify a user in the crowd. The second approach focuses on a representativeness task that is fundamental in recommendation, i.e. building a summary depiction of the user’s preferences that can be leveraged to propose other items of interest. We demonstrate that the two methods lead to conflicting solutions, hence highlighting the need to precisely acknowledge which point of view applies when addressing a computational question related to taste. We also raise the question of users’ identifiability through their online activity on music streaming platforms, and beyond.
Information technology, Music
Music in János Térey’s Verse Novel A Legkisebb Jégkorszak [The Shortest Ice Age]
Lenke Kocsis
Mentioning composers, singers, bands, and well-known musical events and citing lyrics have all been well-established and commonplace practices in literature throughout the ages. The interpretation of these gestures usually resembles the process of understanding other kinds of allusions, references, and intertextuality because that is what the above listed instances ultimately are. When it comes to art forms that combine multiple forms of media, such as films, the use of music as background score, soundtrack, etc. is usually perceived as a fundamental part of the work, and its absence is always noted. Such frequent or constant appearance of music in literary works is not commonplace, and it typically means that the plot and theme of the text itself are music, or at least they are heavily music-oriented. The 2015 verse novel of János Térey is quite peculiar in this regard. On average, a song, a piece, a composer, or a band is mentioned on every seventh page of the book. The music referenced includes numerous genres from metal to pop, classical, jazz, and even hymns. The situations vary from background music on the radio to a character listening to his/her favourite song or someone reminiscing about an event. The variety of music and the situations it appears in throughout the novel indicate a conscious effort on the author’s part. The paper aims to examine how the various lyrical citations and musical references found throughout János Térey’s A Legkisebb Jégkorszak [The Shortest Ice Age] function as literary devices.
Harnessing High-Level Song Descriptors towards Natural Language-Based Music Recommendation
Elena V. Epure, Gabriel Meseguer-Brocal, Darius Afchar
et al.
Recommender systems relying on Language Models (LMs) have gained popularity in assisting users to navigate large catalogs. LMs often exploit high-level item descriptors, i.e. categories or consumption contexts, from training data or user preferences. This has proven effective in domains like movies or products. However, in the music domain, it is still poorly understood how effectively LMs utilize song descriptors for natural language-based music recommendation. In this paper, we assess LMs' effectiveness in recommending songs based on users' natural language descriptions and on items with descriptors like genres, moods, and listening contexts. We formulate the recommendation task as a dense retrieval problem and assess LMs as they become increasingly familiar with data pertinent to the task and domain. Our findings reveal improved performance as LMs are fine-tuned for general language similarity, information retrieval, and mapping longer descriptions to shorter, high-level descriptors in music.
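The dense retrieval formulation mentioned in this abstract can be sketched as ranking items by the similarity between a query embedding and item embeddings. The example below is only a minimal sketch under stated assumptions: the three-dimensional vectors are invented stand-ins for LM embeddings, and `cosine` and `retrieve` are hypothetical helper names rather than anything taken from the paper.

```python
import math

def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0

def retrieve(query_vec, item_vecs, k=2):
    """Return the names of the k items most similar to the query."""
    ranked = sorted(
        item_vecs.items(),
        key=lambda pair: cosine(query_vec, pair[1]),
        reverse=True,
    )
    return [name for name, _ in ranked[:k]]

# Toy embeddings (assumed); in practice these would come from a language model.
items = {
    "calm piano": [0.9, 0.1, 0.0],
    "workout techno": [0.0, 0.2, 0.9],
    "acoustic folk": [0.7, 0.6, 0.1],
}
query = [1.0, 0.2, 0.0]  # e.g. an embedded request for quiet evening music
top = retrieve(query, items, k=2)
```

In a real system the sort over all items would typically be replaced by an approximate nearest-neighbour index; the exhaustive ranking here is kept for clarity.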
Stem-JEPA: A Joint-Embedding Predictive Architecture for Musical Stem Compatibility Estimation
Alain Riou, Stefan Lattner, Gaëtan Hadjeres
et al.
This paper explores the automated process of determining stem compatibility by identifying audio recordings of single instruments that blend well with a given musical context. To tackle this challenge, we present Stem-JEPA, a novel Joint-Embedding Predictive Architecture (JEPA) trained on a multi-track dataset using a self-supervised learning approach. Our model comprises two networks: an encoder and a predictor, which are jointly trained to predict the embeddings of compatible stems from the embeddings of a given context, typically a mix of several instruments. Training a model in this manner allows its use in estimating stem compatibility - retrieving, aligning, or generating a stem to match a given mix - or for downstream tasks such as genre or key estimation, as the training paradigm requires the model to learn information related to timbre, harmony, and rhythm. We evaluate our model's performance on a retrieval task on the MUSDB18 dataset, testing its ability to find the missing stem from a mix and through a subjective user study. We also show that the learned embeddings capture temporal alignment information and, finally, evaluate the representations learned by our model on several downstream tasks, highlighting that they effectively capture meaningful musical features.
Developing a Framework for Sonifying Variational Quantum Algorithms: Implications for Music Composition
Paulo Vitor Itaboraí, Peter Thomas, Arianna Crippa
et al.
This chapter examines the Variational Quantum Harmonizer (VQH), a software tool and musical interface that focuses on the problem of sonifying the minimization steps of Variational Quantum Algorithms (VQA), which are used for simulating properties of quantum systems and for solving optimization problems with the assistance of quantum hardware. In particular, it details the sonification of Quadratic Unconstrained Binary Optimization (QUBO) problems using VQA. A flexible design enables future applications both as a sonification tool for auditory displays in scientific investigation and as a hybrid quantum-digital musical instrument for artistic endeavours. In turn, sonification can help researchers understand complex systems better and can support training in quantum physics and quantum computing. The VQH structure, including its software implementation, control mechanisms, and sonification mappings, is detailed. Moreover, the chapter guides the design of QUBO cost functions in VQH as a music compositional object. The discussion is extended to the implications of applying quantum-assisted simulation in quantum-computer aided composition and live-coding performances. An artistic output is showcased by the piece Hexagonal Chambers (Thomas and Itaboraí, 2023).
When the music changes, so does the dance : do we still need copyright collectives?
František Svoboda
Literature on music, Music
Zwischen Himmel und Hölle : Kardinaltugenden und Todsünden in Wiener Oratorientexten des 18. Jahrhunderts
Elisabeth Theresia Hilscher
Literature on music, Music
Efficient bandwidth extension of musical signals using a differentiable harmonic plus noise model
Pierre-Amaury Grumiaux, Mathieu Lagrange
The task of bandwidth extension addresses the generation of missing high frequencies of audio signals based on knowledge of the low-frequency part of the sound. This task applies to various problems, such as audio coding or audio restoration. In this article, we focus on efficient bandwidth extension of monophonic and polyphonic musical signals using a differentiable digital signal processing (DDSP) model. Such a model is composed of a neural network part with relatively few parameters trained to infer the parameters of a differentiable digital signal processing model, which efficiently generates the output full-band audio signal. We first address bandwidth extension of monophonic signals, and then propose two methods to explicitly handle polyphonic signals. The benefits of the proposed models are first demonstrated on monophonic and polyphonic synthetic data against a baseline and a deep-learning-based ResNet model. The models are next evaluated on recorded monophonic and polyphonic data, for a wide variety of instruments and musical genres. We show that all proposed models surpass a higher complexity deep learning model for an objective metric computed in the frequency domain. A MUSHRA listening test confirms the superiority of the proposed approach in terms of perceptual quality.
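To make the harmonic plus noise idea concrete, the following is a minimal synthesis sketch in which a signal is the sum of sinusoids at integer multiples of a fundamental frequency plus a noise component. It is not the paper's model: there, a neural network infers the synthesis parameters and the whole chain is differentiable, whereas here the amplitudes are fixed by hand and the noise is plain unfiltered white noise. The function name and all parameter values are assumptions.

```python
import math
import random

def harmonic_plus_noise(f0, harmonic_amps, noise_gain,
                        sample_rate=16000, n_samples=160, seed=0):
    """Synthesize harmonics of f0 plus white noise (illustrative only)."""
    rng = random.Random(seed)
    nyquist = sample_rate / 2
    signal = []
    for n in range(n_samples):
        t = n / sample_rate
        # Sum the harmonics, skipping any that would alias above Nyquist.
        harmonic = sum(
            amp * math.sin(2 * math.pi * (k + 1) * f0 * t)
            for k, amp in enumerate(harmonic_amps)
            if (k + 1) * f0 < nyquist
        )
        noise = noise_gain * rng.uniform(-1.0, 1.0)
        signal.append(harmonic + noise)
    return signal

# Hypothetical parameters: a 440 Hz tone with three decaying harmonics.
frame = harmonic_plus_noise(f0=440.0, harmonic_amps=[0.5, 0.25, 0.125],
                            noise_gain=0.01)
```

In a bandwidth extension setting, the upper harmonics and the noise level are the quantities that would have to be estimated from the low-frequency input.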
Employing Crowdsourcing for Enriching a Music Knowledge Base in Higher Education
Vassilis Lyberatos, Spyridon Kantarelis, Eirini Kaldeli
et al.
This paper describes the methodology followed and the lessons learned from employing crowdsourcing techniques as part of a homework assignment involving higher education students of computer science. Using a platform that supports crowdsourcing in the cultural heritage domain, students were asked to enrich the metadata associated with a selection of music tracks. The results of the campaign were further analyzed and exploited by students through the use of semantic web technologies. In total, 98 students participated in the campaign, contributing more than 6400 annotations concerning 854 tracks. The process also led to the creation of an openly available annotated dataset, which can be useful for machine learning models for music tagging. The campaign's results and the comments gathered through an online survey enable us to draw some useful insights about the benefits and challenges of integrating crowdsourcing into computer science curricula and how this can enhance students' engagement in the learning process.
Effects of music training in executive function performance in children: A systematic review
Diego Alejandro Rodriguez-Gomez, Claudia Talero-Gutiérrez
Music training has traditionally been a fundamental component of children's education across several cultures. Moreover, music training has been hypothesized to enhance the development of executive functions and improve executive performance in children. In this systematic review, we analyze the available evidence of the effects of music training on executive function performance, evaluated using validated neuropsychological batteries and classic tasks. To achieve this objective, we performed a systematic search in three databases (PubMed, Ovid MEDLINE, and Scopus) and selected case-control or intervention studies conducted on children with neurotypical development. We analyzed 29 studies that met the inclusion criteria and observed significant heterogeneity among the music interventions and methods for assessing executive functions. The review of the available literature suggests a beneficial effect of music training on core executive function performance, primarily on inhibitory control, and to a lesser extent, on working memory and cognitive flexibility.
Entropy, Probabilistic Harmonic Space, and the Harmony of Antonio Carlos Jobim
Carlos de Lemos Almada, Hugo Carvalho
This paper introduces a theoretical framework derived from a deep and detailed harmonic analysis of songs composed by Antonio Carlos Jobim, focusing on two components, namely, “semantic” (related to the idea of chord type) and “syntactic” (involving binary relations between contiguous chords). The research is mainly focused on investigating the correlations between compositional style (here related to the harmonic construction) and the concepts of probability, expectancy, and, especially, entropy, the latter being defined as a measure of uncertainty or “surprise” of events over time. After a bibliographical review of these topics and their applications to music, a section presents Markov chains, a mathematical tool used to formalize the “semantic-syntactic” harmonic relations statistically inferred from the analyzed corpus of Jobim’s works. This is followed by the formalization of a probabilistic harmonic space and the concept of probabilistic index, directly associated with the entropy of the observed binary relations. This approach opens a new analytical perspective and allows the presented theoretical and methodological framework to be generalized to other repertoires for subsequent comparison, thus offering a new means of investigating the nature of style.
Literature on music, Music
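The link between Markov chains and entropy described in the abstract above can be sketched as follows: transition probabilities between contiguous chords are estimated from counts, and the Shannon entropy of each chord's outgoing distribution measures how uncertain its continuation is. This is a minimal sketch using plain Shannon entropy; the paper's own "probabilistic index" is not defined in the abstract and is not reproduced here. The chord progression and function names are invented for illustration.

```python
import math
from collections import Counter, defaultdict

def transition_probabilities(chords):
    """Estimate first-order transition probabilities from a chord sequence."""
    counts = defaultdict(Counter)
    for current, following in zip(chords, chords[1:]):
        counts[current][following] += 1
    return {
        current: {nxt: c / sum(followers.values())
                  for nxt, c in followers.items()}
        for current, followers in counts.items()
    }

def entropy_bits(distribution):
    """Shannon entropy, in bits, of a probability distribution."""
    return -sum(p * math.log2(p) for p in distribution.values() if p > 0)

# Hypothetical progression, not taken from the analyzed corpus.
progression = ["Dm7", "G7", "Cmaj7", "Dm7", "G7", "Em7", "Dm7", "G7", "Cmaj7"]
probs = transition_probabilities(progression)
entropies = {chord: entropy_bits(dist) for chord, dist in probs.items()}
```

In this toy sequence "Dm7" is always followed by "G7", so its entropy is zero, while "G7" resolves in two different ways and therefore carries more uncertainty.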
Defending a Music Recommender Against Hubness-Based Adversarial Attacks
Katharina Hoedt, Arthur Flexer, Gerhard Widmer
Adversarial attacks can drastically degrade performance of recommenders and other machine learning systems, resulting in an increased demand for defence mechanisms. We present a new line of defence against attacks which exploit a vulnerability of recommenders that operate in high dimensional data spaces (the so-called hubness problem). We use a global data scaling method, namely Mutual Proximity (MP), to defend a real-world music recommender which previously was susceptible to attacks that inflated the number of times a particular song was recommended. We find that using MP as a defence greatly increases robustness of the recommender against a range of attacks, with success rates of attacks around 44% (before defence) dropping to less than 6% (after defence). Additionally, adversarial examples still able to fool the defended system do so at the price of noticeably lower audio quality as shown by a decreased average SNR.
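As a rough sketch of the kind of global scaling Mutual Proximity performs, the code below uses a simplified empirical variant: two items count as close only if few other items are nearer to either of them, which reduces the influence of hubs that are near to everything. The exact formulation and normalization used by the recommender in the paper are not given in the abstract, so the counting rule, the normalization by the number of other items, and the function name are all assumptions.

```python
def mutual_proximity(dist):
    """Rescale a symmetric distance matrix with a simplified empirical MP.

    For each pair (x, y), count the other items j that are farther from both
    x and y than x and y are from each other; the rescaled distance is one
    minus that fraction.
    """
    n = len(dist)
    scaled = [[0.0] * n for _ in range(n)]
    for x in range(n):
        for y in range(x + 1, n):
            others = [j for j in range(n) if j != x and j != y]
            farther = sum(
                1 for j in others
                if dist[x][j] > dist[x][y] and dist[y][j] > dist[y][x]
            )
            score = farther / len(others) if others else 0.0
            scaled[x][y] = scaled[y][x] = 1.0 - score
    return scaled

# Toy data (assumed): four items on a line, with one distant outlier.
points = [0, 1, 2, 10]
dist = [[abs(a - b) for b in points] for a in points]
mp = mutual_proximity(dist)
```

Because the rescaled distance depends on how both items rank each other among all items, an adversarial example that is artificially close to many songs in the original space does not automatically stay close after scaling.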
Music and the heart.
S. Koelsch, L. Jäncke