Results for "Music and books on Music"

Showing 20 of ~888,320 results · from CrossRef, DOAJ, arXiv

arXiv Open Access 2026
Probabilistic Multilabel Graphical Modelling of Motif Transformations in Symbolic Music

Ron Taieb, Yoel Greenberg, Barak Sober

Motifs often recur in musical works in altered forms, preserving aspects of their identity while undergoing local variation. This paper investigates how such motivic transformations occur within their musical context in symbolic music. To support this analysis, we develop a probabilistic framework for modeling motivic transformations and apply it to Beethoven's piano sonatas by integrating multiple datasets that provide melodic, rhythmic, harmonic, and motivic information within a unified analytical representation. Motif transformations are represented as multilabel variables by comparing each motif instance to a designated reference occurrence within its local context, ensuring consistent labeling across transformation families. We introduce a multilabel Conditional Random Field to model how motif-level musical features influence the occurrence of transformations and how different transformation families tend to co-occur. Our goal is to provide an interpretable, distributional analysis of motivic transformation patterns, enabling the study of their structural relationships and stylistic variation. By linking computational modeling with music-theoretical interpretation, the proposed framework supports quantitative investigation of musical structure and complexity in symbolic corpora and may facilitate the analysis of broader compositional patterns and writing practices.
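
The multilabel encoding, comparing each motif instance to a reference occurrence and flagging which transformation families apply, might be sketched as follows; the (pitch, duration) motif representation and the three families shown are illustrative assumptions, not the paper's actual label set:

```python
def transformation_labels(reference, instance):
    """Multilabel sketch: compare a motif instance to its reference
    occurrence and flag which transformation families apply.
    Motifs are lists of (MIDI pitch, duration) pairs; the families
    below (transposition, inversion, rhythmic change) are illustrative."""
    labels = set()
    ref_p = [p for p, _ in reference]
    ins_p = [p for p, _ in instance]
    # Interval sequences are invariant under transposition.
    ref_iv = [b - a for a, b in zip(ref_p, ref_p[1:])]
    ins_iv = [b - a for a, b in zip(ins_p, ins_p[1:])]
    if ins_iv == ref_iv and ins_p != ref_p:
        labels.add("transposition")
    if ins_iv == [-i for i in ref_iv]:
        labels.add("inversion")
    if [d for _, d in reference] != [d for _, d in instance]:
        labels.add("rhythmic_change")
    return labels

# C-D-E motif transposed up a whole tone, same rhythm
ref = [(60, 1.0), (62, 1.0), (64, 1.0)]
inst = [(62, 1.0), (64, 1.0), (66, 1.0)]
labels = transformation_labels(ref, inst)
```

In the paper, vectors like these become the multilabel response variables of the Conditional Random Field, which then models their co-occurrence.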

en cs.SD, stat.ME
DOAJ Open Access 2025
The Complete Songs of Robert Tannahill: A Timely Appreciation

Jane Pettegree

A remarkable project to record the Complete Songs of Robert Tannahill, the Paisley weaver-poet, has concluded with the 2024 release of a fifth and final disc, a fitting tribute marking the 250th anniversary of Tannahill’s birth. This review article discusses why Tannahill is an important and distinctive voice in the Scottish traditional song repertoire, and assesses the achievements of the recording project.

Other beliefs and movements, Music
arXiv Open Access 2025
Story2MIDI: Emotionally Aligned Music Generation from Text

Mohammad Shokri, Alexandra C. Salem, Gabriel Levine et al.

In this paper, we introduce Story2MIDI, a sequence-to-sequence Transformer-based model for generating emotion-aligned music from a given piece of text. To develop this model, we construct the Story2MIDI dataset by merging existing datasets for sentiment analysis from text and emotion classification in music. The resulting dataset contains pairs of text blurbs and music pieces that evoke the same emotions in the reader or listener. Despite the small scale of our dataset and limited computational resources, our results indicate that our model effectively learns emotion-relevant features in music and incorporates them into its generation process, producing samples with diverse emotional responses. We evaluate the generated outputs using objective musical metrics and a human listening study, confirming the model's ability to capture intended emotional cues.

en cs.SD, cs.AI
arXiv Open Access 2025
TalkPlay-Tools: Conversational Music Recommendation with LLM Tool Calling

Seungheon Doh, Keunwoo Choi, Juhan Nam

While the recent developments in large language models (LLMs) have successfully enabled generative recommenders with natural language interactions, their recommendation behavior is limited, leaving other simpler yet crucial components such as metadata or attribute filtering underutilized in the system. We propose an LLM-based music recommendation system with tool calling to serve as a unified retrieval-reranking pipeline. Our system positions an LLM as an end-to-end recommendation system that interprets user intent, plans tool invocations, and orchestrates specialized components: boolean filters (SQL), sparse retrieval (BM25), dense retrieval (embedding similarity), and generative retrieval (semantic IDs). Through tool planning, the system predicts which types of tools to use, their execution order, and the arguments needed to find music matching user preferences, supporting diverse modalities while seamlessly integrating multiple database filtering methods. We demonstrate that this unified tool-calling framework achieves competitive performance across diverse recommendation scenarios by selectively employing appropriate retrieval methods based on user queries, envisioning a new paradigm for conversational music recommendation systems.
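
The tool-planning step can be pictured as mapping a parsed user intent to an ordered list of tool calls; the routing rules below are illustrative stand-ins for the LLM planner, with tool names mirroring the components listed in the abstract:

```python
def plan_tools(query):
    """Toy tool-planning sketch: map a parsed user intent (a dict)
    to an ordered list of (tool, arguments) calls. In the described
    system an LLM predicts this plan; the hand-written rules here
    only illustrate the pipeline shape."""
    plan = []
    if "year" in query:                       # boolean/metadata filter
        plan.append(("sql_filter", {"year": query["year"]}))
    if "keywords" in query:                   # sparse retrieval
        plan.append(("bm25", {"terms": query["keywords"]}))
    if "similar_to" in query:                 # dense retrieval
        plan.append(("dense_retrieval", {"seed_track": query["similar_to"]}))
    plan.append(("rerank", {}))               # final reranking stage
    return plan

plan = plan_tools({"year": 1997, "keywords": ["lo-fi", "jazz"]})
```

A real planner would also choose tool order and fuse results; the fixed filter-retrieve-rerank order here is the simplest unified pipeline.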

en cs.IR, cs.MM
arXiv Open Access 2025
CultureMERT: Continual Pre-Training for Cross-Cultural Music Representation Learning

Angelos-Nikolaos Kanatas, Charilaos Papaioannou, Alexandros Potamianos

Recent advances in music foundation models have improved audio representation learning, yet their effectiveness across diverse musical traditions remains limited. We introduce CultureMERT-95M, a multi-culturally adapted foundation model developed to enhance cross-cultural music representation learning and understanding. To achieve this, we propose a two-stage continual pre-training strategy that integrates learning rate re-warming and re-decaying, enabling stable adaptation even with limited computational resources. Training on a 650-hour multi-cultural data mix, comprising Greek, Turkish, and Indian music traditions, results in an average improvement of 4.9% in ROC-AUC and AP across diverse non-Western music auto-tagging tasks, surpassing prior state-of-the-art, with minimal forgetting on Western-centric benchmarks. We further investigate task arithmetic, an alternative approach to multi-cultural adaptation that merges single-culture adapted models in the weight space. Task arithmetic performs on par with our multi-culturally trained model on non-Western auto-tagging tasks and shows no regression on Western datasets. Cross-cultural evaluation reveals that single-culture models transfer with varying effectiveness across musical traditions, whereas the multi-culturally adapted model achieves the best overall performance. To support research on world music representation learning, we publicly release CultureMERT-95M and CultureMERT-TA-95M, fostering the development of more culturally aware music foundation models.
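
Task arithmetic, merging single-culture adapted models in weight space, can be sketched as adding the scaled sum of task vectors (each adapted model's difference from the base) back onto the base weights; the toy two-parameter models below are purely illustrative:

```python
def task_arithmetic_merge(base, adapted_models, scale=1.0):
    """Merge adapted models in weight space. Models are dicts mapping
    parameter names to lists of floats; a task vector is an adapted
    model's elementwise difference from the base model."""
    merged = {}
    for name, base_w in base.items():
        merged[name] = [
            bw + scale * sum(m[name][i] - bw for m in adapted_models)
            for i, bw in enumerate(base_w)
        ]
    return merged

# Hypothetical single-culture adaptations of a 2-parameter "layer"
base = {"w": [1.0, 1.0]}
greek = {"w": [1.5, 1.0]}
turkish = {"w": [1.0, 0.5]}
merged = task_arithmetic_merge(base, [greek, turkish])
```

Each culture's adaptation survives in the merged weights without any joint retraining, which is why the abstract can compare this directly against the multi-culturally trained model.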

en cs.SD, cs.AI
arXiv Open Access 2025
ReMi: A Random Recurrent Neural Network Approach to Music Production

Hugo Chateau-Laurent, Tara Vanhatalo, Wei-Tung Pan et al.

Generative artificial intelligence raises concerns related to energy consumption, copyright infringement and creative atrophy. We show that randomly initialized recurrent neural networks can produce arpeggios and low-frequency oscillations that are rich and configurable. In contrast to end-to-end music generation that aims to replace musicians, our approach expands their creativity while requiring no data and much less computational power. More information can be found at: https://allendia.com/

en cs.SD, cs.AI
arXiv Open Access 2025
Evaluating Multimodal Large Language Models on Core Music Perception Tasks

Brandon James Carone, Iran R. Roman, Pablo Ripollés

Multimodal Large Language Models (LLMs) claim "musical understanding" via evaluations that conflate listening with score reading. We benchmark three SOTA LLMs (Gemini 2.5 Pro, Gemini 2.5 Flash, and Qwen2.5-Omni) across three core music skills: Syncopation Scoring, Transposition Detection, and Chord Quality Identification. Moreover, we separate three sources of variability: (i) perceptual limitations (audio vs. MIDI inputs), (ii) exposure to examples (zero- vs. few-shot manipulations), and (iii) reasoning strategies (Standalone, CoT, LogicLM). For the latter we adapt LogicLM, a framework combining LLMs with symbolic solvers to perform structured reasoning, to music. Results reveal a clear perceptual gap: models perform near ceiling on MIDI but show accuracy drops on audio. Reasoning and few-shot prompting offer minimal gains. This is expected for MIDI, where performance reaches saturation, but more surprising for audio, where LogicLM, despite near-perfect MIDI accuracy, remains notably brittle. Among models, Gemini Pro achieves the highest performance across most conditions. Overall, current systems reason well over symbols (MIDI) but do not yet "listen" reliably from audio. Our method and dataset make the perception-reasoning boundary explicit and offer actionable guidance for building robust, audio-first music systems.

en cs.SD, cs.AI
arXiv Open Access 2025
Sound and Music Biases in Deep Music Transcription Models: A Systematic Analysis

Lukáš Samuel Marták, Patricia Hu, Gerhard Widmer

Automatic Music Transcription (AMT) -- the task of converting music audio into note representations -- has seen rapid progress, driven largely by deep learning systems. Due to the limited availability of richly annotated music datasets, much of the progress in AMT has been concentrated on classical piano music, and even a few very specific datasets. Whether these systems can generalize effectively to other musical contexts remains an open question. Complementing recent studies on distribution shifts in sound (e.g., recording conditions), in this work we investigate the musical dimension -- specifically, variations in genre, dynamics, and polyphony levels. To this end, we introduce the MDS corpus, comprising three distinct subsets -- (1) Genre, (2) Random, and (3) MAEtest -- to emulate different axes of distribution shift. We evaluate the performance of several state-of-the-art AMT systems on the MDS corpus using both traditional information-retrieval and musically-informed performance metrics. Our extensive evaluation isolates and exposes varying degrees of performance degradation under specific distribution shifts. In particular, we measure a note-level F1 performance drop of 20 percentage points due to sound, and 14 due to genre. Generally, we find that dynamics estimation proves more vulnerable to musical variation than onset prediction. Musically informed evaluation metrics, particularly those capturing harmonic structure, help identify potential contributing factors. Furthermore, experiments with randomly generated, non-musical sequences reveal clear limitations in system performance under extreme musical distribution shifts. Altogether, these findings offer new evidence of the persistent impact of the Corpus Bias problem in deep AMT systems.
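
The note-level F1 cited above is the standard information-retrieval metric that the musically informed metrics are contrasted with. A minimal sketch, assuming exact (pitch, onset) matching (real evaluations typically allow an onset tolerance of around 50 ms):

```python
def note_f1(reference, predicted):
    """Note-level F1 sketch: a predicted note matches if its
    (pitch, onset) pair appears exactly in the reference."""
    ref, pred = set(reference), set(predicted)
    tp = len(ref & pred)          # true positives
    if tp == 0:
        return 0.0
    precision = tp / len(pred)
    recall = tp / len(ref)
    return 2 * precision * recall / (precision + recall)

# One of two predicted notes is correct -> precision = recall = 0.5
f1 = note_f1([(60, 0.0), (62, 1.0)], [(60, 0.0), (64, 1.0)])
```

A 20-percentage-point drop in this score, as measured under the sound shift, is large precisely because the metric is so coarse; it says nothing about which musical dimension (dynamics, harmony, rhythm) degraded.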

en cs.SD, cs.LG
DOAJ Open Access 2024
Aleksander Tansman's Correspondence with Krzysztof Biegański in Light of the Reception of His Music in Interwar Poland

Zofia Helman

Aleksander Tansman (1897–1986), a composer from a Jewish family, born and educated in Poland, went abroad at the end of 1919 after winning a prize and two honorable mentions at the first composition competition in post-war Poland. He settled in Paris, where he quickly entered the local musical milieu and gradually gained popularity. In the 1920s and 1930s he made two concert tours of the United States, and in 1932 a tour across four continents, during which he appeared as composer, conductor, and pianist. In contrast to these successes, in his native country his works were rarely performed, and conservatively minded Polish critics commented on his output most unfavourably. He visited Poland twice, in 1932 and 1936. Political changes in Poland in the 1930s and rising antisemitic sentiment, however, led Tansman to take French citizenship in 1938. During the Second World War and in the first post-war decade his contacts with the Polish musical community were severed. The authorities of the Polish People's Republic treated émigrés as enemies of the regime who did not belong to Polish culture. It was therefore only in 1958 that the conductor Stanisław Wisłocki (1921–1998) and the musicologist Krzysztof Biegański (1936–1967) established contact with Tansman; Biegański published several articles about him and contributed greatly to deepening knowledge of his works in Poland and to the recognition of his importance in the development of new music. Tansman's correspondence with Biegański from 1959–1961 forms the second part of the article.

Literature on music, Music
arXiv Open Access 2024
Controlling Surprisal in Music Generation via Information Content Curve Matching

Mathias Rose Bjare, Stefan Lattner, Gerhard Widmer

In recent years, the quality and public interest in music generation systems have grown, encouraging research into various ways to control these systems. We propose a novel method for controlling surprisal in music generation using sequence models. To achieve this goal, we define a metric called Instantaneous Information Content (IIC). The IIC serves as a proxy function for the perceived musical surprisal (as estimated from a probabilistic model) and can be calculated at any point within a music piece. This enables the comparison of surprisal across different musical content even if the musical events occur in irregular time intervals. We use beam search to generate musical material whose IIC curve closely approximates a given target IIC. We experimentally show that the IIC correlates with harmonic and rhythmic complexity and note density. The correlation decreases with the length of the musical context used for estimating the IIC. Finally, we conduct a qualitative user study to test if human listeners can identify the IIC curves that have been used as targets when generating the respective musical material. We provide code for creating IIC interpolations and IIC visualizations on https://github.com/muthissar/iic.
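
The IIC idea can be illustrated with a toy sketch; here the IIC is simplified to the negative log-probability (in bits) a sequence model assigns to each event, ignoring the paper's handling of irregular event timing, and the beam-search scoring is reduced to a mean absolute deviation from the target curve:

```python
import math

def iic(probs):
    """Simplified information-content curve: the negative base-2
    log-probability a sequence model assigns to each observed event.
    (The paper's IIC additionally handles irregular event timing.)"""
    return [-math.log2(p) for p in probs]

def score_candidate(candidate_probs, target_iic):
    """Beam-search scoring sketch: mean absolute deviation of a
    candidate continuation's IIC curve from a target IIC curve.
    Lower is better; beam search keeps the lowest-scoring candidates."""
    curve = iic(candidate_probs)
    return sum(abs(c - t) for c, t in zip(curve, target_iic)) / len(target_iic)

# A predictable event (p = 0.5 -> 1 bit) vs. a surprising one (p = 0.125 -> 3 bits)
curve = iic([0.5, 0.125])
```

Because the score depends only on the model's probabilities, surprisal can be compared across passages with very different musical surfaces, which is the property the abstract emphasizes.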

en cs.SD, cs.AI
arXiv Open Access 2024
Semi-Supervised Contrastive Learning of Musical Representations

Julien Guinot, Elio Quinton, György Fazekas

Despite the success of contrastive learning in Music Information Retrieval, the inherent ambiguity of contrastive self-supervision presents a challenge. Relying solely on augmentation chains and self-supervised positive sampling strategies can lead to a pretraining objective that does not capture key musical information for downstream tasks. We introduce semi-supervised contrastive learning (SemiSupCon), a simple method for leveraging musically informed labeled data (supervision signals) in the contrastive learning of musical representations. Our approach introduces musically relevant supervision signals into self-supervised contrastive learning by combining supervised and self-supervised contrastive objectives in a simpler framework than previous approaches. This framework improves performance and robustness to audio corruptions on a range of downstream MIR tasks with moderate amounts of labeled data. Our approach enables shaping the learned similarity metric through the choice of labeled data, which (1) infuses the representations with musical domain knowledge and (2) improves out-of-domain performance with minimal general downstream performance loss. We show strong transfer learning performance on musically related yet not trivially similar tasks, such as pitch and key estimation. Additionally, our approach improves automatic tagging over self-supervised approaches with only 5% of available labels included in pretraining.
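
The core of the positive-sampling idea can be sketched in a few lines: self-supervised positives come from augmentation pairs, supervised positives from shared labels, and both feed a single contrastive objective. The tag labels and pair encoding below are illustrative assumptions:

```python
def semi_supervised_positives(aug_pairs, labels):
    """Positive-sampling sketch for semi-supervised contrastive
    learning: self-supervised positives are augmentation pairs of the
    same clip; supervised positives are index pairs sharing a label.
    Unlabeled examples (label None) contribute only augmentation pairs.
    Pairs are (i, j) index tuples with i < j."""
    positives = set(aug_pairs)
    for i, li in enumerate(labels):
        for j, lj in enumerate(labels):
            if i < j and li is not None and li == lj:
                positives.add((i, j))
    return positives

# Clips 0 and 1 are augmentations of one recording; clips 0 and 2
# share the (hypothetical) tag "rock"; clip 1 is unlabeled.
pos = semi_supervised_positives({(0, 1)}, ["rock", None, "rock", "jazz"])
```

The choice of labeled data thus directly shapes which examples are pulled together in the embedding space, which is how the abstract's "shaping the learned similarity metric" works.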

en eess.AS
arXiv Open Access 2024
Towards Musically Informed Evaluation of Piano Transcription Models

Patricia Hu, Lukáš Samuel Marták, Carlos Cancino-Chacón et al.

Automatic piano transcription models are typically evaluated using simple frame- or note-wise information retrieval (IR) metrics. Such benchmark metrics do not provide insights into the transcription quality of specific musical aspects such as articulation, dynamics, or rhythmic precision of the output, which are essential in the context of expressive performance analysis. Furthermore, in recent years, MAESTRO has become the de-facto training and evaluation dataset for such models. However, inference performance has been observed to deteriorate substantially when applied on out-of-distribution data, thereby questioning the suitability and reliability of transcribed outputs from such models for specific MIR tasks. In this work, we investigate the performance of three state-of-the-art piano transcription models in two experiments. In the first one, we propose a variety of musically informed evaluation metrics which, in contrast to the IR metrics, offer more detailed insight into the musical quality of the transcriptions. In the second experiment, we compare inference performance on real-world and perturbed audio recordings, and highlight musical dimensions which our metrics can help explain. Our experimental results highlight the weaknesses of existing piano transcription metrics and contribute to a more musically sound error analysis of transcription outputs.

en cs.SD, eess.AS
DOAJ Open Access 2023
Edino Krieger

Ermelinda Paz

This article aims to update the biography of Edino Krieger, critic, composer, and music producer, published in two volumes in 2012 by the SESC Nacional imprint. In 2014, the composer Ronaldo Miranda, in his critical review of that work for the Revista da Academia Brasileira de Música, described it as "a singular biography for a plural composer." Since then, we have set out to change its status from outdated to updated biography, while recognizing that the breadth of the composer's achievements, together with the scale and natural reach of his accomplishments, would give rise to new facts, revisions, and additions. The article seeks to update the biography not only with respect to new works, but also to survey the academic community's responses through new recordings, articles, master's theses, and doctoral dissertations. Vol. II of the biography, on pages 58–59, mentions the Bem-me-quer Paquetá project. Finally, through a detailed analysis of that project, the article intends to bring to the fore Edino Krieger the music educator, whose importance seems to have been relegated by the composer himself to a lesser plane.

Literature on music, Music
arXiv Open Access 2023
IteraTTA: An interface for exploring both text prompts and audio priors in generating music with text-to-audio models

Hiromu Yakura, Masataka Goto

Recent text-to-audio generation techniques have the potential to allow novice users to freely generate music audio. Even without musical knowledge, such as of chord progressions and instruments, users can try various text prompts to generate audio. However, compared to the image domain, gaining a clear understanding of the space of possible music audio is difficult because users cannot listen to variations of the generated audio simultaneously. We therefore help users explore not only text prompts but also audio priors that constrain the text-to-audio music generation process. This dual-sided exploration lets users discern the impact of different text prompts and audio priors on the generation results by iteratively comparing them. Our interface, IteraTTA, is specifically designed to aid users in refining text prompts and selecting favorable audio priors from the generated audio. With it, users can progressively reach their loosely specified goals while understanding and exploring the space of possible results. Our implementation and discussions highlight design considerations specific to text-to-audio models and how interaction techniques can contribute to their effectiveness.

en eess.AS, cs.AI
arXiv Open Access 2023
Modeling Bends in Popular Music Guitar Tablatures

Alexandre D'Hooge, Louis Bigo, Ken Déguernel

Tablature notation is widely used in popular music to transcribe and share guitar musical content. As a complement to standard score notation, tablatures transcribe performance gesture information including finger positions and a variety of guitar-specific playing techniques such as slides, hammer-ons/pull-offs, or bends. This paper focuses on bends, which make it possible to progressively shift the pitch of a note, thereby circumventing the physical limitations of the discrete fretted fingerboard. We propose a set of 25 high-level features, computed for each note of the tablature, to study how bend occurrences can be predicted from their past and future short-term context. Experiments are performed on a corpus of 932 lead guitar tablatures of popular music and show that a decision tree successfully predicts bend occurrences with an F1 score of 0.71 and a limited amount of false positive predictions, demonstrating promising applications to assist the arrangement of non-guitar music into guitar tablatures.

en cs.SD, cs.AI
DOAJ Open Access 2022
Review of the Music and Technology Record Releases Issued by the Laboratorio de Música Electroacústica y Arte Sonoro of the Universidad Nacional de Música

Isaac Grados García

Over the course of 2022, the Universidad Nacional de Música issued two phonographic releases as part of the activities carried out by the institution's Office of the Vice-President for Research, through the Directorate of Innovation and Technology Transfer and the Laboratory of Electroacoustic Music and Sound Art. These recordings feature students of the university and offer a view of a new generation of composers and their works.

Music, Musical instruction and study

Page 10 of 44,416