Results for "Music and books on Music"

Showing 20 of ~888327 results · from arXiv, DOAJ, CrossRef

arXiv Open Access 2025
Diffusion-based Symbolic Music Generation with Structured State Space Models

Shenghua Yuan, Xing Tang, Jiatao Chen et al.

Recent advancements in diffusion models have significantly improved symbolic music generation. However, most approaches rely on transformer-based architectures with self-attention mechanisms, which are constrained by quadratic computational complexity, limiting scalability for long sequences. To address this, we propose Symbolic Music Diffusion with Mamba (SMDIM), a novel diffusion-based architecture integrating Structured State Space Models (SSMs) for efficient global context modeling and the Mamba-FeedForward-Attention Block (MFA) for precise local detail preservation. The MFA Block combines the linear complexity of Mamba layers, the non-linear refinement of FeedForward layers, and the fine-grained precision of self-attention mechanisms, achieving a balance between scalability and musical expressiveness. SMDIM achieves near-linear complexity, making it highly efficient for long-sequence tasks. Evaluated on diverse datasets, including FolkDB, a collection of traditional Chinese folk music that represents an underexplored domain in symbolic music generation, SMDIM outperforms state-of-the-art models in both generation quality and computational efficiency. Beyond symbolic music, SMDIM's architectural design demonstrates adaptability to a broad range of long-sequence generation tasks, offering a scalable and efficient solution for coherent sequence modeling.

en cs.SD
arXiv Open Access 2025
Explicit Tonal Tension Conditioning via Dual-Level Beam Search for Symbolic Music Generation

Maral Ebrahimzadeh, Gilberto Bernardes, Sebastian Stober

State-of-the-art symbolic music generation models have recently achieved remarkable output quality, yet explicit control over compositional features, such as tonal tension, remains challenging. We propose a novel approach that integrates a computational tonal tension model, based on tonal interval vector analysis, into a Transformer framework. Our method employs a two-level beam search strategy during inference. At the token level, generated candidates are re-ranked using model probability and diversity metrics to maintain overall quality. At the bar level, a tension-based re-ranking is applied to ensure that the generated music aligns with a desired tension curve. Objective evaluations indicate that our approach effectively modulates tonal tension, and subjective listening tests confirm that the system produces outputs that align with the target tension. These results demonstrate that explicit tension conditioning through a dual-level beam search provides a powerful and intuitive tool to guide AI-generated music. Furthermore, our experiments demonstrate that our method can generate multiple distinct musical interpretations under the same tension condition.
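The two re-ranking levels described in the abstract above can be illustrated with a minimal Python sketch. This is not the authors' implementation: the function names, the diversity bonus (fraction of distinct tokens), the weight `alpha`, and the pre-computed tension values are all assumptions made for illustration, and the tonal tension model itself is not implemented.

```python
def rerank_tokens(candidates, beam_width=2, alpha=1.0):
    """Token level: rank partial sequences by log-probability plus a diversity bonus.

    candidates: list of (tokens, log_prob). The diversity bonus is the fraction
    of distinct tokens, an illustrative stand-in (assumption).
    """
    def score(cand):
        tokens, log_prob = cand
        diversity = len(set(tokens)) / len(tokens) if tokens else 0.0
        return log_prob + alpha * diversity

    return sorted(candidates, key=score, reverse=True)[:beam_width]


def rerank_bars(candidates, target_tension, keep=1):
    """Bar level: keep the bars whose tension is closest to the target value.

    candidates: list of (bar, log_prob, tension). The tension values would come
    from a tonal tension model, which is not implemented here (assumption).
    Ties in distance are broken in favour of the higher log-probability.
    """
    return sorted(
        candidates, key=lambda c: (abs(c[2] - target_tension), -c[1])
    )[:keep]
```

In a full decoder, `rerank_tokens` would run at every generation step and `rerank_bars` once per completed bar, with `target_tension` read from the desired tension curve at that bar.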

en cs.SD, cs.AI
arXiv Open Access 2025
MUST-RAG: MUSical Text Question Answering with Retrieval Augmented Generation

Daeyong Kwon, SeungHeon Doh, Juhan Nam

Recent advancements in large language models (LLMs) have demonstrated remarkable capabilities across diverse domains. While they exhibit strong zero-shot performance on various tasks, LLMs' effectiveness in music-related applications remains limited due to the relatively small proportion of music-specific knowledge in their training data. To address this limitation, we propose MusT-RAG, a comprehensive framework based on Retrieval Augmented Generation (RAG) to adapt general-purpose LLMs for text-only music question answering (MQA) tasks. RAG is a technique that provides external knowledge to LLMs by retrieving relevant context information when generating answers to questions. To optimize RAG for the music domain, we (1) propose MusWikiDB, a music-specialized vector database for the retrieval stage, and (2) utilize context information during both inference and fine-tuning processes to effectively transform general-purpose LLMs into music-specific models. Our experiment demonstrates that MusT-RAG significantly outperforms traditional fine-tuning approaches in enhancing LLMs' music domain adaptation capabilities, showing consistent improvements across both in-domain and out-of-domain MQA benchmarks. Additionally, our MusWikiDB proves substantially more effective than general Wikipedia corpora, delivering superior performance and computational efficiency.
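The retrieval step that the abstract above attributes to RAG can be sketched in a few lines of Python. This is a toy illustration only: it swaps in bag-of-words cosine similarity for a real vector database, the three passages are a hypothetical corpus (not MusWikiDB), and the names `retrieve` and `build_prompt` are assumptions. The resulting prompt would then be passed to an LLM, which is not shown.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphanumeric word tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def cosine(a, b):
    """Cosine similarity between two bag-of-words Counters."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def retrieve(question, passages, k=2):
    """Return the k passages most similar to the question."""
    q = Counter(tokenize(question))
    ranked = sorted(
        passages, key=lambda p: cosine(q, Counter(tokenize(p))), reverse=True
    )
    return ranked[:k]


def build_prompt(question, passages, k=2):
    """Prepend the retrieved context to the question before calling an LLM."""
    context = "\n".join(f"- {p}" for p in retrieve(question, passages, k))
    return f"Context:\n{context}\n\nQuestion: {question}\nAnswer:"


# Hypothetical toy corpus for illustration (not MusWikiDB).
PASSAGES = [
    "A fugue is a contrapuntal composition built on a subject that is imitated in several voices.",
    "The sitar is a plucked string instrument used in Hindustani classical music.",
    "Bebop is a style of jazz characterized by fast tempos and complex chord progressions.",
]
```

Calling `build_prompt("Which instrument is used in Hindustani classical music?", PASSAGES, k=1)` places the sitar passage in the context block ahead of the question.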

en cs.CL, cs.AI
DOAJ Open Access 2025
Trilha sonora-musical adaptativa: um estudo bibliográfico sobre música para videogames [Adaptive musical soundtracks: a bibliographic study of music for video games]

Marcio Guedes Correa, Gabrielle Delorence Di Santo

This bibliographic study aims to provide an overview of the practices used in adaptive music for video game soundtracks. Adaptive music, unlike traditional static musical soundtracks, adjusts in real time to reflect the events, environments, and player actions in the game, creating a personalized listening experience. The main methods discussed include horizontal re-sequencing, vertical layering, and dynamic mixing, each allowing seamless transitions and variations in tone, rhythm, and harmony. These techniques are complemented by algorithmic processes for adapting musical elements, such as tempo, timbre, and leitmotifs, to various game scenarios. The theoretical approach emphasizes the interaction between diegetic and non-diegetic audio, as well as the integration of user-centered design to ensure intuitive interaction between the music and the game mechanics. The work also addresses the challenges of achieving coherence in non-linear narratives, showing how adaptive strategies contribute to emotional depth and narrative alignment.

Musical instruction and study
DOAJ Open Access 2025
Recommender Systems for Unified Modeling Language and Vice Versa—A Systematic Literature Review

Elaheh Azadi Marand, Amir Sheikhahmadi, Moharram Challenger et al.

Recommender systems (RSs) are fundamental tools that address data redundancy and serve as intelligent supplements for tasks such as data retrieval and refinement by analyzing user behavior. Nowadays, RSs are utilized in various domains, ranging from filtering web news based on user preferences to recommending movies, music, books, and articles in e-commerce. Additionally, these systems are extensively employed to facilitate software engineering activities, including modeling. Modeling environments are enriched with RSs that assist in building models by providing recommendations based on previous solutions to similar problems within the same domain. Consequently, there is growing research interest in approaches that aid the modeling process. This paper presents a systematic literature review (SLR) that analyzes how recommender systems techniques are used to suggest UML diagrams, as well as the role of UML diagrams in describing recommender systems. In addition, it discusses methods for evaluating primary studies, the challenges that primary studies have addressed, and the domains of study that primary studies have targeted (based on an analysis of 4789 papers). We believe this study will guide researchers and professionals in identifying recommender system techniques for generating UML diagram suggestions and understanding the overall purpose of using UML diagrams. Furthermore, it may contribute to a broader understanding of the research process and inspire future research on recommender system techniques within other modeling languages. The results show that 45% of the studies use content-based techniques to suggest UML diagrams, with 77% of the recommendations being structural diagrams (such as class diagrams). On the other hand, to design the components of the proposed approaches (recommender systems), behavioral diagrams are generally used (53% on average), focusing on knowledge-based techniques (28% on average). 
Finally, the study shows that researchers use content-based (38%) and knowledge-based (41%) techniques to recommend design models. The analysis identified the following challenges: 19 studies dealt with the cold start problem, 20 studies with sparsity issues, 11 studies with scalability concerns, 3 studies with diversity challenges, and 12 studies with other types of challenges.

Electrical engineering. Electronics. Nuclear engineering
DOAJ Open Access 2025
An Unknown Letter of Paul Siefert and his Activities until 1611

Marcin Szelest

The article presents a hitherto unknown letter of Paul Siefert, written in Amsterdam in December 1608. Based on its contents and other source documentation, a timeline of Siefert’s activities up to 1611 has been detailed, including the events surrounding the competition for the post of organist of St. Mary’s Church in Gdańsk. The study concludes with hypotheses concerning the composer’s education before 1607.

Literature on music, Music
arXiv Open Access 2024
The IEEE-IS2 2024 Music Packet Loss Concealment Challenge

Alessandro Ilic Mezza, Alberto Bernardini

We present the IEEE-IS2 2024 Music Packet Loss Concealment Challenge. We begin by detailing the challenge rules, followed by an overview of the provided baseline system, the blind test set, and the evaluation methodology used to determine the final ranking. This inaugural edition aimed to foster collaboration between researchers and practitioners from the fields of signal processing, machine learning, and networked music performance, while also laying the groundwork for future advancements in packet loss concealment for music signals.

en eess.AS, cs.SD
arXiv Open Access 2024
Music Style Transfer With Diffusion Model

Hong Huang, Yuyi Wang, Luyao Li et al.

Previous studies on music style transfer have mainly focused on one-to-one style conversion, which is relatively limited. When considering the conversion between multiple styles, previous methods required designing multiple modes to disentangle the complex style of the music, resulting in large computational costs and slow audio generation. The existing music style transfer methods generate spectrograms with artifacts, leading to significant noise in the generated audio. To address these issues, this study proposes a music style transfer framework based on diffusion models (DM) and uses spectrogram-based methods to achieve multi-to-multi music style transfer. The GuideDiff method is used to restore spectrograms to high-fidelity audio, accelerating audio generation speed and reducing noise in the generated audio. Experimental results show that our model has good performance in multi-mode music style transfer compared to the baseline and can generate high-quality audio in real-time on consumer-grade GPUs.

en cs.SD, cs.AI
arXiv Open Access 2024
Exploring Transformer-Based Music Overpainting for Jazz Piano Variations

Eleanor Row, Ivan Shanin, György Fazekas

This paper explores transformer-based models for music overpainting, focusing on jazz piano variations. Music overpainting generates new variations while preserving the melodic and harmonic structure of the input. Existing approaches are limited by small datasets, restricting scalability and diversity. We introduce VAR4000, a subset of a larger dataset for jazz piano performances, consisting of 4,352 training pairs. Using a semi-automatic pipeline, we evaluate two transformer configurations on VAR4000, comparing their performance with the smaller JAZZVAR dataset. Preliminary results show promising improvements in generalisation and performance with the larger dataset configuration, highlighting the potential of transformer models to scale effectively for music overpainting on larger and more diverse datasets.

en cs.SD, cs.LG
arXiv Open Access 2024
MidiTok Visualizer: a tool for visualization and analysis of tokenized MIDI symbolic music

Michał Wiszenko, Kacper Stefański, Piotr Malesa et al.

Symbolic music research plays a crucial role in music-related machine learning, but MIDI data can be complex for those without musical expertise. To address this issue, we present MidiTok Visualizer, a web application designed to facilitate the exploration and visualization of various MIDI tokenization methods from the MidiTok Python package. MidiTok Visualizer offers numerous customizable parameters, enabling users to upload MIDI files to visualize tokenized data alongside an interactive piano roll.

en cs.SD, cs.AI
arXiv Open Access 2023
Leveraging Negative Signals with Self-Attention for Sequential Music Recommendation

Pavan Seshadri, Peter Knees

Music streaming services heavily rely on their recommendation engines to continuously provide content to their consumers. Sequential recommendation consequently has seen considerable attention in current literature, where state-of-the-art approaches focus on self-attentive models leveraging contextual information such as long and short-term user history and item features; however, most of these studies focus on long-form content domains (retail, movie, etc.) rather than short-form, such as music. Additionally, many do not explore incorporating negative session-level feedback during training. In this study, we investigate the use of transformer-based self-attentive architectures to learn implicit session-level information for sequential music recommendation. We additionally propose a contrastive learning task to incorporate negative feedback (e.g., skipped tracks) to promote positive hits and penalize negative hits. This task is formulated as a simple loss term that can be incorporated into a variety of deep learning architectures for sequential recommendation. Our experiments show that this results in consistent performance gains over the baseline architectures ignoring negative user feedback.

en cs.IR, cs.LG
arXiv Open Access 2023
Collaborative Song Dataset (CoSoD): An annotated dataset of multi-artist collaborations in popular music

Michèle Duguay, Kate Mancey, Johanna Devaney

The Collaborative Song Dataset (CoSoD) is a corpus of 331 multi-artist collaborations from the 2010-2019 Billboard "Hot 100" year-end charts. The corpus is annotated with formal sections, aspects of vocal production (including reverberation, layering, panning, and gender of the performers), and relevant metadata. CoSoD complements other popular music datasets by focusing exclusively on musical collaborations between independent acts. In addition to facilitating the study of song form and vocal production, CoSoD allows for the in-depth study of gender as it relates to various timbral, pitch, and formal parameters in musical collaborations. In this paper, we detail the contents of the dataset and outline the annotation process. We also present an experiment using CoSoD that examines how the use of reverberation, layering, and panning is related to the gender of the artist. In this experiment, we find that men's voices are on average treated with less reverberation and occupy a narrower position in the stereo mix than women's voices.

en cs.SD, eess.AS
DOAJ Open Access 2023
Exploring Chinese Cultural Identity in the Liang Zhu Violin Concerto: An Intercultural Perspective on the Adaptation of Traditional Elements in Western Classical Music Language

Maria-Magdalena SUCIU, Stela DRĂGULIN

This article examines the phenomenon of interculturality through the lens of the Liang Zhu Concerto for Violin and Orchestra by He Zhanhao and Chen Gang. Interculturality is no longer merely a means of elevating the axiological value of a given context but has become a necessity for authenticating contemporary discourse. The role of interculturality in shaping the expression of creative intentions is amplified, as it attenuates divergences determined by the incongruity of individuals’ backgrounds by comprehensively observing the uniqueness of foreign elements from a familiarity-based perspective. The Liang Zhu Violin Concerto exemplifies the adaptation of East Asian culture to the context of the Western language and means of expression while preserving its Chinese cultural identity. This concerto has significant value and desirability for consumption due to the proportion of originality and familiarity which it upholds and determines its overall appeal. Ultimately, this article aims to explore how the Liang Zhu Violin Concerto achieves originality at a global level while preserving its Chinese cultural identity.

arXiv Open Access 2021
An Interdisciplinary Review of Music Performance Analysis

Alexander Lerch, Claire Arthur, Ashis Pati et al.

A musical performance renders an acoustic realization of a musical score or other representation of a composition. Different performances of the same composition may vary in terms of performance parameters such as timing or dynamics, and these variations may have a major impact on how a listener perceives the music. The analysis of music performance has traditionally been a peripheral topic for the MIR research community, where often a single audio recording is used as representative of a musical work. This paper surveys the field of Music Performance Analysis (MPA) from several perspectives including the measurement of performance parameters, the relation of those parameters to the actions and intentions of a performer or perceptual effects on a listener, and finally the assessment of musical performance. This paper also discusses MPA as it relates to MIR, pointing out opportunities for collaboration and future research in both areas.

en cs.SD, cs.DL
arXiv Open Access 2020
Quantifying Musical Style: Ranking Symbolic Music based on Similarity to a Style

Jeff Ens, Philippe Pasquier

Modelling human perception of musical similarity is critical for the evaluation of generative music systems, musicological research, and many Music Information Retrieval tasks. Although human similarity judgments are the gold standard, computational analysis is often preferable, since results are often easier to reproduce, and computational methods are much more scalable. Moreover, computation-based approaches can be calculated quickly and on demand, which is a prerequisite for use with an online system. We propose StyleRank, a method to measure the similarity between a MIDI file and an arbitrary musical style delineated by a collection of MIDI files. MIDI files are encoded using a novel set of features and an embedding is learned using Random Forests. Experimental evidence demonstrates that StyleRank is highly correlated with human perception of stylistic similarity, and that it is precise enough to rank generated samples based on their similarity to the style of a corpus. In addition, similarity can be measured with respect to a single feature, allowing specific discrepancies between generated samples and a particular musical style to be identified.

en eess.AS, cs.SD
arXiv Open Access 2020
The Impact of Label Noise on a Music Tagger

Katharina Prinz, Arthur Flexer, Gerhard Widmer

We explore how much can be learned from noisy labels in audio music tagging. Our experiments show that carefully annotated labels result in the highest figures of merit, but even high amounts of noisy labels contain enough information for successful learning. Artificial corruption of curated data allows us to quantify this contribution of noisy labels.

en eess.AS, cs.LG
DOAJ Open Access 2020
Fame, Obscurity and Power Laws in Music History

Andrew Gustar

This paper investigates the processes leading to musical fame or obscurity, whether for composers, performers, or works themselves. It starts from the observation that the patterns of success, across many historical music datasets, follow a similar mathematical relationship known as a power law, often with an exponent approximately equal to two. It presents several simple models which can produce power law distributions. An examination of these models' transience characteristics suggests parallels with some historical music examples, giving clues to the ways that success and obscurity might emerge in practice and the extent to which success might be influenced by inherent musical quality. These models can be seen as manifestations of a more fundamental process resulting from the law of maximum entropy, subject to a constraint on the average value of the logarithm of the success measure. This implies that musical success is a multiplicative quality, and suggests that musical markets operate to strike a balance between familiarity (socio-cultural importance) and novelty (individual importance). The common power law exponent of two is seen to emerge as a consequence of the tendency for musical activity to be spread evenly across the log-success bands.
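The maximum-entropy argument mentioned in the abstract above can be made concrete with a standard textbook derivation. It is a sketch under stated assumptions, not necessarily the paper's own formulation: the symbols $x$ (success measure), $x_{\min}$ (smallest observable success), $\mu$ (the constrained mean of $\ln x$), and $\lambda$ (the power law exponent) are introduced here for illustration.

```latex
% Maximise entropy subject to normalisation and a fixed mean log-success.
\begin{align}
  \text{maximise}\quad
    & H[p] = -\int_{x_{\min}}^{\infty} p(x)\,\ln p(x)\,dx \\
  \text{subject to}\quad
    & \int_{x_{\min}}^{\infty} p(x)\,dx = 1,
      \qquad
      \int_{x_{\min}}^{\infty} p(x)\,\ln x\,dx = \mu .
\end{align}
% Stationarity of the Lagrangian (multipliers \lambda_0 and \lambda) gives a power law.
\begin{align}
  -\ln p(x) - 1 - \lambda_0 - \lambda \ln x = 0
  \quad\Longrightarrow\quad
  p(x) = C\,x^{-\lambda},
  \qquad
  C = (\lambda - 1)\,x_{\min}^{\lambda - 1}
  \quad (\lambda > 1).
\end{align}
% The constraint on the mean of \ln x fixes the exponent.
\begin{align}
  \mu = \ln x_{\min} + \frac{1}{\lambda - 1}
  \quad\Longrightarrow\quad
  \lambda = 1 + \frac{1}{\mu - \ln x_{\min}} .
\end{align}
% For \lambda = 2, total success in a log-band [a, ka] does not depend on a.
\begin{align}
  \int_{a}^{ka} x\,p(x)\,dx
  = \int_{a}^{ka} x_{\min}\,x^{-1}\,dx
  = x_{\min}\,\ln k .
\end{align}
```

Under these assumptions, the exponent equals two exactly when $\mu - \ln x_{\min} = 1$. The last line is one way to read the abstract's closing remark: with exponent two, every band of equal logarithmic width carries the same total amount of success.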

DOAJ Open Access 2020
Parsimonious relations between Guitar Textural Proposals

Bernardo Ramos Pinto, Pauxy Gentil-Nunes

Guitar Textural Analysis is an analytical proposal devised for the composition for the guitar that exposes the relations between the textural structure and the technical procedures used in the instrumental performance. The Guitar Textural Proposals (GTPs) are the specific configurations extracted in this analytical process. In the present work, the focus remains on recognizing parsimonious movements between GTPs, that is, minimal distinctions between patterns of use of the instrument combined with respective textural configurations. Parsimoniousness is defined in the context of Developing Variation, Neo-Riemannian Theory, and Partitional Analysis. Previous works by Fabio Adour and authors Sérgio Freire and Pedro Cambraia present perspectives on the relation between texture and guitar performance. Leo Brouwer’s series of progressive pieces Études Simples is adopted as a reference. The work results in a proposal for an ordered network of technical-textural configurations connected by parsimonious relations.

Music and books on Music, Music
