Results for "Musical instruction and study"

arXiv Open Access 2026

Efficient Long-Sequence Diffusion Modeling for Symbolic Music Generation

Jinhan Xu, Xing Tang, Houpeng Yang et al.

Symbolic music generation is a challenging task in multimedia generation, involving long sequences with hierarchical temporal structures, long-range dependencies, and fine-grained local details. Though recent diffusion-based models produce high-quality generations, they tend to suffer from high training and inference costs on long symbolic sequences due to iterative denoising and sequence-length-related costs. To address this problem, we propose a diffusion strategy named SMDIM that combines efficient global structure construction with light local refinement. SMDIM uses structured state space models to capture long-range musical context at near-linear cost, and selectively refines local musical details via a hybrid refinement scheme. Experiments on a wide range of symbolic music datasets, encompassing Western classical music, popular music, and traditional folk music, show that SMDIM outperforms other state-of-the-art approaches in both generation quality and computational efficiency, and that it generalizes robustly to underexplored musical styles. These results show that SMDIM offers a principled solution for long-sequence symbolic music generation, including associated attributes that accompany the sequences. We provide a project webpage with audio examples and supplementary materials at https://3328702107.github.io/smdim-music/.

en cs.SD, cs.AI

arXiv Open Access 2026

Musical Training, but not Mere Exposure to Music, Drives the Emergence of Chroma Equivalence in Artificial Neural Networks

Lukas Grasse, Matthew S. Tata

Pitch is a fundamental aspect of auditory perception. Pitch perception is commonly described across two perceptual dimensions: pitch height is the sense that tones with varying frequencies seem to be higher or lower, and chroma equivalence is the cyclical similarity of notes across octaves, corresponding to a doubling of fundamental frequency. Existing research is divided on whether chroma equivalence is a learned percept that varies according to musical experience and culture, or is an innate percept that develops automatically. Building on a recent framework that proposes to use ANNs to ask 'why' questions about the brain, we evaluated recent auditory ANNs using representational similarity analysis to test the emergence of pitch height and chroma equivalence in their learned representations. Additionally, we fine-tuned two models, Wav2Vec 2.0 and Data2Vec, on a self-supervised learning task using speech and music, and a supervised music transcription task. We found that all models exhibited varying degrees of pitch height representation, but that only models trained on the supervised music transcription task exhibited chroma equivalence. Mere exposure to music through self-supervised learning was not sufficient for chroma equivalence to emerge. This supports the view that chroma equivalence is a higher-order cognitive computation that emerges to support the specific task of music perception, distinct from other auditory perception such as speech listening. This work also highlights the usefulness of ANNs for probing the developmental conditions that give rise to perceptual representations in humans.

en cs.SD, cs.NE
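
The two pitch dimensions named in the abstract above can be illustrated with a minimal sketch. It assumes twelve-tone equal temperament and an A4 = 440 Hz reference, neither of which is stated in the abstract, and the function names are hypothetical. It does not reproduce the authors' representational similarity analysis; it only shows that doubling a frequency raises pitch height by twelve semitones while leaving chroma unchanged.

```python
import math

A4_HZ = 440.0  # assumed reference tuning; not specified in the abstract


def pitch_height(freq_hz: float) -> float:
    """Pitch height in semitones relative to A4 (a log-frequency scale)."""
    return 12.0 * math.log2(freq_hz / A4_HZ)


def chroma(freq_hz: float) -> int:
    """Pitch class 0-11 relative to A.

    Rounding to the nearest semitone and wrapping modulo 12 discards the
    octave, so frequencies related by a factor of two share a chroma.
    """
    return round(pitch_height(freq_hz)) % 12


if __name__ == "__main__":
    # Three A notes, each one octave (a frequency doubling) apart.
    for f in (220.0, 440.0, 880.0):
        print(f, pitch_height(f), chroma(f))
```

A model that encodes only pitch height would place 220 Hz and 880 Hz far apart; one that also exhibits chroma equivalence would treat them as similar.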

CrossRef Open Access 2026

Correction to: A Study of the Replication and Restoration of Chinese Musical Relics and Artefacts

Zichu Wang

en

CrossRef Open Access 2026

Feasibility Study of Instruction-Level Pipelining within the Ethereum Virtual Machine Architecture

Gopal Ojha

The Ethereum Virtual Machine (EVM) is a stack-based virtual processor that executes smart contract bytecode sequentially. While this design ensures determinism and correctness, it inherently limits instruction throughput. This paper presents a feasibility study of instruction-level pipelining within the EVM interpreter architecture. By analyzing the internal execution flow of the EVM as implemented in the Go-Ethereum (geth) client, the study identifies the program counter dependency, particularly under jump instructions, as the principal control hazard preventing naïve pipelining. A two-stage pipelined execution model is proposed, separating opcode fetch and decode from execution and program counter update, with a feedback mechanism to preserve EVM semantics. The work focuses on architectural feasibility rather than performance evaluation and optimization, demonstrating that pipelining inside the EVM interpreter is conceptually possible under controlled synchronization. Limitations, design challenges, and future research directions are discussed.

en
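
The two-stage model described in the abstract above can be sketched roughly as follows. This is a toy illustration, not the paper's design and not geth code: the four-opcode instruction set, the tuple encoding, and all function names are hypothetical. As in the EVM, JUMP takes its destination from the stack, so the target is only known at execution time. The sketch fetches the next instruction speculatively while the previous one executes, and discards the prefetch when a jump redirects the program counter.

```python
from typing import List, Optional, Tuple

Instr = Tuple[str, Optional[int]]  # (opcode name, immediate); hypothetical encoding


def execute(stack: List[int], instr: Instr) -> Tuple[bool, Optional[int]]:
    """Execute one instruction; return (halt, jump_target or None)."""
    op, arg = instr
    if op == "PUSH":
        stack.append(arg)
    elif op == "ADD":
        b, a = stack.pop(), stack.pop()
        stack.append(a + b)
    elif op == "JUMP":
        return False, stack.pop()  # destination is only known here
    elif op == "STOP":
        return True, None
    else:
        raise ValueError(f"unknown opcode {op}")
    return False, None


def run_sequential(program: List[Instr]) -> List[int]:
    """Reference interpreter: one instruction at a time."""
    stack: List[int] = []
    pc = 0
    while pc < len(program):
        halt, target = execute(stack, program[pc])
        if halt:
            break
        pc = target if target is not None else pc + 1
    return stack


def run_pipelined(program: List[Instr]) -> Tuple[List[int], int]:
    """Two-stage model; returns (final stack, number of flushed prefetches)."""
    stack: List[int] = []
    fetch_pc = 0
    latch: Optional[Instr] = None  # register between stage 1 and stage 2
    flushes = 0
    while True:
        # Stage 1: fetch and decode at the speculative program counter.
        fetched = program[fetch_pc] if fetch_pc < len(program) else None
        # Stage 2: execute the instruction fetched in the previous cycle.
        halt, target = (False, None) if latch is None else execute(stack, latch)
        if halt:
            return stack, flushes
        if target is not None:
            # Feedback path: the prefetch was on the wrong path, so drop it.
            if fetched is not None:
                flushes += 1
            fetch_pc, latch = target, None
        else:
            latch = fetched
            if fetched is not None:
                fetch_pc += 1
        if latch is None and fetch_pc >= len(program):
            return stack, flushes  # ran off the end of the program


if __name__ == "__main__":
    program = [("PUSH", 2), ("PUSH", 3), ("PUSH", 6), ("JUMP", None),
               ("PUSH", 100), ("ADD", None), ("ADD", None), ("STOP", None)]
    print(run_sequential(program), run_pipelined(program))
```

The pipelined run yields the same stack as the sequential one, at the cost of one discarded prefetch for the taken jump, which is the control hazard the abstract identifies.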

arXiv Open Access 2025

CompLex: Music Theory Lexicon Constructed by Autonomous Agents for Automatic Music Generation

Zhejing Hu, Yan Liu, Gong Chen et al.

Generative artificial intelligence in music has made significant strides, yet it still falls short of the substantial achievements seen in natural language processing, primarily due to the limited availability of music data. Knowledge-informed approaches have been shown to enhance the performance of music generation models, even when only a few pieces of musical knowledge are integrated. This paper seeks to leverage comprehensive music theory in AI-driven music generation tasks, such as algorithmic composition and style transfer, which traditionally require significant manual effort with existing techniques. We introduce a novel automatic music lexicon construction model that generates a lexicon, named CompLex, comprising 37,432 items derived from just 9 manually input category keywords and 5 sentence prompt templates. A new multi-agent algorithm is proposed to automatically detect and mitigate hallucinations. CompLex demonstrates impressive performance improvements across three state-of-the-art text-to-music generation models, encompassing both symbolic and audio-based methods. Furthermore, we evaluate CompLex in terms of completeness, accuracy, non-redundancy, and executability, confirming that it possesses the key characteristics of an effective lexicon.

en cs.SD, cs.AI

arXiv Open Access 2024

Polyrhythmic Harmonies from the Sky: Transforming Satellite Images of Clouds into Musical Compositions through Algorithms

Carlos Darío Badilla Cerdas

In a context of increasing scientific specialization and deficiencies in the scientific literacy of the population, there arises a need to broaden the methods of scientific dissemination. This study proposes an approach that combines music with scientific concepts, focusing on the sonification of satellite images as the core. A generative musical composition system is developed that uses visual data to create accessible and emotional auditory experiences, thus enriching the fields of scientific dissemination and artistic expression. It concludes with an example of the algorithm's use in a musical composition.

en physics.soc-ph

arXiv Open Access 2024

RAG-Instruct: Boosting LLMs with Diverse Retrieval-Augmented Instructions

Wanlong Liu, Junying Chen, Ke Ji et al.

Retrieval-Augmented Generation (RAG) has emerged as a key paradigm for enhancing large language models (LLMs) by incorporating external knowledge. However, current RAG methods face two limitations: (1) they cover only limited RAG scenarios, and (2) they suffer from limited task diversity due to the lack of a general RAG dataset. To address these limitations, we propose RAG-Instruct, a general method for synthesizing diverse and high-quality RAG instruction data based on any source corpus. Our approach leverages (1) five RAG paradigms, which encompass diverse query-document relationships, and (2) instruction simulation, which enhances instruction diversity and quality by utilizing the strengths of existing instruction datasets. Using this method, we construct a 40K instruction dataset from Wikipedia, comprehensively covering diverse RAG scenarios and tasks. Experiments demonstrate that RAG-Instruct effectively enhances LLMs' RAG capabilities, achieving strong zero-shot performance and significantly outperforming various RAG baselines across a diverse set of tasks. RAG-Instruct is publicly available at https://github.com/FreedomIntelligence/RAG-Instruct.

en cs.CL, cs.AI

arXiv Open Access 2024

MIRFLEX: Music Information Retrieval Feature Library for Extraction

Anuradha Chopra, Abhinaba Roy, Dorien Herremans

This paper introduces an extendable modular system that compiles a range of music feature extraction models to aid music information retrieval research. The features include musical elements like key, downbeats, and genre, as well as audio characteristics like instrument recognition, vocals/instrumental classification, and vocals gender detection. The integrated models are state-of-the-art or latest open-source. The features can be extracted as latent or post-processed labels, enabling integration into music applications such as generative music, recommendation, and playlist generation. The modular design allows easy integration of newly developed systems, making it a good benchmarking and comparison tool. This versatile toolkit supports the research community in developing innovative solutions by providing concrete musical features.

en cs.SD, cs.AI

arXiv Open Access 2024

Audio Prompt Adapter: Unleashing Music Editing Abilities for Text-to-Music with Lightweight Finetuning

Fang-Duo Tsai, Shih-Lun Wu, Haven Kim et al.

Text-to-music models allow users to generate nearly realistic musical audio with textual commands. However, editing music audios remains challenging due to the conflicting desiderata of performing fine-grained alterations on the audio while maintaining a simple user interface. To address this challenge, we propose Audio Prompt Adapter (or AP-Adapter), a lightweight addition to pretrained text-to-music models. We utilize AudioMAE to extract features from the input audio, and construct attention-based adapters to feed these features into the internal layers of AudioLDM2, a diffusion-based text-to-music model. With 22M trainable parameters, AP-Adapter empowers users to harness both global (e.g., genre and timbre) and local (e.g., melody) aspects of music, using the original audio and a short text as inputs. Through objective and subjective studies, we evaluate AP-Adapter on three tasks: timbre transfer, genre transfer, and accompaniment generation. Additionally, we demonstrate its effectiveness on out-of-domain audios containing unseen instruments during training.

en cs.SD, cs.AI

arXiv Open Access 2024

Discovery of Endianness and Instruction Size Characteristics in Binary Programs from Unknown Instruction Set Architectures

Joachim Andreassen, Donn Morrison

We study the problem of streamlining reverse engineering (RE) of binary programs from unknown instruction set architectures (ISA). We focus on two ISA characteristics that are fundamental to beginning the RE process: identification of endianness and whether the instruction width is fixed or variable. For ISAs with a fixed instruction width, we also present methods for estimating the width. In addition to advancing research in software RE, our work can also be seen as a first step in hardware reverse engineering, because endianness and instruction format describe intrinsic characteristics of the underlying ISA. We detail our efforts at feature engineering and perform experiments using a variety of machine learning models on two datasets of architectures using Leave-One-Group-Out-Cross-Validation to simulate conditions where the tested ISA is unknown during model training. We use bigram-based features for endianness detection and the autocorrelation function, commonly used in signal processing applications, for differentiation between fixed- and variable-width instruction sizes. A collection of classifiers from the machine learning library scikit-learn is used in the experiments to research these features. Initial results are promising, with accuracy of endianness detection at 99.4%, fixed- versus variable-width instruction size at 86.0%, and detection of fixed instruction sizes at 88.0%.

en cs.CR
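
To illustrate why the autocorrelation function is informative about instruction width, here is a minimal sketch under stated assumptions. It is not the authors' feature pipeline, which feeds such features to scikit-learn classifiers; it simply picks the lag with the highest autocorrelation as a width estimate. The function names and the toy input, one arbitrary 4-byte word repeated, are hypothetical.

```python
from typing import Sequence


def autocorrelation(data: Sequence[int], lag: int) -> float:
    """Normalized autocorrelation of a byte sequence at the given lag."""
    n = len(data)
    mean = sum(data) / n
    var = sum((x - mean) ** 2 for x in data)
    if var == 0:
        return 0.0  # constant input carries no periodicity information
    cov = sum((data[i] - mean) * (data[i + lag] - mean) for i in range(n - lag))
    return cov / var


def estimate_width(data: Sequence[int], max_lag: int = 16) -> int:
    """Return the lag in 1..max_lag with the highest autocorrelation."""
    return max(range(1, max_lag + 1), key=lambda k: autocorrelation(data, k))


if __name__ == "__main__":
    # Toy stand-in for fixed-width code: one 4-byte word repeated 64 times.
    toy_code = bytes.fromhex("e59f1004") * 64
    print(estimate_width(toy_code))
```

On real binaries the peaks are far noisier than in this perfectly periodic toy, which is presumably why the abstract reports using the autocorrelation as a classifier feature rather than reading the width off a single peak.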

DOAJ Open Access 2023

Relationship Between Intensity of Listening to The Song Diri by Tulus and Self-Acceptance in The Solo Tulus Friends Community

Firratu Tsaqifa, Afia Fitriani

Self-acceptance is an essential component of psychological well-being, and music or songs have the potential to contribute to mental health and well-being. This research aims to determine the influence of the intensity of listening to the song Diri by Tulus on self-acceptance in the Teman Tulus Solo Community. The research method used is quantitative, with Pearson product-moment correlation. The research population is members of the Teman Tulus Solo Community who follow the official Instagram account @temantulussolo and live in Solo Raya (Surakarta, Boyolali, Sukoharjo, Karanganyar, Wonogiri, Sragen, and Klaten). The sampling technique is snowball sampling. The total sample size was 102 respondents, consisting of 89 women and 13 men. The instruments in this research were a scale measuring the intensity of listening to the song Diri (α=0.812) and the Berger Self-Acceptance Scale (α=0.844), distributed online to respondents as a Google form. Based on the correlation test results, there is a significant relationship between the intensity of listening to the song Diri and self-acceptance (p=0.035; r=0.209).

Music, Musical instruction and study
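
As a quick arithmetic check on the figures reported above, the significance of a Pearson correlation can be assessed with the standard t transformation, t = r·sqrt(n − 2) / sqrt(1 − r²), on n − 2 degrees of freedom. The sketch below plugs in the abstract's r = 0.209 and n = 102. The critical value 1.984 (two-tailed, 5% level, 100 degrees of freedom) comes from standard t tables, not from the abstract, and the function name is hypothetical.

```python
import math


def t_statistic(r: float, n: int) -> float:
    """t statistic for testing a Pearson correlation r from n paired samples."""
    return r * math.sqrt(n - 2) / math.sqrt(1.0 - r * r)


# Two-tailed 5% critical value for 100 degrees of freedom (standard t table).
T_CRIT_DF100 = 1.984

if __name__ == "__main__":
    t = t_statistic(0.209, 102)
    print(round(t, 3), t > T_CRIT_DF100)
```

The statistic exceeds the critical value, which is consistent with the reported p below 0.05, although r = 0.209 still indicates a weak association.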

DOAJ Open Access 2023

Book Review of Logostan Kurtulmak 20. Yüzyıl Dramında Karşı-Anlatı

Zeynep Erdal

Musical instruction and study, Arts in general

DOAJ Open Access 2023

Prospettive di studio della performance musicale nella ricerca artistica

Giusy Caruso

Over the past twenty years, the study of musical performance has been developing on several fronts. The ethnomusicological approach, directed at analysing performance in non-European musical traditions, has been joined by 'performance studies', which led to the 'performative turn' and the 'artistic turn' and, consequently, to the birth of artistic research in music, focused on investigating the creative process of performance in the Western musical tradition. But what, specifically, are the new perspectives for studying musical performance with respect to the research questions, objectives, methods, and results of artistic research in music? This article offers an overview of the different approaches to studying musical performance, narrowing the field to methods for analysing the performance of a written musical composition within artistic research. In defining the path that transforms musical practice into research on musical performance, the issues surrounding the choice of methods are highlighted, setting out the divergences and convergences between the strictly artistic approach and the scientific approach. The objectives of artistic research in music are then presented, together with a mixed method that integrates the analytical-performative and analytical-empirical approaches. The application of digital technology to documenting and analysing the performer's gesture is discussed in order to bring out the potential of the dialogue between art, science, and technology, which promises innovative perspectives for the study of musical performance in artistic research.

Literature on music, Musical instruction and study

DOAJ Open Access 2023

Appropriating “Usûl” in the Tradition of Turkish Folk Music

Onurcan Kaya

Melody and rhythm correspond to makam and usûl, respectively, in Turkish music. Edvars, which are important for historical understanding and the transference of Turkish music, are also crucial for understanding makam and usûl. Furthermore, usûl transferred in writing through edvars were also transferred through the meşk, which can be evaluated within oral tradition, and perform multiple functions: pedagogical ones, and determining the formal structure and rhythmic structure to be performed in the composition process. With the emergence of the Turkish classical/folk music distinction under the influence of the Musical Revolution, the current usûl understanding occurred in Turkish classical music. Following studies conducted within this period, the formulation of a Turkish folk music theory began, and such studies used the concept of usûl. The usûl understanding put forward by Muzaffer Sarısözen through the determination of measures consists of an interpretation that explains the double and triple beats that comprise the measures instead of stereotyped measures with special names and beats similar to the existing usûls. In this sense, the current study infers that Turkish folk music usûl was appropriated from Turkish classical music. It was intended to create the impression that Turkish folk music theory, whose creation was initiated during the Republic period (1920s – 1930s), was a tradition with an ancient past. In this manner, in the context of the ideology of the period, the musical tradition was legitimized more quickly and spread to society. The current study elucidates the use of usûl in Turkish folk music and associates it with the concept of appropriation.

Musical instruction and study, Arts in general

arXiv Open Access 2023

Are Words Enough? On the semantic conditioning of affective music generation

Jorge Forero, Gilberto Bernardes, Mónica Mendes

Music has been commonly recognized as a means of expressing emotions. In this sense, an intense debate emerges from the need to verbalize musical emotions. This concern seems highly relevant today, considering the exponential growth of natural language processing using deep learning models where it is possible to prompt semantic propositions to generate music automatically. This scoping review aims to analyze and discuss the possibilities of music generation conditioned by emotions. To address this topic, we propose a historical perspective that encompasses the different disciplines and methods contributing to this topic. In detail, we review two main paradigms adopted in automatic music generation: rules-based and machine-learning models. Of note are the deep learning architectures that aim to generate high-fidelity music from textual descriptions. These models raise fundamental questions about the expressivity of music, including whether emotions can be represented with words or expressed through them. We conclude that, by overcoming the limitation and ambiguity of language to express emotions through music, the use of deep learning with natural language has the potential to impact the creative industries by providing powerful tools to prompt and generate new musical works.

en cs.MM, cs.LG

arXiv Open Access 2023

Pitchclass2vec: Symbolic Music Structure Segmentation with Chord Embeddings

Nicolas Lazzari, Andrea Poltronieri, Valentina Presutti

Structure perception is a fundamental aspect of music cognition in humans. Historically, the hierarchical organization of music into structures served as a narrative device for conveying meaning, creating expectancy, and evoking emotions in the listener. Thereby, musical structures play an essential role in music composition, as they shape the musical discourse through which the composer organises his ideas. In this paper, we present a novel music segmentation method, pitchclass2vec, based on symbolic chord annotations, which are embedded into continuous vector representations using both natural language processing techniques and custom-made encodings. Our algorithm is based on a long short-term memory (LSTM) neural network and outperforms the state-of-the-art techniques based on symbolic chord annotations in the field.

en cs.SD, cs.AI

arXiv Open Access 2023

Musical Voice Separation as Link Prediction: Modeling a Musical Perception Task as a Multi-Trajectory Tracking Problem

Emmanouil Karystinaios, Francesco Foscarin, Gerhard Widmer

This paper targets the perceptual task of separating the different interacting voices, i.e., monophonic melodic streams, in a polyphonic musical piece. We target symbolic music, where notes are explicitly encoded, and model this task as a Multi-Trajectory Tracking (MTT) problem from discrete observations, i.e., notes in a pitch-time space. Our approach builds a graph from a musical piece, by creating one node for every note, and separates the melodic trajectories by predicting a link between two notes if they are consecutive in the same voice/stream. This kind of local, greedy prediction is made possible by node embeddings created by a heterogeneous graph neural network that can capture inter- and intra-trajectory information. Furthermore, we propose a new regularization loss that encourages the output to respect the MTT premise of at most one incoming and one outgoing link for every node, favouring monophonic (voice) trajectories; this loss function might also be useful in other general MTT scenarios. Our approach does not use domain-specific heuristics, is scalable to longer sequences and a higher number of voices, and can handle complex cases such as voice inversions and overlaps. We reach new state-of-the-art results for the voice separation task in classical music of different styles.

en cs.SD, cs.AI
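
The MTT premise mentioned in the abstract above, at most one incoming and one outgoing link per note, can be expressed as a simple penalty. The sketch below is one plausible formulation, assumed for illustration only. The abstract does not give the authors' loss, and the function name and the plain nested-list representation of link probabilities are hypothetical.

```python
from typing import List


def mtt_link_penalty(link_probs: List[List[float]]) -> float:
    """Penalize notes whose predicted links exceed one in or one out.

    link_probs[i][j] is the predicted probability of a link from note i
    to note j. Row sums count outgoing links, column sums incoming links;
    only the excess above 1.0 is penalized.
    """
    n = len(link_probs)
    out_excess = sum(max(0.0, sum(row) - 1.0) for row in link_probs)
    in_excess = sum(
        max(0.0, sum(link_probs[i][j] for i in range(n)) - 1.0)
        for j in range(n)
    )
    return out_excess + in_excess


if __name__ == "__main__":
    # A single monophonic chain 0 -> 1 -> 2 satisfies the premise.
    chain = [[0.0, 1.0, 0.0],
             [0.0, 0.0, 1.0],
             [0.0, 0.0, 0.0]]
    # Note 0 links to both 1 and 2, so note 2 also has two incoming links.
    branching = [[0.0, 1.0, 1.0],
                 [0.0, 0.0, 1.0],
                 [0.0, 0.0, 0.0]]
    print(mtt_link_penalty(chain), mtt_link_penalty(branching))
```

In a trained model a term of this kind would be added to the link-prediction loss so that the predicted adjacency favours monophonic trajectories.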

arXiv Open Access 2023

Music ControlNet: Multiple Time-varying Controls for Music Generation

Shih-Lun Wu, Chris Donahue, Shinji Watanabe et al.

Text-to-music generation models are now capable of generating high-quality music audio in broad styles. However, text control is primarily suitable for the manipulation of global musical attributes like genre, mood, and tempo, and is less suitable for precise control over time-varying attributes such as the positions of beats in time or the changing dynamics of the music. We propose Music ControlNet, a diffusion-based music generation model that offers multiple precise, time-varying controls over generated audio. To imbue text-to-music models with time-varying control, we propose an approach analogous to pixel-wise control of the image-domain ControlNet method. Specifically, we extract controls from training audio yielding paired data, and fine-tune a diffusion-based conditional generative model over audio spectrograms given melody, dynamics, and rhythm controls. While the image-domain Uni-ControlNet method already allows generation with any subset of controls, we devise a new strategy to allow creators to input controls that are only partially specified in time. We evaluate both on controls extracted from audio and controls we expect creators to provide, demonstrating that we can generate realistic music that corresponds to control inputs in both settings. While few comparable music generation models exist, we benchmark against MusicGen, a recent model that accepts text and melody input, and show that our model generates music that is 49% more faithful to input melodies despite having 35x fewer parameters, training on 11x less data, and enabling two additional forms of time-varying control. Sound examples can be found at https://MusicControlNet.github.io/web/.

en cs.SD, eess.AS

DOAJ Open Access 2022

Música incidental de Manuel M. Ponce para La verdad sospechosa de Juan Ruiz de Alarcón

Rodolfo Pérez Berrelleza

This article presents a progress report on research into the incidental music composed by Manuel M. Ponce in 1934 for La verdad sospechosa by Juan Ruiz de Alarcón, staged in September 1934 for the inauguration of the Palacio de Bellas Artes in Mexico. The information presented results from a review of books, theses, and articles, as well as the consultation of manuscripts in libraries, museums, and the archives of people close to Manuel M. Ponce's heirs. This compilation of information seeks to reunite the movements of the work.

Music, Musical instruction and study

DOAJ Open Access 2022

La pedagogia a spirale nei programmi di educazione musicale dei “collèges” francesi

Pietro Milli

In 2015, the réforme du collège was launched in France, introducing the principles of the so-called ‘spiral pedagogy’, first expounded by Jerome S. Bruner in 1960, into the school curriculum. In this contribution, the changes concerning the teaching of music education are addressed from a historical, and partly comparative, perspective. More specifically, the four main stages at which government programmes have been published (from 1920 to the present) are described, with particular attention to the 2015 programmes. The article then examines the question of the reception of Bruner’s thought in France (also in relation to the case of the Manhattanville Music Curriculum Program in the United States) and its practical application in music education.

Music and books on Music, Musical instruction and study
