Results for "Music"

Showing 20 of ~1059105 results · from CrossRef, arXiv, DOAJ, Semantic Scholar

DOAJ Open Access 2026
Audio-Lyrics Multimodal Fusion for Music Genre Clustering with Dynamic Modality Weighting

Yang Jinyu

In music information retrieval, existing genre clustering approaches used for music recommendation, automatic tagging, and content organization typically combine audio and lyrics with static weights, which ignores the fact that different genres depend on these two modalities to varying extents. This paper proposes an audio-lyrics multimodal fusion system with dynamic modality weights for unsupervised music genre clustering. First, multi-level representations are extracted separately from lyrics and audio. Then, using indicators such as the presence of instruments, energy, and feature quality, heuristic rules determine a modality weight for each sample, making the fusion adaptive at the sample level. Ablation studies on a simulated dataset showed that dynamic weighting performed considerably better than static-weight fusion and single-modality baselines on clustering quality measures. Further examination of the weight distributions across clusters revealed that the dynamic weighting system could flexibly capture genre-specific modality dependence and improve the interpretability of clustering results. To further verify the feature extraction and clustering process, the paper also reports follow-up experiments on the real-world Marsyas GTZAN dataset.
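
The sample-level weighting described in this abstract can be illustrated with a minimal sketch. The abstract does not state the heuristic rules, so the function names (`audio_weight`, `fuse`), the three indicator inputs scaled to [0, 1], and the coefficients 0.4/0.2/0.4 are all assumptions made for illustration, not the paper's method.

```python
from typing import List, Sequence


def audio_weight(instrumentalness: float, energy: float, lyric_quality: float) -> float:
    """Hypothetical heuristic: trust audio more when the track is instrumental,
    energetic, or has poor lyric features. All inputs are assumed to lie in [0, 1]."""
    score = 0.4 * instrumentalness + 0.2 * energy + 0.4 * (1.0 - lyric_quality)
    return min(1.0, max(0.0, score))  # clamp to a valid weight


def fuse(audio_vec: Sequence[float], lyric_vec: Sequence[float], w_audio: float) -> List[float]:
    """Scale each modality by its weight, then concatenate into one feature vector."""
    return [w_audio * a for a in audio_vec] + [(1.0 - w_audio) * t for t in lyric_vec]


# An instrumental track with no usable lyric features leans entirely on audio.
w = audio_weight(instrumentalness=1.0, energy=1.0, lyric_quality=0.0)
fused = fuse([1.0, 2.0], [4.0], w)
```

The fused vectors would then be passed to any standard clustering routine. A static-weight baseline corresponds to passing the same `w_audio` for every sample.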

Information technology
arXiv Open Access 2025
SAGE-Music: Low-Latency Symbolic Music Generation via Attribute-Specialized Key-Value Head Sharing

Jiaye Tan, Haonan Luo, Linfeng Song et al.

Low-latency symbolic music generation is essential for real-time improvisation and human-AI co-creation. Existing transformer-based models, however, face a trade-off between inference speed and musical quality. Traditional acceleration techniques such as embedding pooling significantly degrade quality, while recently proposed Byte Pair Encoding (BPE) methods - though effective on single-track piano data - suffer large performance drops in multi-track settings, as revealed by our analysis. We propose Attribute-Specialized Key-Value Head Sharing (AS-KVHS), adapted to music's structured symbolic representation, achieving about 30% inference speedup with only a negligible (about 0.4%) quality drop in objective evaluations and slight improvements in subjective listening tests. Our main contributions are (1) the first systematic study of BPE's generalizability in multi-track symbolic music, and (2) the introduction of AS-KVHS for low-latency symbolic music generation. Beyond these, we also release SAGE-Music, an open-source benchmark that matches or surpasses state-of-the-art models in generation quality.

en cs.SD, cs.AI
arXiv Open Access 2024
Audio Conditioning for Music Generation via Discrete Bottleneck Features

Simon Rouard, Yossi Adi, Jade Copet et al.

While most music generation models use textual or parametric conditioning (e.g. tempo, harmony, musical genre), we propose to condition a language model based music generation system with audio input. Our exploration involves two distinct strategies. The first strategy, termed textual inversion, leverages a pre-trained text-to-music model to map audio input to corresponding "pseudowords" in the textual embedding space. For the second model we train a music language model from scratch jointly with a text conditioner and a quantized audio feature extractor. At inference time, we can mix textual and audio conditioning and balance them thanks to a novel double classifier free guidance method. We conduct automatic and human studies that validate our approach. We will release the code and we provide music samples on https://musicgenstyle.github.io in order to show the quality of our model.

en cs.SD, eess.AS
arXiv Open Access 2024
Exploring Real-Time Music-to-Image Systems for Creative Inspiration in Music Creation

Meng Yang, Maria Teresa Llano, Jon McCormack

This paper presents a study on the use of a real-time music-to-image system as a mechanism to support and inspire musicians during their creative process. The system takes MIDI messages from a keyboard as input which are then interpreted and analysed using state-of-the-art generative AI models. Based on the perceived emotion and music structure, the system's interpretation is converted into visual imagery that is presented in real-time to musicians. We conducted a user study in which musicians improvised and composed using the system. Our findings show that most musicians found the generated images were a novel mechanism when playing, evidencing the potential of music-to-image systems to inspire and enhance their creative process.

en cs.HC
arXiv Open Access 2024
Emotion-aware Personalized Music Recommendation with a Heterogeneity-aware Deep Bayesian Network

Erkang Jing, Yezheng Liu, Yidong Chai et al.

Music recommender systems play a critical role in music streaming platforms by providing users with music that they are likely to enjoy. Recent studies have shown that user emotions can influence users' preferences for music moods. However, existing emotion-aware music recommender systems (EMRSs) explicitly or implicitly assume that users' actual emotional states expressed through identical emotional words are homogeneous. They also assume that users' music mood preferences are homogeneous under the same emotional state. In this article, we propose four types of heterogeneity that an EMRS should account for: emotion heterogeneity across users, emotion heterogeneity within a user, music mood preference heterogeneity across users, and music mood preference heterogeneity within a user. We further propose a Heterogeneity-aware Deep Bayesian Network (HDBN) to model these assumptions. The HDBN mimics a user's decision process of choosing music with four components: personalized prior user emotion distribution modeling, posterior user emotion distribution modeling, user grouping, and Bayesian neural network-based music mood preference prediction. We constructed two datasets, called EmoMusicLJ and EmoMusicLJ-small, to validate our method. Extensive experiments demonstrate that our method significantly outperforms baseline approaches on metrics of HR, Precision, NDCG, and MRR. Ablation studies and case studies further validate the effectiveness of our HDBN. The source code and datasets are available at https://github.com/jingrk/HDBN.
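
For readers unfamiliar with the ranking metrics named in this abstract, the sketch below shows common definitions of HR@k, reciprocal rank (averaged over users to obtain MRR), and NDCG@k for the case of one held-out relevant item per user. The abstract does not describe the evaluation protocol, so the single-relevant-item setting and the function names are assumptions, and Precision is omitted.

```python
import math
from typing import Hashable, Sequence


def hit_ratio_at_k(ranked: Sequence[Hashable], target: Hashable, k: int) -> float:
    """1.0 if the held-out item appears in the top-k recommendations, else 0.0."""
    return 1.0 if target in ranked[:k] else 0.0


def reciprocal_rank(ranked: Sequence[Hashable], target: Hashable) -> float:
    """1 / rank of the held-out item (1-based), or 0.0 if it is not in the list."""
    return 1.0 / (ranked.index(target) + 1) if target in ranked else 0.0


def ndcg_at_k(ranked: Sequence[Hashable], target: Hashable, k: int) -> float:
    """NDCG with a single relevant item: the ideal DCG is 1,
    so NDCG reduces to 1 / log2(rank + 1) when the item is in the top k."""
    if target not in ranked[:k]:
        return 0.0
    rank = ranked.index(target) + 1
    return 1.0 / math.log2(rank + 1)


# Hypothetical recommendation list for one user.
ranked = ["song_a", "song_b", "song_c"]
```

Averaging each function over all test users gives the dataset-level score.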

en cs.AI
arXiv Open Access 2024
Degradation-Invariant Music Indexing

Rémi Mignot, Geoffroy Peeters

For music indexing that is robust to sound degradations and scalable to large music catalogs, this scientific report presents an approach based on audio descriptors that are relevant to the music content and invariant to sound transformations (e.g., noise addition, distortion, lossy coding, pitch/time transformations, or filtering). To achieve this, one of the key points of the proposed method is the definition of high-dimensional audio prints, which are intrinsically (by design) robust to some sound degradations. The high dimensionality of this first representation is then used to learn a linear projection to a significantly smaller subspace, which further reduces the sensitivity to sound degradations using a series of discriminant analyses. Finally, anchoring the analysis times on local maxima of a selected onset function, an approximate hashing is performed to provide better tolerance to bit corruptions and, at the same time, to make the method easier to scale.

en eess.SP
arXiv Open Access 2024
Towards Explainable and Interpretable Musical Difficulty Estimation: A Parameter-efficient Approach

Pedro Ramoneda, Vsevolod Eremenko, Alexandre D'Hooge et al.

Estimating music piece difficulty is important for organizing educational music collections. This process could be partially automatized to facilitate the educator's role. Nevertheless, the decisions performed by prevalent deep-learning models are hardly understandable, which may impair the acceptance of such a technology in music education curricula. Our work employs explainable descriptors for difficulty estimation in symbolic music representations. Furthermore, through a novel parameter-efficient white-box model, we outperform previous efforts while delivering interpretable results. These comprehensible outcomes emulate the functionality of a rubric, a tool widely used in music education. Our approach, evaluated in piano repertoire categorized in 9 classes, achieved 41.4% accuracy independently, with a mean squared error (MSE) of 1.7, showing precise difficulty estimation. Through our baseline, we illustrate how building on top of past research can offer alternatives for music difficulty assessment which are explainable and interpretable. With this, we aim to promote a more effective communication between the Music Information Retrieval (MIR) community and the music education one.

en cs.SD, cs.AI
arXiv Open Access 2023
A Long-Tail Friendly Representation Framework for Artist and Music Similarity

Haoran Xiang, Junyu Dai, Xuchen Song et al.

The investigation of the similarity between artists and music is crucial in music retrieval and recommendation, and addressing the challenge of the long-tail phenomenon is increasingly important. This paper proposes a Long-Tail Friendly Representation Framework (LTFRF) that utilizes neural networks to model the similarity relationship. Our approach integrates music, user, metadata, and relationship data into a unified metric learning framework, and employs a meta-consistency relationship as a regular term to introduce the Multi-Relationship Loss. Compared to the Graph Neural Network (GNN), our proposed framework improves the representation performance in long-tail scenarios, which are characterized by sparse relationships between artists and music. We conduct experiments and analysis on the AllMusic dataset, and the results demonstrate that our framework provides a favorable generalization of artist and music representation. Specifically, on similar artist/music recommendation tasks, the LTFRF outperforms the baseline by 9.69%/19.42% in Hit Ratio@10, and in long-tail cases, the framework achieves 11.05%/14.14% higher than the baseline in Consistent@10.

en cs.SD, cs.IR
DOAJ Open Access 2023
Skulle Norges musikkhøgskole ha ligget i Bergen?

Ida Børresen

In 1913, the violinist Torgrim Castberg proposed establishing «Norges Musikhøiskole» (a Norwegian academy of music) in Bergen. Castberg had very successfully established the Musikakademiet for children and young people in the same city in 1905. In 1913, Castberg had a large building erected in central Bergen. Right after the Musikakademiet moved into Musikens Hus, he launched the idea of establishing «Norges Musikhøiskole» in one wing of the building, in close pedagogical and administrative cooperation with the Musikakademiet. The proposal won broad support, but the director of the Musik-konservatoriet in Kristiania, Peter Lindeman, strongly opposed the plans, and the ministry found the proposal too unclear. In this article, the author documents this early debate about music education in Norway and, not least, how it quickly came to revolve mostly around Bergen and Kristiania and less around the need for a Norwegian academy of music. In conclusion, the lines are drawn forward to the establishment of Norges Musikkhøgskole in Oslo in 1973, when the debate once again revolved mostly around the choice of location.

DOAJ Open Access 2023
Pour une harmonie de raison et de perception

Clément Personnic

Ludo-narrative dissonance is one of the most important concepts in (video)game studies at our disposal. Being both a tool that crystallizes old epistemological debates and a practical way to talk about gaming experience and structure, the notion remains blurred, lost between the different meanings of “ludo-narrativity”. Its quick acceptance is the consequence of a super-applicability, meaning a practical and instinctive utility in numerous fields of research, which must be put in perspective because of how insufficiently the notion was initially circumscribed. Using a theoretical and historiographical approach, this article seeks to bring this insufficiency to light and to show how it points to an issue of reason and perception that needs remedying. We will show how it has prompted a response from researchers, through studies that offer, despite some limits, new ways of thinking about it, including recourse to tonal harmony theory.

Recreation. Leisure
DOAJ Open Access 2023
Decentering the Colonial in Ghaouti Bouali’s Kashf al-qinā‘ of 1904

Jonathan GLASSER

Ghaouti Bouali's 1904 treatise Kitāb kashf al-qinā‘ ‘an ālāt al-samā‘ was a groundbreaking work of scholarship on Algerian and Maghribi music, poetry, and language. However, both in its own time and since, readers have treated it as a marginal curiosity that was remote from both wider Arab and European musicological scholarship. In contrast, the following pages treat Kashf al-qinā‘ as a cosmopolitan work that emerges from three overlapping contexts: one that is primarily French colonial, one that is primarily Arab, and one that is more specifically Maghribi. By situating Kashf al-qinā‘ in these contexts, and by specifically linking these contextualizations to the structure of the text itself, the treatment offered here shows Bouali’s book to be simultaneously less anomalous and more remarkable than it might appear at first glance.

Language and Literature
arXiv Open Access 2022
Modelling the perception of music in brain network dynamics

Jakub Sawicki, Lenz Hartmann, Rolf Bader et al.

We analyze the influence of music in a network of FitzHugh-Nagumo oscillators with empirical structural connectivity measured in healthy human subjects. We report an increase of coherence between the global dynamics in our network and the input signal induced by a specific music song. We show that the level of coherence depends crucially on the frequency band. We compare our results with experimental data, which also describe global neural synchronization between different brain regions in the gamma-band range and its increase just before transitions between different parts of the musical form (musical high-level events). The results also suggest a separation in musical form-related brain synchronization between high brain frequencies, associated with neocortical activity, and low frequencies in the range of dance movements, associated with interactivity between cortical and subcortical regions.
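
As background for the oscillator model named in this abstract, the sketch below integrates a single, uncoupled FitzHugh-Nagumo unit in a commonly used form. The abstract gives no equations or parameters, so the form of the equations, the values `a = 0.5` and `eps = 0.05`, the Euler step size, and the function name `simulate_fhn` are all assumptions. The paper's network coupling through empirical connectivity and its music input signal are not reproduced here.

```python
from typing import List, Tuple


def simulate_fhn(a: float = 0.5, eps: float = 0.05, dt: float = 0.001,
                 steps: int = 20000) -> Tuple[List[float], List[float]]:
    """Integrate one uncoupled FitzHugh-Nagumo unit with the explicit Euler method:
        eps * du/dt = u - u**3 / 3 - v   (fast activator)
              dv/dt = u + a              (slow inhibitor)
    Parameter values are illustrative assumptions, chosen in the oscillatory regime |a| < 1.
    """
    u, v = 0.0, 0.0
    us, vs = [u], [v]
    for _ in range(steps):
        du = (u - u ** 3 / 3.0 - v) / eps
        dv = u + a
        u, v = u + dt * du, v + dt * dv  # simultaneous Euler update
        us.append(u)
        vs.append(v)
    return us, vs


us, vs = simulate_fhn()
```

With these assumed parameters the activator `u` settles into relaxation oscillations that swing between the two outer branches of the cubic nullcline. A network model would add a coupling term and an external drive to the first equation.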

en nlin.AO, q-bio.NC
arXiv Open Access 2022
Video-Music Retrieval: A Dual-Path Cross-Modal Network

Xin Gu, Yinghua Shen, Chaohui Lv

We propose a method to recommend background music for videos. Current work rarely considers the emotional information of music, which is essential for video-music retrieval. To address this, we design two paths that process content information and emotional information across modalities. Based on the characteristics of video and music, we design various feature extraction schemes and common representation spaces. More importantly, we propose a way to combine content information with emotional information. Additionally, we modify the classical metric loss to better suit this task. Experiments show that this dual-path video-music retrieval network can effectively merge information. Compared with existing methods, it improves the retrieval metrics, increasing Recall@1 by 3.94 and Recall@25 by 16.36.

en cs.MM
DOAJ Open Access 2022
Kahoot!, Plickers And Socrative: ICT resources to assess musical content in Primary Education

Narciso José López García

Assessing musical competencies is a complicated process. However, the implementation of ICT in the classroom has facilitated this task, helping the Music education teacher to collect data in real time to describe the level of musical content of the students and information to fill in the required official documents. In this study, different evaluation strategies in music education are analyzed, as well as the main characteristics of three digital platforms, Kahoot!, Plickers and Socrative, whose main potential lies in the generation of gamified evaluation tests and evaluation and qualification reports of the students required by the educational administrations. The work method has been the analysis of information, which has served to make an exhaustive description of these platforms with the aim of providing the teachers information and tools that enable them to use and implement ICT in the evaluation of musical content of the Primary Education curriculum. Finally, the conclusions related to the benefits and limitations of these digital tools when applied in the Music classroom are presented.

Education, Special aspects of education
DOAJ Open Access 2022
Perspectivas da pesquisa (auto)biográfica para a educação musical:

Jéssica de Almeida

This essayistic article aims to reflect on perspectives of (auto)biographical research that run through studies I have carried out in recent years, by means of a metanarrative exercise. In this effort, I draw counterpoints between (auto)biographical works, their theoretical and methodological foundations, and recent epistemological considerations made possible by Xxxxx. Specifically, I go through some foundations of research-formation (pesquisa-formação) and educational biography, as well as insights from research conducted on these bases and from recent readings that, besides updating the understanding of formation and scientific research, ground a powerful epistemological field for also thinking about meanings for music itself. With this, I aim to advance the understanding of the place of music in (auto)biographical research and of the latter's potential for thinking about human formation with and through music.

Music and books on Music
arXiv Open Access 2021
Variable-Length Music Score Infilling via XLNet and Musically Specialized Positional Encoding

Chin-Jui Chang, Chun-Yi Lee, Yi-Hsuan Yang

This paper proposes a new self-attention based model for music score infilling, i.e., to generate a polyphonic music sequence that fills in the gap between given past and future contexts. While existing approaches can only fill in a short segment with a fixed number of notes, or a fixed time span between the past and future contexts, our model can infill a variable number of notes (up to 128) for different time spans. We achieve this with three major technical contributions. First, we adapt XLNet, an autoregressive model originally proposed for unsupervised model pre-training, to music score infilling. Second, we propose a new, musically specialized positional encoding called relative bar encoding that better informs the model of notes' position within the past and future context. Third, to capitalize on relative bar encoding, we perform look-ahead onset prediction to predict the onset of a note one time step before predicting the other attributes of the note. We compare our proposed model with two strong baselines and show that our model is superior in both objective and subjective analyses.

en cs.SD, cs.AI
DOAJ Open Access 2021
Psychological wellbeing early adult Korean pop fangirls

Angelita Jayalaksana Fitri, Fitri Yuli Maulidayanti, Sudaryat Nurdin Akhmad

Mental health is a state of well-being in which individuals are aware of their abilities, can cope with the normal stresses of daily life, can work productively, and are able to contribute to their community. Individuals aged 20 to 40 are considered to be in early adulthood. In the last two decades, South Korean popular culture has grown rapidly and expanded globally. Its acceptance by diverse audiences has given rise to the Korean Wave phenomenon, one part of which is Korean pop music, or K-pop. This study uses a literature review, collecting data from articles, journals, and books. The results indicated the presence of mental health (psychological wellbeing) among early-adult K-pop fangirls.

Education, Special aspects of education
DOAJ Open Access 2021
Spiritual and Moral Upbringing Children in the Process of Musical and Theatrical Activities in Church Sunday School

Lyudmila A. Rapatskaya, Asya N. Rykova

In this article, the spiritual and moral upbringing and education in the Sunday schools of the Russian Orthodox Church is presented in a historical and cultural context. The authors believe that the main goal of modern Sunday schools in Russia is to revive the national tradition of missionary preaching the Gospel that is poorly studied in modern pedagogical science. This tradition determines the content and forms of educational work, including those related to art. To confirm their ideas, the authors analyze the main historical stages of the formation and development of church (parish) schools, which are the prototype of the modern Sunday schools of the Russian Orthodox Church. The history of the spiritual and moral formation of a child’s personality in such schools testifies to the need to regulate the substantive foundations of the educational process, which takes place in the space of traditions and attitudes of the Christian church, but inevitably includes the problems of the surrounding worldly reality and its cultural component. Based on the mission goal of the modern Church Sunday school, declared in the official guidelines of the Russian Orthodox Church, the authors proposed a culturological approach to the spiritual and moral upbringing and education by means of musical and theatrical activities. By its methodological nature this approach is integrative since it incorporates both secular and religious meanings. The authors took the theory of music-oriented polyartistic approach from the arsenal of secular pedagogy of music education. The dominant content of the educational process in the Orthodox Sunday school, that is associated with the preservation of the high Evangelic meaning of the spiritual and moral education of the younger generation, is drawn from the pedagogy of the religious direction. 
The article describes the experience of integrating music and theater arts in order to solve the problems of spiritual and moral education in the educational process of the modern Church Sunday schools of Rybinsk Diocese.
