Results for "Music and books on Music"

Showing 20 of ~889,140 results · from CrossRef, DOAJ, arXiv

arXiv Open Access 2026
Bangla Music Genre Classification Using Bidirectional LSTMs

Muntakimur Rahaman, Md Mahmudul Hoque, Md Mehedi Hassain

Bangla music is rich in its own musical culture. Nowadays, music genre classification is very significant because of the exponential increase in available music, both in digital and physical formats, and it is necessary to index that music accordingly to facilitate improved retrieval. Automatically classifying Bangla music by genre is essential for efficiently locating specific pieces within a vast and diverse music library. Prevailing methods for genre classification predominantly employ conventional machine learning or deep learning approaches. This work introduces a novel music dataset comprising ten distinct genres of Bangla music. For the task of audio classification, we utilize a recurrent neural network (RNN) architecture. Specifically, a Long Short-Term Memory (LSTM) network is implemented to train the model and perform the classification. Feature extraction represents a foundational stage in audio data processing. This study utilizes Mel-Frequency Cepstral Coefficients (MFCCs) to transform raw audio waveforms into a compact and representative set of features. The proposed framework facilitates music genre classification by leveraging these extracted features. Experimental results demonstrate a classification accuracy of 78%, indicating the system's strong potential to enhance and streamline the organization of Bangla music genres.
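
A minimal sketch of the pipeline this abstract describes, MFCC features feeding a (bidirectional) LSTM classifier over ten genres, might look as follows; file paths, layer sizes, and training settings are illustrative assumptions, not the paper's configuration:

```python
# Hedged sketch: MFCC extraction with librosa, classification with a
# bidirectional LSTM in Keras. Hyperparameters are illustrative only.
import librosa
import numpy as np
from tensorflow.keras import layers, models

def extract_mfcc(path, n_mfcc=13, sr=22050, duration=30):
    # Load a fixed-length clip and return MFCC frames as (time, coefficients).
    y, _ = librosa.load(path, sr=sr, duration=duration)
    return librosa.feature.mfcc(y=y, sr=sr, n_mfcc=n_mfcc).T

num_genres = 10  # the paper's ten Bangla genres

model = models.Sequential([
    layers.Input(shape=(None, 13)),          # MFCC sequences (pad per batch)
    layers.Bidirectional(layers.LSTM(64)),   # summarize the whole clip
    layers.Dense(64, activation="relu"),
    layers.Dense(num_genres, activation="softmax"),
])
model.compile(optimizer="adam",
              loss="sparse_categorical_crossentropy",
              metrics=["accuracy"])
```

Training then reduces to calling model.fit on padded batches of MFCC sequences with integer genre labels.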

en cs.SD, cs.LG
DOAJ Open Access 2025
Shostakovich and Khrapchenko: On the Problem of “The Artist and Power”

Tatiana I. Naumenko

The article is devoted to the history of the relationship between Dmitry Shostakovich and Mikhail Khrapchenko, Chairman of the All-Union Committee for Arts Affairs (VKDI) under the Council of People's Commissars of the Soviet Union (1939–1948). This department was created in 1936 under the leadership of Platon Kerzhentsev. For Shostakovich, the initial period of interaction with the VKDI turned out to be quite dramatic. The first major action of the VKDI was the publication of the article Muddle Instead of Music (Pravda, 28 January 1936), directed against Shostakovich's opera Lady Macbeth of Mtsensk; this was shortly followed by another entitled Ballet Falsehood (Pravda, 6 February 1936) that made accusations against the ballet The Limpid Stream. However, the appointment of Khrapchenko as chairman of the VKDI in 1939 radically changed Shostakovich's position; the new head of the department's support would benefit him greatly in the years to come. The article reconstructs the entire period of communication between Shostakovich and Khrapchenko based on archival documents, memoirs, letters and periodical press materials. The composer repeatedly turned to Khrapchenko for help and invariably received it, in both creative and everyday matters. In 1948, Khrapchenko, like many artists, became a victim of the anti-formalist campaign. On Stalin's orders, an audit was conducted of the financial costs of preparing the opera The Great Friendship by Vano Muradeli. Having been designated as responsible for the failure of the opera, Khrapchenko subsequently spent several years paying a large fine to the state. At the conference of Soviet music figures held at the Central Committee of the Communist Party of the Soviet Union (Bolsheviks) on 11–13 January 1948 under the chairmanship of Andrei Zhdanov, many of those whom Khrapchenko had supported during his many years of work at the VKDI spoke out against him. The only one who spoke in his defence was Shostakovich. Until the end of his life, the composer maintained communication with Khrapchenko, who again held high positions in the 1960s and always responded to the composer's requests when he could.
Keywords: Dmitry Shostakovich, Mikhail Khrapchenko, Joseph Stalin, Andrei Zhdanov, All-Union Committee for Arts Affairs, the composer and power, Soviet music, symphony, anti-formalist campaign of 1948

DOAJ Open Access 2024
PBSCR: The Piano Bootleg Score Composer Recognition Dataset

Arhan Jain, Alec Bunn, Austin Pham et al.

This article motivates, describes, and presents the PBSCR dataset for studying composer recognition of classical piano music. Our goal was to design a dataset that facilitates large-scale research on composer recognition and that is suitable for modern architectures and training practices. To achieve this goal, we utilize the abundance of sheet music images and rich metadata on IMSLP, use a previously proposed feature representation called a bootleg score to encode the location of noteheads relative to staff lines, and present the data in an extremely simple format (2-dimensional binary images) to encourage rapid exploration and iteration. The dataset contains 40,000 62×64 bootleg score images for a 9-class recognition task, 100,000 62×64 bootleg score images for a 100-class recognition task, and 29,310 unlabeled variable-length bootleg score images for pretraining. The labeled data is presented in a format that mirrors MNIST images, making it extremely easy to visualize, manipulate, and train models efficiently. We include relevant information to connect each bootleg score image with its underlying raw sheet music image, and we scrape, organize, and compile metadata from IMSLP on all piano works to facilitate multimodal research and allow convenient linking to other datasets. We release baseline results in supervised and low-shot settings for future works to compare against, and we discuss open research questions that the PBSCR dataset is especially well suited to facilitate.
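
Because the labeled data mirrors MNIST, loading and inspecting it should reduce to a few lines; the file names below are hypothetical placeholders, not the actual layout of the PBSCR release:

```python
# Hypothetical loader for the MNIST-style PBSCR arrays; file names are
# placeholders and should be replaced with those in the actual release.
import numpy as np

images = np.load("pbscr_9class_images.npy")   # assumed: (40000, 62, 64) binary
labels = np.load("pbscr_9class_labels.npy")   # assumed: composer IDs 0..8

print(images.shape, images.dtype, labels[:10])
# Each 62x64 image encodes notehead positions relative to staff lines, so an
# MNIST-style CNN or a flattened logistic-regression baseline applies directly.
```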

Information technology, Music
arXiv Open Access 2024
Sheet Music Transformer: End-To-End Optical Music Recognition Beyond Monophonic Transcription

Antonio Ríos-Vila, Jorge Calvo-Zaragoza, Thierry Paquet

State-of-the-art end-to-end Optical Music Recognition (OMR) has, to date, primarily been carried out using monophonic transcription techniques, handling complex score layouts such as polyphony by resorting to simplifications or specific adaptations. Despite their efficacy, these approaches entail scalability challenges and inherent limitations. This paper presents the Sheet Music Transformer, the first end-to-end OMR model designed to transcribe complex musical scores without relying solely on monophonic strategies. Our model employs a Transformer-based image-to-sequence framework that predicts score transcriptions in a standard digital music encoding format from input images. The model has been tested on two polyphonic music datasets and has proven capable of handling these intricate music structures effectively. The experimental outcomes not only indicate the competence of the model, but also show that it outperforms state-of-the-art methods, thus contributing to advancements in end-to-end OMR transcription.
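
The image-to-sequence framing can be sketched generically: a convolutional encoder turns the score image into a sequence of visual tokens, and a Transformer decoder emits tokens of a digital music encoding autoregressively. The skeleton below illustrates that pattern under assumed sizes; it is not the paper's architecture:

```python
# Generic image-to-sequence skeleton in PyTorch: CNN encoder -> flattened
# visual tokens -> Transformer decoder producing music-encoding tokens.
import torch
import torch.nn as nn

class Img2Seq(nn.Module):
    def __init__(self, vocab_size, d_model=256):
        super().__init__()
        self.backbone = nn.Sequential(           # image -> feature map
            nn.Conv2d(1, 64, 3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(64, d_model, 3, stride=2, padding=1), nn.ReLU(),
        )
        self.embed = nn.Embedding(vocab_size, d_model)
        self.decoder = nn.TransformerDecoder(
            nn.TransformerDecoderLayer(d_model, nhead=8, batch_first=True),
            num_layers=4)
        self.out = nn.Linear(d_model, vocab_size)

    def forward(self, images, token_ids):
        f = self.backbone(images)                # (B, d_model, H', W')
        memory = f.flatten(2).transpose(1, 2)    # (B, H'*W', d_model)
        tgt = self.embed(token_ids)              # (B, T, d_model)
        mask = nn.Transformer.generate_square_subsequent_mask(
            tgt.size(1)).to(tgt.device)          # causal mask for decoding
        h = self.decoder(tgt, memory, tgt_mask=mask)
        return self.out(h)                       # next-token logits
```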

en cs.CV, cs.SD
arXiv Open Access 2024
Multi-Source Music Generation with Latent Diffusion

Zhongweiyang Xu, Debottam Dutta, Yu-Lin Wei et al.

Most music generation models directly generate a single music mixture. To allow for more flexible and controllable generation, the Multi-Source Diffusion Model (MSDM) was proposed to model music as a mixture of multiple instrumental sources (e.g. piano, drums, bass, and guitar). Its goal is to use a single diffusion model to generate mutually coherent music sources that are then mixed to form the music. Despite its capabilities, MSDM is unable to generate music with rich melodies and often generates empty sounds. Its waveform diffusion approach also introduces significant Gaussian noise artifacts that compromise audio quality. In response, we introduce a Multi-Source Latent Diffusion Model (MSLDM) that employs Variational Autoencoders (VAEs) to encode each instrumental source into a distinct latent representation. By training a VAE on all music sources, we efficiently capture each source's unique characteristics in a "source latent." The source latents are concatenated, and our diffusion model learns this joint latent space. This approach significantly enhances both total and partial generation of music by leveraging the VAE's latent compression and noise robustness. The compressed source latents also facilitate more efficient generation. Subjective listening tests and Fréchet Audio Distance (FAD) scores confirm that our model outperforms MSDM, showcasing its practical and enhanced applicability in music generation systems. We also emphasize that modeling sources is more effective than directly modeling the music mixture. Code and models are available at https://github.com/XZWY/MSLDM. Demos are available at https://xzwy.github.io/MSLDMDemo/.
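
The data path described, per-source VAE latents concatenated into one joint latent for diffusion, can be sketched schematically; the encode(...).sample() interface and the channel-wise concatenation axis are assumptions, not the released model's API:

```python
# Schematic sketch of building MSLDM's joint latent: one shared VAE encodes
# each stem, and the resulting "source latents" are concatenated.
import torch

def make_joint_latent(stems, vae):
    # stems: list of waveform tensors, one per source (piano, drums, bass, guitar)
    latents = [vae.encode(s).sample() for s in stems]  # assumed VAE interface
    return torch.cat(latents, dim=1)  # (B, n_sources * C_latent, T_latent)
```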

en eess.AS, cs.LG
arXiv Open Access 2024
Arabic Music Classification and Generation using Deep Learning

Mohamed Elshaarawy, Ashrakat Saeed, Mariam Sheta et al.

This paper proposes a machine learning approach for classifying classical and new Egyptian music by composer and generating new similar music. The proposed system utilizes a convolutional neural network (CNN) for classification and a CNN autoencoder for generation. The dataset used in this project consists of new and classical Egyptian music pieces composed by different composers. To classify the music by composer, each sample is normalized and transformed into a mel spectrogram. The CNN model is trained on the dataset using the mel spectrograms as input features and the composer labels as output classes. The model achieves 81.4% accuracy in classifying the music by composer, demonstrating the effectiveness of the proposed approach. To generate new music similar to the original pieces, a CNN autoencoder is trained on a similar dataset. The model is trained to encode the mel spectrograms of the original pieces into a lower-dimensional latent space and then decode them back into the original mel spectrogram. The generated music is produced by sampling from the latent space and decoding the samples back into mel spectrograms, which are then transformed into audio. In conclusion, the proposed system provides a promising approach to classifying and generating classical Egyptian music, which can be applied in various musical applications, such as music recommendation systems, music production, and music education.
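
The preprocessing step the paper describes, normalization followed by mel-spectrogram conversion, is a standard librosa recipe; the parameter values below are common defaults, not necessarily the paper's settings:

```python
# Hedged sketch of the normalize-then-mel-spectrogram preprocessing.
import librosa
import numpy as np

def to_mel(path, sr=22050, n_mels=128):
    y, _ = librosa.load(path, sr=sr)
    y = y / (np.max(np.abs(y)) + 1e-9)           # peak-normalize the waveform
    mel = librosa.feature.melspectrogram(y=y, sr=sr, n_mels=n_mels)
    return librosa.power_to_db(mel, ref=np.max)  # log scale for the CNN input
```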

en cs.SD, cs.AI
DOAJ Open Access 2023
At the Crossroads between Opera and Oratorio: Biblical Themes and 19th Century French Opera

Noémi KARÁCSONY, Mădălina Dana RUCSANDA

The present study aims to investigate the delicate border between oratorios and operas based on biblical subjects, focusing on French music. A brief analysis of the period prior to the 19th century reveals that the oratorio was not one of the most popular genres in France, which allowed the genesis of various other genres inspired by biblical subjects. The oratorio enjoyed a brief revival in the 18th century, leading in the 19th century to the composition of biblical operas and the dawn of genres that sit at the crossroads between opera and oratorio: mystère, drame sacré, or légende sacrée, represented by works of Jules Massenet. Although traditionally referred to as oratorios, these works allowed for staged representation, due to the dramatic qualities of their librettos, which in turn allowed the musical discourse to depart from the sobriety of the oratorio and to include certain features characteristic of opera. Gradually, biblical subjects began to be tackled in operas as well, as can be seen in the works of Camille Saint-Saëns and Jules Massenet, the two composers on whose works the present study mainly focuses. At the same time, biblical themes can be related to musical orientalism, but also to the fin de siècle decadent aesthetic (through their exploration of such themes as the opposition between sacred and erotic love), serving as a mirror for the political, social, and religious context of the Third Republic.

DOAJ Open Access 2023
Examining the Lyrical Content and Musical Features of a Crowd-Sourced, Australian Pandemic Playlist

Kaila C. Putter, Amanda E. Krause, Dianna Vidas et al.

A recent examination of charting popular music before and during the first six months of the COVID-19 pandemic indicated that popular music lyrics during turbulent socioeconomic conditions had more negatively valenced words, providing support for the Environmental Security Hypothesis. However, the use of chart data alone cannot speak to what individuals are listening to against the backdrop of COVID-19. The present mixed-methods case study examined a crowd-sourced playlist (n = 55 songs) created by Australian residents during an extended lockdown in September–October 2021. Qualitative analysis of the lyrics demonstrated that the selected music expresses a closeness to others, references to the current situation (such as illness and staying at home), negative emotions (including confusion and fear), a positive outlook (expressing perseverance and a will to survive), and a changing sense of time. Quantitative analyses compared the "pandemic playlist" songs to charting songs during the first six months of the pandemic in 2020 and the same period in 2021 (n = 28 and 26 songs, respectively) with regard to their musical features (using scraped Spotify API data) and lyrical content (using Diction). The findings indicated that the songs included in the "pandemic playlist" differed significantly from the charting songs in 2020 and 2021 by being higher in energy (relative to 2020 and 2021) and less acoustic (relative to 2021). Additionally, the lyrics of the "pandemic playlist" songs had significantly more positively valenced words. These differences suggest that people believed music selected in response to the pandemic ought to be upbeat and realistic (playlist suggestions), but popular songs were relatively pensive and reflected uncertainty and isolation (chart data). These findings broaden our understanding of music listening behaviors in response to societal stress.
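
For reference, the kind of Spotify audio features compared here (energy, acousticness) can be scraped with the spotipy client roughly as follows; the credentials and track ID are placeholders:

```python
# Hedged sketch of pulling the audio features used in the comparison via the
# spotipy client; replace the placeholder credentials and track ID.
import spotipy
from spotipy.oauth2 import SpotifyClientCredentials

sp = spotipy.Spotify(auth_manager=SpotifyClientCredentials(
    client_id="YOUR_ID", client_secret="YOUR_SECRET"))

track_ids = ["4uLU6hMCjMI75M1A2tKUQC"]             # placeholder track ID
for feats in sp.audio_features(track_ids):
    print(feats["energy"], feats["acousticness"])  # features compared above
```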

Music, Psychology
arXiv Open Access 2023
Choir Transformer: Generating Polyphonic Music with Relative Attention on Transformer

Jiuyang Zhou, Hong Zhu, Xingping Wang

Polyphonic music generation remains a challenging direction because the generated melody and harmony must stay mutually consistent. Most previous studies used RNN-based models. However, RNN-based models struggle to establish relationships between long-distance notes. In this paper, we propose a polyphonic music generation neural network named Choir Transformer (https://github.com/Zjy0401/choir-transformer), with relative positional attention to better model the structure of music. We also propose a music representation suitable for polyphonic music generation. The performance of Choir Transformer surpasses the previous state of the art, improving accuracy by 4.06%. We also measure the harmony metrics of polyphonic music. Experiments show that the harmony metrics are close to the music of Bach. In practical application, the generated melody and rhythm can be adjusted according to the specified input, supporting different styles of music such as folk or pop.
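
Relative positional attention of the kind named here is usually implemented as a learned bias over clipped token-to-token distances added to the attention logits. The single-head sketch below illustrates that general mechanism; it is not the Choir Transformer's exact code:

```python
# Generic single-head self-attention with a learned relative-distance bias
# (Shaw et al.-style), as commonly used in music Transformers.
import torch
import torch.nn as nn
import torch.nn.functional as F

class RelativeSelfAttention(nn.Module):
    def __init__(self, d_model, max_dist=128):
        super().__init__()
        self.qkv = nn.Linear(d_model, 3 * d_model)
        self.scale = d_model ** -0.5
        # one learned bias per clipped relative distance in [-max_dist, max_dist]
        self.rel_bias = nn.Parameter(torch.zeros(2 * max_dist + 1))
        self.max_dist = max_dist

    def forward(self, x):                                 # x: (B, T, d_model)
        q, k, v = self.qkv(x).chunk(3, dim=-1)
        logits = q @ k.transpose(1, 2) * self.scale       # (B, T, T)
        pos = torch.arange(x.size(1), device=x.device)
        rel = (pos[None, :] - pos[:, None]).clamp(-self.max_dist, self.max_dist)
        logits = logits + self.rel_bias[rel + self.max_dist]  # distance bias
        return F.softmax(logits, dim=-1) @ v
```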

en eess.AS, cs.AI
arXiv Open Access 2023
Exploring how a Generative AI interprets music

Gabriela Barenboim, Luigi Del Debbio, Johannes Hirn et al.

We use Google's MusicVAE, a Variational Auto-Encoder with a 512-dimensional latent space that represents a few bars of music, and organize the latent dimensions according to their relevance in describing music. We find that, on average, most latent neurons remain silent when fed real music tracks: we call these "noise" neurons. The remaining few dozen latent neurons that do fire are called "music neurons". We ask which neurons carry the musical information and what kind of musical information they encode, namely something that can be identified as pitch, rhythm or melody. We find that most of the information about pitch and rhythm is encoded in the first few music neurons: the neural network has thus constructed a couple of variables that non-linearly encode many human-defined variables used to describe pitch and rhythm. The concept of melody only seems to show up in independent neurons for longer sequences of music.
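
The "noise neuron" versus "music neuron" split can be approximated with a simple diagnostic: encode a corpus of real tracks and rank the 512 latent dimensions by how much they actually vary. The function below is a schematic reading of that procedure; the variance threshold is an assumption:

```python
# Schematic diagnostic: rank latent dimensions of encoded tracks by variance;
# near-constant dimensions behave like the prior ("noise neurons").
import numpy as np

def rank_latent_dims(latents, eps=1e-3):
    # latents: (n_tracks, 512) array of encoded means from MusicVAE
    var = latents.var(axis=0)
    order = np.argsort(var)[::-1]          # most informative dimensions first
    music = order[var[order] > eps]        # active "music neurons"
    noise = order[var[order] <= eps]       # silent "noise neurons"
    return music, noise
```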

en cs.SD, cs.LG
arXiv Open Access 2023
Emotion-Aligned Contrastive Learning Between Images and Music

Shanti Stewart, Kleanthis Avramidis, Tiantian Feng et al.

Traditional music search engines rely on retrieval methods that match natural language queries with music metadata. There have been increasing efforts to expand retrieval methods to consider the audio characteristics of music itself, using queries of various modalities including text, video, and speech. While most approaches aim to match general music semantics to the input queries, only a few focus on affective qualities. In this work, we address the task of retrieving emotionally-relevant music from image queries by learning an affective alignment between images and music audio. Our approach focuses on learning an emotion-aligned joint embedding space between images and music. This embedding space is learned via emotion-supervised contrastive learning, using an adapted cross-modal version of the SupCon loss. We evaluate the joint embeddings through cross-modal retrieval tasks (image-to-music and music-to-image) based on emotion labels. Furthermore, we investigate the generalizability of the learned music embeddings via automatic music tagging. Our experiments show that the proposed approach successfully aligns images and music, and that the learned embedding space is effective for cross-modal retrieval applications.
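
A cross-modal, emotion-supervised variant of the SupCon loss can be sketched as follows: image and music embeddings sharing an emotion label count as positives. This is a simplified illustration of the idea, not the paper's adapted loss:

```python
# Simplified cross-modal supervised contrastive loss: same-emotion
# image/music pairs are pulled together, all others pushed apart.
import torch
import torch.nn.functional as F

def cross_modal_supcon(img_emb, mus_emb, labels, tau=0.1):
    img = F.normalize(img_emb, dim=1)          # (B, D) image embeddings
    mus = F.normalize(mus_emb, dim=1)          # (B, D) music embeddings
    logits = img @ mus.t() / tau               # pairwise similarities
    pos = (labels[:, None] == labels[None, :]).float()  # same-emotion pairs
    log_prob = logits - torch.logsumexp(logits, dim=1, keepdim=True)
    # average log-likelihood over each image's positive music clips
    return -(pos * log_prob).sum(1).div(pos.sum(1).clamp(min=1)).mean()
```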

en cs.MM, cs.SD
arXiv Open Access 2023
EmoGen: Eliminating Subjective Bias in Emotional Music Generation

Chenfei Kang, Peiling Lu, Botao Yu et al.

Music is used to convey emotions, and thus generating emotional music is important in automatic music generation. Previous work on emotional music generation directly uses annotated emotion labels as control signals, which suffers from subjective bias: different people may annotate different emotions for the same music, and one person may feel different emotions in different situations. Therefore, directly mapping emotion labels to music sequences in an end-to-end way would confuse the learning process and hinder the model from generating music with general emotions. In this paper, we propose EmoGen, an emotional music generation system that leverages a set of emotion-related music attributes as the bridge between emotion and music, and divides the generation into two stages: emotion-to-attribute mapping with supervised clustering, and attribute-to-music generation with self-supervised learning. Both stages are beneficial: in the first stage, the attribute values around the clustering center represent the general emotions of these samples, which helps eliminate the impact of the subjective bias of emotion labels; in the second stage, the generation is completely disentangled from emotion labels and is thus free from the subjective bias. Both subjective and objective evaluations show that EmoGen outperforms previous methods on emotion control accuracy and music quality, respectively, demonstrating its superiority in generating emotional music. Music samples generated by EmoGen are available at https://ai-muzic.github.io/emogen/, and the code is available at https://github.com/microsoft/muzic/.
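
Stage one, as described, reduces an emotion label to the attribute values at its cluster center; a schematic version, using a plain per-label mean as the "center," might look like this:

```python
# Schematic stage-one mapping: emotion label -> attribute vector at the
# center of that emotion's cluster of training songs. The per-label mean
# stands in for the paper's supervised clustering.
import numpy as np

def emotion_to_attributes(emotion, attrs, labels):
    # attrs: (n_songs, n_attrs) emotion-related attribute values
    # labels: (n_songs,) annotated emotion per song
    center = attrs[labels == emotion].mean(axis=0)  # cluster center
    return center  # fed to the attribute-to-music generator (stage two)
```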

en cs.SD, cs.AI
arXiv Open Access 2022
Psychologically-Inspired Music Recommendation System

Danila Rozhevskii, Jie Zhu, Boyuan Zhao

In the last few years, automated recommendation systems have been a major focus in the music field, where companies such as Spotify, Amazon, and Apple are competing in the ability to generate the most personalized music suggestions for their users. One of the challenges developers still fail to tackle is taking into account the psychological and emotional aspects of the music. Our goal is to find a way to integrate users' personal traits and their current emotional state into a single music recommendation system (MRS) with both collaborative and content-based filtering. We seek to relate the personality and the current emotional state of the listener to the audio features in order to build an emotion-aware MRS. We compare the results both quantitatively and qualitatively to the output of a traditional MRS based on Spotify API data to understand whether our advancements make a significant impact on the quality of music recommendations.
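
A hybrid of the two filtering signals could be blended as a weighted score; the content term below, which compares audio features to a personality/mood profile, and the weight alpha are assumptions for illustration:

```python
# Illustrative hybrid recommender score: weighted blend of a collaborative-
# filtering affinity and a content match against the listener's profile.
import numpy as np

def hybrid_score(cf_score, track_feats, user_profile, alpha=0.6):
    # cf_score: collaborative-filtering affinity for (user, track)
    # track_feats / user_profile: vectors over e.g. energy, valence, tempo,
    # assumed scaled to [0, 1] so the distance term stays in [0, 1]
    dist = np.linalg.norm(track_feats - user_profile) / np.sqrt(len(track_feats))
    return alpha * cf_score + (1 - alpha) * (1.0 - dist)
```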

en cs.IR, cs.AI
arXiv Open Access 2022
Volume-Independent Music Matching by Frequency Spectrum Comparison

Anthony Lee

Often, I hear a piece of music and wonder what the name of the piece is. There are applications, such as the Shazam app, that provide music matching. However, the limitation of those apps is that the same piece performed by the same musician cannot be identified if it is not the same recording: Shazam identifies the recording, not the music, because it matches the variation in volume rather than the frequencies of the sound. This research attempts to match music the way humans understand it: by the frequency spectrum of the music, not its volume variation. Essentially, the idea is to precompute the frequency spectra of all the music in the database, then take the unknown piece and try to match its frequency spectrum against every segment of every piece in the database. I did this by sliding a window in 0.1-second steps and computing the error as follows: take the absolute value of each spectrum, normalize the audio, subtract the normalized arrays, and take the sum of absolute differences. The segment that shows the least error is considered the candidate for the match. Matching performance proved to depend on the complexity of the music. Matching simple music, such as single-note pieces, was successful. However, more complex pieces, such as Chopin's Ballade No. 4, were not: the algorithm could not produce low error values for any of the music in the database. I suspect that this has to do with having too many notes: mismatches in the higher harmonics added up to a significant amount of error, which swamped the calculation.
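
The matching loop described is concrete enough to sketch directly in NumPy, simplified here to comparing one normalized query spectrum against per-frame spectra stored at 0.1-second resolution:

```python
# Hedged NumPy sketch of the described search: normalize spectra, slide
# through each track in 0.1 s frames, and score segments by the sum of
# absolute differences; the lowest-error segment is the match candidate.
import numpy as np

def best_match(query_spec, db_specs):
    # query_spec: (F,) magnitude spectrum of the unknown clip
    # db_specs: dict name -> (n_frames, F) precomputed spectra, 0.1 s per frame
    q = np.abs(query_spec)
    q = q / (q.sum() + 1e-9)                   # volume-independent normalization
    best_name, best_err = None, np.inf
    for name, spec in db_specs.items():
        for i, frame in enumerate(np.abs(spec)):
            s = frame / (frame.sum() + 1e-9)
            err = np.abs(q - s).sum()          # sum of absolute differences
            if err < best_err:
                best_name, best_err = f"{name} @ {i * 0.1:.1f}s", err
    return best_name, best_err
```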

en cs.SD, eess.AS
arXiv Open Access 2022
Music Influence Modeling Based on Directed Network Model

Xuan Zhang, Tingdi Ren, Lihong Wang et al.

Studying the history of music may provide a glimpse into the development of human creativity as we examine the evolutionary and revolutionary trends in music and genres. First, a musical influence metric was created to construct a directed network of musical influence. Second, we examined the revolutions and development of musical genres, modeled their similarity, and explored similarities and influences within and between genres. Hierarchical cluster analysis and time series analysis of genres were used to explore the correlations between genres. Finally, network analysis, semantic analysis, and a random forest model are employed to identify the revolutionaries. The above work was applied to Country music to trace and analyze its evolution. In studying the connection between music and the social environment, time series analysis is used to determine the impact of social, political, or technological changes on music.
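
The directed influence network itself is straightforward to build with networkx; the toy edges and the use of PageRank on the reversed graph as an influence score are illustrative assumptions, not the paper's metric:

```python
# Illustrative directed influence network: influencers point to followers,
# and PageRank on the reversed graph credits the influencers.
import networkx as nx

G = nx.DiGraph()
G.add_edges_from([("Hank Williams", "Johnny Cash"),
                  ("Johnny Cash", "Willie Nelson")])  # influencer -> influenced
influence = nx.pagerank(G.reverse())  # credit flows back to influencers
print(sorted(influence.items(), key=lambda kv: -kv[1]))
```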

en stat.AP
DOAJ Open Access 2021
«Føreren kaller!»

Ingrid Skovdahl

The history of the composer Signe Lund (1868–1950) is marked by controversies and paradoxes. Despite a series of successes and a remarkable engagement as a women's rights campaigner and socialist in the early 1900s, she is today probably best known for having been a member of Nasjonal Samling both before and during the occupation years. Nevertheless, much remains unknown about her actual attitudes and actions during this period. One of Signe Lund's best-known compositions in this respect is the piece «Føreren kaller!» («The Leader Calls!»). The piece is mentioned and described in several different articles about Signe Lund, often as the only example of music from her catalogue of works. However, the accounts of the piece (what it is, when it was written, and for what occasion) differ markedly from text to text. Moreover, none of the articles refers to the primary source itself. In this article I present the discovery of «Føreren kaller!» and describe the work's content and context. I further discuss how the discovery changes the historiography of Signe Lund's music and of music during the occupation.

DOAJ Open Access 2021
Musical and Pedagogical Views on the Emperor Nikolai Pavlovich Romanov's Emotional and Value Attitude to Music

Evgenia А. Babak

Based on the study of a wide range of historical sources, the article describes the emotional and value attitude to musical art of Nicholas I, the fifteenth emperor of the Romanov dynasty. Two facets of his life path are taken into account: as a member of the imperial family and as an emperor entrusted with the management of the state from his accession to the throne in 1825. The evolution of his attitude to music and music-making is traced across two distinct stages: from the moment of birth to the coronation, and during the years of his rule of the country. It is noted that at the first stage, the formation of the Grand Duke's musical preferences was influenced by the musical atmosphere in which his childhood passed, his innate musicality, and his early preferences in music itself and in certain types of musical activity. A distinctive feature of the second stage is Nicholas I's introduction of compulsory music education for his children, as a result of the evolution of his emotional and value attitude to the musical art. The emperor's contribution to the development of Orthodox and secular music is emphasized.

DOAJ Open Access 2021
Clinical Training in Music Therapy

Edward A. Roth, Xueyan Hua, Wang Lu et al.

Objective: This paper examines the experiences of music therapy students throughout their clinical training. Three surveys inquired about: 1) the perception from both interns and supervisors as to interns' needs, 2) interns' preparedness, their skills, their priorities when choosing an internship, and whether their expectations for training were met (with comparisons between American and International respondents), and 3) satisfaction with clinical training. Method: Three separate surveys were distributed. The first survey's respondents included pre-interns (n = 19) and internship supervisors (n = 14) who had completed their training in the Great Lakes Region of the United States. The second survey's respondents included American interns (n = 50), American professionals (n = 353), International interns (n = 12), and International professionals (n = 50). Respondents for the third survey included professional music therapists who completed their curriculum in the United States and held the MT-BC professional credential (N = 777). Results: Some differences between interns' and supervisors' perceptions of the interns' needs were found in Survey 1; significant between-group differences in preparedness and strengths/weaknesses were found in Survey 2; and Survey 3 found general satisfaction with training, with some areas respondents felt needed improvement. Conclusions: While there is overall satisfaction with training for music therapists, there are inconsistencies in students' experiences in, and perceptions of, their training.

Music, Psychology

Page 29 of 44,457