Diálogos triangulares na abordagem Orff [Triangular Dialogues in the Orff Approach]
Cassiano Lima da Silveira Santos
This essay, grounded in a literature review, aims to outline the methodological and philosophical dialogues between the Triangular Approach (Abordagem Triangular), systematized by Ana Mae Barbosa, and the Orff-Schulwerk, the music-pedagogical proposal taken here as the central axis of reflection. Drawing on widely disseminated conceptions, we highlight that the vertices of art making, historical contextualization, and critical reading of the world through art find resonance in the ideas of Carl Orff and Gunild Keetman, which are broadened when understood in light of an active, critical, and culturally situated Music Education. We thus argue that the two proposals find mutually fertile ground in aesthetic-musical experiences that are not restricted to the processes of teaching and learning, but expand into the effort to understand what is taught, why it is taught, and how this knowledge connects to the sensitive, critical, and cultural dimensions of the learners.
Musical instruction and study
Music for All: Representational Bias and Cross-Cultural Adaptability of Music Generation Models
Atharva Mehta, Shivam Chauhan, Amirbek Djanibekov
et al.
The advent of Music-Language Models has greatly enhanced the automatic music generation capability of AI systems, but they are also limited in their coverage of the musical genres and cultures of the world. We present a study of the datasets and research papers for music generation and quantify the bias and under-representation of genres. We find that only 5.7% of the total hours of existing music datasets come from non-Western genres, which naturally leads to disparate performance of the models across genres. We then investigate the efficacy of Parameter-Efficient Fine-Tuning (PEFT) techniques in mitigating this bias. Our experiments with two popular models -- MusicGen and Mustango, for two underrepresented non-Western music traditions -- Hindustani Classical and Turkish Makam music, highlight the promises as well as the non-triviality of cross-genre adaptation of music through small datasets, implying the need for more equitable baseline music-language models that are designed for cross-cultural transfer learning.
Instructing Large Language Models for Low-Resource Languages: A Systematic Study for Basque
Oscar Sainz, Naiara Perez, Julen Etxaniz
et al.
Instructing language models with user intent requires large instruction datasets, which are only available for a limited set of languages. In this paper, we explore alternatives to conventional instruction adaptation pipelines in low-resource scenarios. We assume a realistic scenario for low-resource languages, where only the following are available: corpora in the target language, existing open-weight multilingual base and instructed backbone LLMs, and synthetically generated instructions sampled from the instructed backbone. We present a comprehensive set of experiments for Basque that systematically study different combinations of these components evaluated on benchmarks and human preferences from 1,680 participants. Our conclusions show that target language corpora are essential, with synthetic instructions yielding robust models, and, most importantly, that using as backbone an instruction-tuned model outperforms using a base non-instructed model. Scaling up to Llama 3.1 Instruct 70B as backbone, our model comes near frontier models of much larger sizes for Basque, without using any Basque instructions. We release code, models, instruction datasets, and human preferences to support full reproducibility in future research on low-resource language adaptation. https://github.com/hitz-zentroa/latxa-instruct
Consonance in music -- the Pythagorean approach revisited
Jan Cichowlas, Paweł Dłotko, Marek Kuś
et al.
The Pythagorean school attributed consonance in music to simplicity of frequency ratios between musical tones. In the last two centuries, the consonance curves developed by Helmholtz, Plomp and Levelt shifted focus to psycho-acoustic considerations in perceiving consonances. The appearance of peaks in these curves at the ratios considered by the Pythagorean school, ratios that arose from an attempt to understand the world through nice mathematical proportions, remained a curiosity. This paper addresses this curiosity by describing a mathematical model of musical sound, along with a mathematical definition of consonance. First, we define pure, complex and mixed tones as mathematical models of musical sound. By a sequence of numerical experiments and analytic calculations, we show that continuous cosine similarity, abbreviated as cosim, applied to these models quantifies the elusive concept of consonance as a frequency ratio which gives a local maximum of the cosim function. We prove that these maxima occur at the ratios considered as consonant in classical music theory. Moreover, we provide a simple explanation of why the number of musical intervals considered as consonant by musicians is finite, but has been increasing over the centuries. Specifically, our formulas show that the number of consonant intervals changes with the depth of the tone (the number of harmonics present).
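The idea of measuring consonance by the overlap between two tones can be illustrated with a minimal sketch. It is not the paper's definition of cosim: it assumes an idealized case in which every tone has `depth` harmonics of equal amplitude and the similarity is averaged over infinite time, so that two sinusoids overlap only when their frequencies coincide exactly. Under those assumptions the similarity reduces to counting shared harmonics. The function names and the equal-amplitude model are hypothetical choices made for illustration.

```python
from fractions import Fraction


def cosim(ratio: Fraction, depth: int) -> Fraction:
    """Idealized cosine similarity of two equal-amplitude complex tones.

    Tone 1 has harmonics k*f and tone 2 has harmonics m*ratio*f, for
    k, m = 1..depth. With infinite-time averaging (an assumption of this
    sketch), only coinciding harmonics contribute, so the similarity is
    the number of shared harmonics divided by depth.
    """
    p, q = ratio.numerator, ratio.denominator
    shared = sum(
        1
        for k in range(1, depth + 1)
        for m in range(1, depth + 1)
        if k * q == m * p  # k*f == m*(p/q)*f
    )
    return Fraction(shared, depth)


def consonant_ratios(depth: int) -> list:
    """Ratios between unison and octave with nonzero idealized similarity."""
    candidates = {
        Fraction(p, q)
        for q in range(1, depth + 1)
        for p in range(q, min(2 * q, depth) + 1)
    }
    return sorted(r for r in candidates if cosim(r, depth) > 0)


# Deeper tones (more harmonics) admit more ratios with nonzero similarity.
for depth in (2, 3, 4, 6):
    print(depth, [str(r) for r in consonant_ratios(depth)])
```

In this toy model the set of ratios with nonzero similarity is finite for any fixed depth and grows as harmonics are added, which mirrors the abstract's remark that the number of consonant intervals depends on the depth of the tone.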
Large Language Models' Internal Perception of Symbolic Music
Andrew Shin, Kunitake Kaneko
Large language models (LLMs) excel at modeling relationships between strings in natural language and have shown promise in extending to other symbolic domains like coding or mathematics. However, the extent to which they implicitly model symbolic music remains underexplored. This paper investigates how LLMs represent musical concepts by generating symbolic music data from textual prompts describing combinations of genres and styles, and evaluating their utility through recognition and generation tasks. We produce a dataset of LLM-generated MIDI files without relying on explicit musical training. We then train neural networks entirely on this LLM-generated MIDI dataset and perform genre and style classification as well as melody completion, benchmarking their performance against established models. Our results demonstrate that LLMs can infer rudimentary musical structures and temporal relationships from text, highlighting both their potential to implicitly encode musical patterns and their limitations due to a lack of explicit musical context, shedding light on their generative capabilities for symbolic music.
Detecting Musical Deepfakes
Nick Sunday
The proliferation of Text-to-Music (TTM) platforms has democratized music creation, enabling users to effortlessly generate high-quality compositions. However, this innovation also presents new challenges to musicians and the broader music industry. This study investigates the detection of AI-generated songs using the FakeMusicCaps dataset by classifying audio as either deepfake or human. To simulate real-world adversarial conditions, tempo stretching and pitch shifting were applied to the dataset. Mel spectrograms were generated from the modified audio, then used to train and evaluate a convolutional neural network. In addition to presenting technical results, this work explores the ethical and societal implications of TTM platforms, arguing that carefully designed detection systems are essential to both protecting artists and unlocking the positive potential of generative AI in music.
Segment Transformer: AI-Generated Music Detection via Music Structural Analysis
Yumin Kim, Seonghyeon Go
Audio and music generation systems have advanced remarkably in the music information retrieval (MIR) research field. The advancement of these technologies raises copyright concerns, as ownership and authorship of AI-generated music (AIGM) remain unclear. It can also be difficult to determine clearly whether a piece was generated by AI or composed by humans. To address these challenges, we aim to improve the accuracy of AIGM detection by analyzing the structural patterns of music segments. Specifically, to extract musical features from short audio clips, we integrated various pre-trained models, including self-supervised learning (SSL) models and an audio effect encoder, each within our suggested transformer-based framework. Furthermore, for long audio, we developed a segment transformer that divides music into segments and learns inter-segment relationships. We used the FakeMusicCaps and SONICS datasets, achieving high accuracy in both the short-audio and full-audio detection experiments. These findings suggest that integrating segment-level musical features into long-range temporal analysis can effectively enhance both the performance and robustness of AIGM detection systems.
“Hadī Māši Nāyḍa, Hadī Nahḍa”: música, identidad y heterodoxia en los umbrales de un paradigma [“Hadī Māši Nāyḍa, Hadī Nahḍa”: Music, Identity, and Heterodoxy on the Threshold of a Paradigm]
Ahmed Balghzal
This study seeks to determine the role that Moroccan youth music can play in constructing a dissident ideal of identity. Starting from an analysis of the musical discourse of several countercultural groups, we seek to characterize the role of their subversive formulations of notions such as belonging and identity in opening fissures in the canonical construction of the cultural paradigm as a whole. We aim to conclude that this marginal, erosive work opens prospects for change across the entire Moroccan cultural model, which is marked by chronic pathologies and imbalances that prevent its regeneration and metamorphosis.
Music and books on Music, Musical instruction and study
Reanimating Sacco and Vanzetti: Historicity in Armand Gatti’s Play Chant Public Devant Deux Chaises Électriques
Ece Yassıtepe Ayyıldız
Armand Gatti, who has been commemorated by many cultural events in France this year for the centenary of his birth, is one of the authentic political writers of 20th-century French theater. He used many historical and social events as subjects for his plays: the 68 Generation, the Vietnam War, the Guatemalan Civil War, Franco's Spain and many other events that resonated not only in their own countries but all over the world. Beyond these, another subject, which Gatti did not witness personally but which his father always narrated to him, has a significant place. In his play Chant public devant deux chaises électriques, he treated the tragic story of the electric chair execution of Sacco & Vanzetti, two Italian workers who had immigrated to America in the early 1900s. In this play, Gatti built a bridge between the past, the present and the future by using various theatrical techniques. At the same time, the real audience of the play experiences a play within a different play; some of the actors appear as spectator-actors. These spectator-actors on stage identify with the real persons (and witnesses) in the Sacco & Vanzetti story by establishing a connection with their own stories, thus experiencing a kind of “catharsis” as seen in tragedies. The main question of this play, assumed to be staged simultaneously in five different cities, is “Will Sacco & Vanzetti really be executed again in this hall tonight?” Gatti’s play, reminiscent of Alain Decaux’s The Rosenbergs Must Not Die, alludes to the Rosenbergs, even though the execution of Sacco & Vanzetti preceded theirs. In this study, we will not only examine the innovations Gatti brought to theater with his play Chant public devant deux chaises électriques, but also indicate how the author handles historicity in theater.
Musical instruction and study, Arts in general
"Female Body" as a Biopolitical Concept and Nudity in Performance Art as a Dissident Attitude
Ali Ömür Ulusoy
Throughout history, codings of the body have played a strategic role in the construction of the individual. Foucault therefore constructs the body as a field of "biopolitical reality" by saying that the body is "a place of recording events". The body is suppressed through all tangible and intangible institutions such as religion, state and moral codes. Through the body, a dichotomic universe based on opposites such as beautiful-ugly, thin-fat is created. As a result, a concept within a concept, a field of oppression within oppression is created. On the other hand, through the panoptical situations created, the body under pressure is also kept under surveillance. When the problem is analysed in terms of the "female body", the severity of oppression increases exponentially. Performance art, which emerged after the 1960s, is based on the rebellious subject as both an artistic stance and a form of social existence. What most distinguishes it from all the other arts is that the artist themself can also turn into a work of art. In this sense, from the first examples of performance to the present day, the artist, with their own body, has been the performance itself or part of it as an ontological entity, an epistemological subject and an aesthetic object. Therefore, the body can be positioned as the main production tool of performance art. This study aims to find the point where biopolitics and performance art collide and to expose the relationship between "female body-nudity-protest" through a sample of works from the 1970s.
Musical instruction and study, Arts in general
La traviata: os desafios interpretativos de uma produção operística no Brasil do Século XXI [La traviata: The Interpretive Challenges of an Opera Production in 21st-Century Brazil]
Luciano de Freitas Camargo, Rodolfo García Vázquez
This article presents applied results of postdoctoral research in the area of musical performance, specifically in the field of operatic interpretation, in the context of a production of Giuseppe Verdi's opera La traviata staged in August 2023 in the city of São Paulo. The study addresses elements of opera conducting and its technical specificities, as well as interpretive questions of artistic direction, including the overall conception of the production and the joint work with the creative team, which comprises the stage director, set design, costumes, and lighting. It discusses in detail the interpretive decisions and choices, both musical and scenic, involved in producing today an opera written in the 19th century.
Musical instruction and study
AI TrackMate: Finally, Someone Who Will Give Your Music More Than Just "Sounds Great!"
Yi-Lin Jiang, Chia-Ho Hsiung, Yen-Tung Yeh
et al.
The rise of "bedroom producers" has democratized music creation, while challenging producers to objectively evaluate their work. To address this, we present AI TrackMate, an LLM-based music chatbot designed to provide constructive feedback on music productions. By combining LLMs' inherent musical knowledge with direct audio track analysis, AI TrackMate offers production-specific insights, distinguishing it from text-only approaches. Our framework integrates a Music Analysis Module, an LLM-Readable Music Report, and Music Production-Oriented Feedback Instruction, creating a plug-and-play, training-free system compatible with various LLMs and adaptable to future advancements. We demonstrate AI TrackMate's capabilities through an interactive web interface and present findings from a pilot study with a music producer. By bridging AI capabilities with the needs of independent producers, AI TrackMate offers on-demand analytical feedback, potentially supporting the creative process and skill development in music production. This system addresses the growing demand for objective self-assessment tools in the evolving landscape of independent music production.
Chain-of-Instructions: Compositional Instruction Tuning on Large Language Models
Shirley Anugrah Hayati, Taehee Jung, Tristan Bodding-Long
et al.
Fine-tuning large language models (LLMs) with a collection of large and diverse instructions has improved the model's generalization to different tasks, even for unseen tasks. However, most existing instruction datasets include only single instructions, and models tuned on them struggle to follow complex instructions composed of multiple subtasks. In this work, we propose a novel concept of compositional instructions called chain-of-instructions (CoI), where the output of one instruction becomes an input for the next like a chain. Unlike the conventional practice of solving single instruction tasks, our proposed method encourages a model to solve each subtask step by step until the final answer is reached. CoI-tuning (i.e., fine-tuning with CoI instructions) improves the model's ability to handle instructions composed of multiple subtasks as well as unseen composite tasks such as multilingual summarization. Overall, our study finds that simple CoI tuning of existing instruction data can provide consistent generalization to solve more complex, unseen, and longer chains of instructions.
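The chaining idea, in which each instruction's output feeds the next, can be pictured with a minimal sketch. It shows only the data flow of a composed instruction, not the authors' tuning method or dataset format. The two toy subtasks and the `run_chain` helper are hypothetical stand-ins for model calls.

```python
from typing import Callable, List

# An "instruction" is modeled here as a plain text-to-text function.
Instruction = Callable[[str], str]


def first_sentence(text: str) -> str:
    """Toy 'summarize' subtask: keep only the first sentence."""
    return text.split(". ")[0].rstrip(".") + "."


def shout(text: str) -> str:
    """Toy 'restyle' subtask: upper-case the text."""
    return text.upper()


def run_chain(instructions: List[Instruction], source: str) -> List[str]:
    """Apply instructions in order, feeding each output to the next.

    Returns every intermediate output so the step-by-step path to the
    final answer stays visible.
    """
    outputs = []
    current = source
    for instruction in instructions:
        current = instruction(current)
        outputs.append(current)
    return outputs


steps = run_chain([first_sentence, shout], "Bubbles can sing. Water helps.")
for i, step in enumerate(steps, start=1):
    print(f"step {i}: {step}")
```

Keeping the intermediate outputs, rather than only the final string, reflects the step-by-step framing in the abstract: each subtask result is an explicit target on the way to the final answer.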
Eksplorasi Visual dan Koreografi dalam Film “Anima” pada Album Komposisi Musik Thom Yorke [Visual and Choreographic Exploration in the Film “Anima” from Thom Yorke's Music Album]
Rini Utami
“Anima” is a thought-provoking 15-minute visual work that has become a talking point in the music world. The work originates in the music of Thom Yorke's solo album, packaged alongside a film that uses dance as its visual medium. Thom Yorke partnered with director Paul Thomas Anderson to create a work that evokes collective contemporary anxiety, bringing together philosophy, music, and spectacular choreography. This research aims to explain and analyze the work using a descriptive qualitative method with Ferdinand De Saussure's semiotic approach. It focuses on reading the signs in the visual album from the perspective of dance, examined through a linguistic approach. Through the short film “Anima,” the creator seeks to speak about anxiety through compelling visuals that offer a new atmosphere in the realm of new media art. In this film the arts merge more fully; that is, the boundaries between performing arts, visual arts, and recorded media arts become very thin. The findings take the form of interpretations of the movement symbols in the choreography, such as a deep anxiety about the current state of society and an attempt to raise society's awareness.
Music, Musical instruction and study
Ler ou não ler: eis a questão? [To Read or Not to Read: Is That the Question?]
Silvia Cordeiro Nassif
This essay discusses the teaching and learning of conventional musical notation. Taking as its starting point some positions held by scholars in the field on whether or not musical writing should be considered necessary knowledge, it seeks to deepen the arguments presented. With the aim of reaching some conclusions, however provisional, the text is guided by two central questions: (1) What is the actual function of musical notation for music and in educational processes? (2) How can the learning of musical writing be thought of from a developmental point of view? Drawing on the philosophy of culture of the Bakhtin Circle and Vygotsky's cultural-historical psychology as theoretical bases, it raises and analyzes some points that may serve as guideposts for organizing pedagogical work. Among the conclusions presented, the following stand out: the notion that conventional notation fulfills a role that goes beyond recording sounds and enables an analytical view of music; the recognition of the limits of writing not only for music of oral tradition but also for European concert music itself; the importance of practical musical experiences in children's acquisition of notation; and the different forms of musical recording that arise in the course of psychic development. The final considerations point to the fact that defending or opposing the teaching of notation only makes sense when the contextual teaching situations are analyzed as a whole.
Musical instruction and study
The Arrow of Time in Music -- Revisiting the Temporal Structure of Music with Distinguishability and Unique Orientability as the Anchor Point
Qi Xu
Driven by the term "the arrow of time" as a general topic, the article develops a musical discussion by referring to the etymological origins of the term: philosophy (epistemology) and physics (thermodynamics). In particular, the article explores two specific conditions, distinguishability and unique orientability, from which it derives respective musical propositions and case studies. For the distinguishability condition, the article focuses on "recurrence" in music and tries to interpret Bach's Christmas Oratorio from the perspective of "birth/resurrection". For the unique orientability condition, the article discusses the process of delaying the climax, thereby proposing an "AB-AAB left-replication" model, which implies an organicist view by treating the temporal structure of music (e.g. form) as the product of a dynamic process: organic growth.
Musical creativity enabled by nonlinear oscillations of a bubble in water
Ivan S. Maksymov
Producing original musical outcomes and arranging existing ones is an art that takes years of learning and practice to master. Yet, despite the constant advances in the field of AI-powered musical creativity, production of quality musical outcomes remains a prerogative of humans. Here we demonstrate that a single bubble in water can be used to produce creative musical outcomes when it nonlinearly oscillates under an acoustic pressure signal that encodes a piece of classical music. The audio signal of the response of the bubble resembles an electric guitar version of the original composition. We suggest, and provide plausible theoretical supporting arguments, that this property of the bubble can be used to create physics-inspired AI systems capable of simulating human creativity in arrangement and composition of music.
StemGen: A music generation model that listens
Julian D. Parker, Janne Spijkervet, Katerina Kosta
et al.
End-to-end generation of musical audio using deep learning techniques has seen an explosion of activity recently. However, most models concentrate on generating fully mixed music in response to abstract conditioning information. In this work, we present an alternative paradigm for producing music generation models that can listen and respond to musical context. We describe how such a model can be constructed using a non-autoregressive, transformer-based model architecture and present a number of novel architectural and sampling improvements. We train the described architecture on both an open-source and a proprietary dataset. We evaluate the produced models using standard quality metrics and a new approach based on music information retrieval descriptors. The resulting model reaches the audio quality of state-of-the-art text-conditioned models, as well as exhibiting strong musical coherence with its context.
Instructions as Backdoors: Backdoor Vulnerabilities of Instruction Tuning for Large Language Models
Jiashu Xu, Mingyu Derek Ma, Fei Wang
et al.
We investigate security concerns of the emergent instruction tuning paradigm, in which models are trained on crowdsourced datasets with task instructions to achieve superior performance. Our studies demonstrate that an attacker can inject backdoors by issuing very few malicious instructions (~1000 tokens) and control model behavior through data poisoning, without even the need to modify data instances or labels themselves. Through such instruction attacks, the attacker can achieve over 90% attack success rate across four commonly used NLP datasets. As an empirical study on instruction attacks, we systematically evaluated unique perspectives of instruction attacks, such as poison transfer, where poisoned models can transfer to 15 diverse generative datasets in a zero-shot manner; instruction transfer, where attackers can directly apply poisoned instructions on many other datasets; and poison resistance to continual finetuning. Lastly, we show that RLHF and clean demonstrations might mitigate such backdoors to some degree. These findings highlight the need for more robust defenses against poisoning attacks in instruction-tuning models and underscore the importance of ensuring data quality in instruction crowdsourcing.
Human-centered design in acoustics education for undergraduate music majors.
Minsik Choi, M. Kapur
An acoustics course for undergraduate music majors should take advantage of the natural affinity between acoustic science and musical practice. In this study, current students and recent graduates of one university's music school were surveyed with the goal of assessing their unique needs in an acoustics curriculum. The results of the survey are reported, and several curriculum recommendations are provided based on the principles of human-centered design. In particular, the acoustics course can harness musicians' intuitive understanding of sound by incorporating musical instruments into classroom demonstrations. Also, acoustics instructors should strive to introduce students to acoustical software, which is also used in the music industry. Finally, the survey findings suggest that the contemporary shift toward active learning and technology-based instruction in acoustics pedagogy is beneficial to music students.