C. Krumhansl
Results for "Music"
Showing 20 of ~609920 results · from arXiv, DOAJ, Semantic Scholar
Eetu Tunturi, David Diaz-Guerra, Archontis Politis et al.
Music source separation is the task of separating a mixture of instruments into constituent tracks. Music source separation models are typically trained using only audio data, although additional information can be used to improve the model's separation capability. In this paper, we propose two ways of using musical scores to aid music source separation: a score-informed model where the score is concatenated with the magnitude spectrogram of the audio mixture as the input of the model, and a model where we use only the score to calculate the separation mask. We train our models on synthetic data in the SynthSOD dataset and evaluate our methods on the URMP and Aalto anechoic orchestra datasets, comprised of real recordings. The score-informed model improves separation results compared to a baseline approach, but struggles to generalize from synthetic to real data, whereas the score-only model shows a clear improvement in synthetic-to-real generalization.
Fathinah Izzati, Xinyue Li, Gus Xia
We propose Expotion (Facial Expression and Motion Control for Multimodal Music Generation), a generative model leveraging multimodal visual controls - specifically, human facial expressions and upper-body motion - as well as text prompts to produce expressive and temporally accurate music. We adopt parameter-efficient fine-tuning (PEFT) on the pretrained text-to-music generation model, enabling fine-grained adaptation to the multimodal controls using a small dataset. To ensure precise synchronization between video and music, we introduce a temporal smoothing strategy to align multiple modalities. Experiments demonstrate that integrating visual features alongside textual descriptions enhances the overall quality of generated music in terms of musicality, creativity, beat-tempo consistency, temporal alignment with the video, and text adherence, surpassing both proposed baselines and existing state-of-the-art video-to-music generation models. Additionally, we introduce a novel dataset consisting of 7 hours of synchronized video recordings capturing expressive facial and upper-body gestures aligned with corresponding music, providing significant potential for future research in multimodal and interactive music generation.
Dorien Herremans, Abhinaba Roy
Recent advances in generative AI for music have achieved remarkable fidelity and stylistic diversity, yet these systems often fail to align with nuanced human preferences due to the specific loss functions they use. This paper advocates for the systematic application of preference alignment techniques to music generation, addressing the fundamental gap between computational optimization and human musical appreciation. Drawing on recent breakthroughs including MusicRL's large-scale preference learning, multi-preference alignment frameworks like diffusion-based preference optimization in DiffRhythm+, and inference-time optimization techniques like Text2midi-InferAlign, we discuss how these techniques can address music's unique challenges: temporal coherence, harmonic consistency, and subjective quality assessment. We identify key research challenges in preference modelling, including scalability to long-form compositions and reliability, among others. Looking forward, we envision preference-aligned music generation enabling transformative applications in interactive composition tools and personalized music services. This work calls for sustained interdisciplinary research combining advances in machine learning and music theory to create music AI systems that truly serve human creative and experiential needs.
Manvi Agarwal, Changhong Wang, Gaël Richard
Music generated by deep learning methods often suffers from a lack of coherence and long-term organization. Yet, multi-scale hierarchical structure is a distinctive feature of music signals. To leverage this information, we propose a structure-informed positional encoding framework for music generation with Transformers. We design three variants in terms of absolute, relative and non-stationary positional information. We comprehensively test them on two symbolic music generation tasks: next-timestep prediction and accompaniment generation. As a comparison, we choose multiple baselines from the literature and demonstrate the merits of our methods using several musically-motivated evaluation metrics. In particular, our methods improve the melodic and structural consistency of the generated pieces.
Junda Wu, Zachary Novack, Amit Namburi et al.
Existing music captioning methods are limited to generating concise global descriptions of short music clips, which fail to capture fine-grained musical characteristics and time-aware musical changes. To address these limitations, we propose FUTGA, a model equipped with fine-grained music understanding capabilities through learning from generative augmentation with temporal compositions. We leverage existing music caption datasets and large language models (LLMs) to synthesize fine-grained music captions with structural descriptions and time boundaries for full-length songs. Augmented by the proposed synthetic dataset, FUTGA is able to identify the music's temporal changes at key transition points and their musical functions, as well as generate detailed descriptions for each music segment. We further introduce a full-length music caption dataset generated by FUTGA, as an augmentation of the MusicCaps and the Song Describer datasets. We evaluate the automatically generated captions on several downstream tasks, including music generation and retrieval. The experiments demonstrate the quality of the generated captions and the better performance in various downstream tasks achieved by the proposed music captioning approach. Our code and datasets can be found at https://huggingface.co/JoshuaW1997/FUTGA.
Teresa Włosowicz
The article analyses the lyrics of selected songs by Lila Downs, paying special attention to the rhetorical appeals of logos, ethos and pathos they contain. The results of the analysis reveal all three types of rhetorical appeals, as well as passages combining two or all three of them. Pathos appeals to emotion, often using specific linguistic means (hyperbole, irony), ethos evokes both historical figures and indigenous people as examples, and logos involves predominantly social criticism.
Isabelle Marc
Nastaran Doregiraei, alireza sayyad
Panopticon surveillance is a concept that was first used by Jeremy Bentham, an English philosopher of the 18th century, in designing a fundamentally new type of prison. Michel Foucault, a French historian and thinker, returns to this concept in the 20th century and explains it as the new surveillance mechanisms of modern societies. Foucault believed that the disciplinary power to be applied should have a permanent, exhaustive and omnipresent means of surveillance that remains invisible while making everything visible. As a result, the new strategy of this power is based on hierarchical surveillance. In fact, Foucault reads the Panopticon as a generalizable model and a plan for the future that explains power relations in the daily life of people on a very macro level. From his point of view, omniscient supervision in Bentham's plan provides a central metaphor to describe the mechanisms of disciplinary power in modern societies. As a result, he believed that modern societies can be called disciplinary societies. During recent decades, this concept has transcended the framework of social studies and entered the field of cinema studies. By reviewing the works of early cinema, film scholars have confirmed the inherent links and affinities between surveillance and the medium of cinema and have paid attention to thematic and structural ways of using it in films. This article seeks to analyze the way Panopticon surveillance is represented in the film The Invisible Trap (1979), borrowing from the ideas of Michel Foucault. The Invisible Trap is a lesser-known film in Iran's pre-revolutionary cinema, which was made by order of the SAVAK security organization in 1979. Following the developments of the year of its production and the reluctance of its actors to talk about it, the film was forgotten and joined the list of abandoned films in the history of Iranian cinema.
The film narrates the true story of identifying, monitoring and arresting one of the spies in the imperial army of the Pahlavi regime, named Ahmad Moqrabi, who had been cooperating with KGB officers for many years. To this end, first, the surveillance mechanisms in the Panopticon are examined, and then, in order to explain the concept of Panopticon surveillance, Foucault's views are borrowed. Then, the concept of surveillance cinema is analyzed from the perspective of important theoreticians of this field, and finally the selected film in Iranian cinema is analyzed. At the end, a conclusion is drawn about the application of Panopticon surveillance in The Invisible Trap (1979). The current research, which was carried out with a descriptive-analytical method and using library sources, argues that through this work SAVAK tried to represent itself as a Panopticon institution that is able to secretly monitor and take care of all citizens in their daily affairs. From this point of view, the film takes powerful steps in the aesthetic drawing of Pahlavi's disciplinary character and displays a completely calculated disciplinary maneuver for its audience: creating a utopian image of the city under domination.
Chun Chia Tai
Taiwanese Indigenous youths utilize social media to assert Indigeneity. However, while egalitarian technologies provide a platform for self-representation, fetishism and multiculturalism might misrepresent their Indigeneity. This study focuses on Ponay’s covers of Mando-pop songs on YouTube to reclaim Indigenous popular music history and challenge Han-centric aesthetics and heteronormativity.
Nikola Petrović
In May 2019, Nikola did his fieldwork in the Kozara-mountain and Potkozarje-area (northwestern Republic of Srpska/Bosnia and Herzegovina), researching the traditional folk dances, songs, music and customs of the Serbian population. During his research, he noticed and was told that many dances and songs were influenced by the Serbian and Croatian population from Croatia (Slavonija- and Banija-region). When he tried to conduct further research on these influences, he was met with nationalistic and chauvinistic remarks against the Croatian and Muslim population in the area and the Republic of Croatia in general, even though the Serbian population was being discussed. The research on that day was nearly cancelled by the informants. This audio-exchange (including materials from the field) should give insights into the (political) mentality of the population after the Yugoslav wars and the dangers of conducting research in this area when talking about the three major ethnic groups.
Olga Yu. Antsyferova, Andrey A. Astvatsaturov, Irina V. Morozova et al.
On December 8, 2022, an international academic conference dedicated to the year 1922 as an important milestone in the history of American and European Modernism was held at the Russian State University for the Humanities (Moscow). The conference aimed at the cultural reconstruction of 1922 and was organized by the Department of Comparative-Historical Literary Studies, Russian State University for the Humanities, and A.M. Gorky Institute of World Literature of the Russian Academy of Sciences. American literary history occupied a prominent place in the program of the conference. The plenary session was devoted to T.S. Eliot, whose poem The Waste Land was published in 1922. Olga Polovinkina (Russian State University for the Humanities, A.M. Gorky Institute of World Literature of the Russian Academy of Sciences) spoke about the importance of the aesthetics of the music hall for the structure of The Waste Land. Igor Shaitanov (Russian State University for the Humanities, The Russian Presidential Academy of National Economy and Public Administration) drew a parallel between The Waste Land and Evgeny Zamyatin's Alatyr’. Vassily Tolmatchoff (Lomonosov Moscow State University) suggested a new interpretation of “The Love Song of J. Alfred Prufrock”. A.A. Astvatsaturov (St. Petersburg State University) considered T.S. Eliot’s modernist work in comparison with the creative attitudes and self-fashioning of Henry Miller. Alexandra Zinovieva (Lomonosov Moscow State University) spoke about Countess Marie Louise Elisabeth Larisch von Moennich, the heroine of T.S. Eliot’s The Waste Land, and her participation in German and Austrian cinematographic projects of the late 1910s and early 1920s. Olga Panova (A.M. Gorky Institute of World Literature of the Russian Academy of Sciences, Lomonosov Moscow State University) reconstructed the year 1922 in the history of the Harlem Renaissance.
Irina Morozova (Russian State University for the Humanities) presented the year 1922 as an important period in the history of American pharaohmania. Olga Antsyferova (St. Petersburg State University) analysed the book 1922: Literature, Culture, Politics (ed. by Jean-Michel Rabaté; Cambridge University Press, 2015), showing how the methodology of historical simultaneity works on the material of culture studies.
Tiara Putri Ramadhani, Dwi Desi Yayi Tarina
The purpose of this study is to examine the copyright protection of songs and music that are used commercially by others under the Copyright Act, and the role of the National Collective Management Institute in the use of copyrighted works. Songs and music are creations that are easily misused illegally, so copyright protection is needed. In Surabaya, for example, karaoke entertainment venues use phonograms without the permission of the songwriters. The study seeks to determine what protection is available against the use of copyrighted works in karaoke venues and the role of the National Collective Management Institute (LMKN) in protecting copyrighted works. The research is normative juridical, drawing on a literature review and a statutory approach. The results indicate that the protection of copyrighted works for commercial use is regulated in the Copyright Law of 2014 and Government Regulation Number 56 of 2021 concerning the Management of Song and/or Music Copyright Royalties. Copyright can be protected in two ways: prevention and enforcement. The Directorate General of Intellectual Property, which is under the Ministry of Law and Human Rights, has established an institution to protect and enforce the laws made by the government that govern the use of copyrighted works. The non-APBN institutions formed are the LMK (Collective Management Institute) and the LMKN (National Collective Management Institute), which are empowered to manage copyrights. The LMK and LMKN are responsible for collecting and distributing royalties. These institutions therefore play a specific role in copyright protection, because they are authorized to administer the economic rights of creators.
Runbang Zhang, Yixiao Zhang, Kai Shao et al.
In this study, we explore the representation mapping from the domain of visual arts to the domain of music, with which we can use visual arts as an effective handle to control music generation. Unlike most studies in multimodal representation learning that are purely data-driven, we adopt an analysis-by-synthesis approach that combines deep music representation learning with user studies. Such an approach enables us to discover interpretable representation mapping without a huge amount of paired data. In particular, we discover that visual-to-music mapping has a nice property similar to equivariance. In other words, we can use various image transformations, such as changing brightness, changing contrast, or style transfer, to control the corresponding transformations in the music domain. In addition, we released the Vis2Mus system as a controllable interface for symbolic music generation.
Chen Zhang, Yi Ren, Kejun Zhang et al.
While deep generative models have empowered music generation, it remains a challenging and under-explored problem to edit an existing musical piece at fine granularity. In this paper, we propose SDMuse, a unified Stochastic Differential Music editing and generation framework, which can not only compose a whole musical piece from scratch, but also modify existing musical pieces in many ways, such as combination, continuation, inpainting, and style transferring. The proposed SDMuse follows a two-stage pipeline to achieve music generation and editing on top of a hybrid representation including pianoroll and MIDI-event. In particular, SDMuse first generates/edits pianoroll by iteratively denoising through a stochastic differential equation (SDE) based on a diffusion model generative prior, and then refines the generated pianoroll and predicts MIDI-event tokens auto-regressively. We evaluate the generated music of our method on ailabs1k7 pop music dataset in terms of quality and controllability on various music editing and generation tasks. Experimental results demonstrate the effectiveness of our proposed stochastic differential music editing and generation process, as well as the hybrid representations.
M. Teresa López Castilla
This article argues for musical whistling as a queer space, since it problematizes the boundaries established for classifying gender on the basis of timbre or tessitura, as happens with the voice. To understand the peculiarity of whistling in relation to the body that produces it, we propose two planes of listening. On the one hand, an acousmatic listening of the whistle offers possibilities that are dislocated in relation to body and gender and difficult to categorize within those boundaries. On the other hand, an audiovisual listening confronts us with the possible corporeal-sonic paradoxes and contradictions that bring into play sexist cultural discourses about musical whistling. We use a theoretical framework based on queer studies to analyze concepts related to the voice (the grain of the voice, the sonic body, the Sapphonic voice) that may well be applicable to the study of musical whistling with regard to the construction of identity and gender. The aim is to understand how gender and sexuality intrude into the listening and production of music, even when the sound (timbre) with which the whistle is built offers possibilities of bodily and/or sexual escape or, on the contrary, can accentuate and problematize them when visually embodied.
Joseph Mulholland
The poem is loosely inspired by Peter Doig’s painting “Music of the Future” and reimagines the night scene that is depicted in the painting. The poetic voice is rooted in a deep sense of place while simultaneously speaking from the outer edges of that place, creating a liminal space through poetic images and narrative.
Raphaëlle Costa de Beauregard
Unfaithfully Yours (Preston Sturges, 1948) is a complex film with an intricate play on the transfer of media characteristics among dissimilar media. In this article, the source of the theoretical approach is mainly Lars Elleström’s (2014); its presentation in Bruhn and Schirrmacher (2022) is also used as a reference. The case study focuses on the performance of two Overtures of operas and a tone poem for a concert. In the film, they are used both in the representation of the concert, and as film music, the main character being an orchestra conductor who eventually imagines secret revenge dramas and becomes the sound recorder of three fiction films. The four modalities of media interact in Sturges’s film with great variety, during the rehearsal of Rossini’s Overture before the concert, or during the concert as we share the conductor’s mind busy satisfying his secret obsessions, and even, after the concert, when the material and sensorial modalities of sounds appear in the screening of objects. The main point of the transfer of these media characteristics is how they interact in the spectator’s mind; moreover, they either arouse emotions that are shared with the characters, or laughter owing to burlesque effects thus created.
Rizki Arisandi, Tri Hartiti
Hypertension is a condition in which systolic blood pressure is equal to or higher than 140 mmHg and diastolic blood pressure is higher than 90 mmHg. One non-pharmacological treatment that can be given is classical music. Music is a unique stimulus that can affect a listener's physical and psychological responses, and it is an effective intervention for increasing physiological relaxation, namely through reductions in pulse, respiration, blood pressure and pain. Its effects show that music can influence a person's state of tension or relaxation because it can stimulate the release of endorphins and serotonin. This case study aims to determine the reduction in blood pressure in hypertensive patients after classical music relaxation therapy. The study used a case study method with 2 respondents selected according to the researchers' inclusion criteria. The assessment showed that both study subjects were female; subject 1 was 66 years old and subject 2 was 69 years old. Both subjects had a history of hypertension: subject 1 was taking outpatient medication prescribed by a doctor (amlodipine 5 mg), while subject 2 took over-the-counter medication when feeling dizzy and weak. The case study results showed a reduction in blood pressure after classical music relaxation therapy. Overall, subjects 1 and 2 experienced an average reduction of 47 mmHg in systolic and 27 mmHg in diastolic blood pressure. Classical music relaxation therapy was able to lower blood pressure in hypertensive patients. The recommended follow-up action is regular check-ups at a health care facility.
Andres Ferraro, Xavier Serra, Christine Bauer
Music streaming platforms are currently among the main sources of music consumption, and the embedded recommender systems significantly influence what the users consume. There is an increasing interest in ensuring that those platforms and systems are fair. Yet, we first need to understand what fairness means in such a context. Although artists are the main content providers for music platforms, there is a research gap concerning the artists' perspective. To fill this gap, we conducted interviews with music artists to understand how they are affected by current platforms and what improvements they deem necessary. Using a Qualitative Content Analysis, we identify the aspects that the artists consider relevant for fair platforms. In this paper, we discuss the following aspects derived from the interviews: fragmented presentation, reaching an audience, transparency, influencing users' listening behavior, popularity bias, artists' repertoire size, quotas for local music, gender balance, and new music. For some topics, our findings do not indicate a clear direction for how music platforms should act and function; for other topics, though, there is a clear consensus among our interviewees: the artists have a clear idea of the actions that should be taken so that music platforms will also be fair for the artists.