arXiv Open Access 2024

Foundation Models for Music: A Survey

Yinghao Ma, Anders Øland, Anton Ragni, Bleiz MacSen Del Sette, Charalampos Saitis, +37 others

Abstract

In recent years, foundation models (FMs) such as large language models (LLMs) and latent diffusion models (LDMs) have profoundly impacted diverse sectors, including music. This comprehensive review examines state-of-the-art (SOTA) pre-trained models and foundation models in music, spanning representation learning, generative learning and multimodal learning. We first contextualise the significance of music in various industries and trace the evolution of AI in music. By delineating the modalities targeted by foundation models, we find that many music representations are underexplored in FM development. We then highlight the limited versatility of previous methods across diverse music applications, along with the potential of FMs in music understanding, generation and medical applications. By comprehensively exploring the details of the model pre-training paradigm, architectural choices, tokenisation, finetuning methodologies and controllability, we emphasise important topics that deserve more thorough exploration, such as instruction tuning and in-context learning, scaling laws and emergent abilities, and long-sequence modelling. A dedicated section presents insights into music agents, accompanied by a thorough analysis of the datasets and evaluations essential for pre-training and downstream tasks. Finally, underscoring the vital importance of ethical considerations, we advocate that future research on FMs for music should focus more on issues such as interpretability, transparency, human responsibility, and copyright. The paper offers insights into future challenges and trends in FMs for music, aiming to shape the trajectory of human-AI collaboration in the music realm.

Topics & Keywords

cs.SD cs.AI cs.CL cs.LG eess.AS

Authors (42)

Yinghao Ma

Anders Øland

Anton Ragni

Bleiz MacSen Del Sette

Charalampos Saitis

Chris Donahue

Chenghua Lin

Christos Plachouras

Emmanouil Benetos

Elona Shatri

Fabio Morreale

Ge Zhang

György Fazekas

Gus Xia

Huan Zhang

Ilaria Manco

Jiawen Huang

Julien Guinot

Liwei Lin

Luca Marinelli

Max W. Y. Lam

Megha Sharma

Qiuqiang Kong

Roger B. Dannenberg

Ruibin Yuan

Shangda Wu

Shih-Lun Wu

Shuqi Dai

Shun Lei

Shiyin Kang

Simon Dixon

Wenhu Chen

Wenhao Huang

Xingjian Du

Xingwei Qu

Xu Tan

Yizhi Li

Zeyue Tian

Zhiyong Wu

Zhizheng Wu

Ziyang Ma

Ziyu Wang

Citation Format

Ma, Y., Øland, A., Ragni, A., Sette, B.M.D., Saitis, C., Donahue, C. et al. (2024). Foundation Models for Music: A Survey. https://arxiv.org/abs/2408.14340

Journal Information

Publication Year: 2024
Language: en
Database Source: arXiv
Access: Open Access ✓