Foundation Models for Music: A Survey
Abstract
In recent years, foundation models (FMs) such as large language models (LLMs) and latent diffusion models (LDMs) have profoundly impacted diverse sectors, including music. This comprehensive review examines state-of-the-art (SOTA) pre-trained models and foundation models in music, spanning representation learning, generative learning and multimodal learning. We first contextualise the significance of music in various industries and trace the evolution of AI in music. By delineating the modalities targeted by foundation models, we find that many music representations are underexplored in FM development. We then emphasise the lack of versatility of previous methods across diverse music applications, along with the potential of FMs in music understanding, generation and medical applications. By comprehensively exploring the details of the model pre-training paradigm, architectural choices, tokenisation, finetuning methodologies and controllability, we emphasise important topics that deserve more thorough exploration, such as instruction tuning and in-context learning, scaling laws and emergent abilities, and long-sequence modelling. A dedicated section presents insights into music agents, accompanied by a thorough analysis of the datasets and evaluations essential for pre-training and downstream tasks. Finally, underscoring the vital importance of ethical considerations, we advocate that future research on FMs for music should focus more on issues such as interpretability, transparency, human responsibility and copyright. The paper offers insights into future challenges and trends in FMs for music, aiming to shape the trajectory of human-AI collaboration in the music realm.
Authors (42)
Yinghao Ma
Anders Øland
Anton Ragni
Bleiz MacSen Del Sette
Charalampos Saitis
Chris Donahue
Chenghua Lin
Christos Plachouras
Emmanouil Benetos
Elona Shatri
Fabio Morreale
Ge Zhang
György Fazekas
Gus Xia
Huan Zhang
Ilaria Manco
Jiawen Huang
Julien Guinot
Liwei Lin
Luca Marinelli
Max W. Y. Lam
Megha Sharma
Qiuqiang Kong
Roger B. Dannenberg
Ruibin Yuan
Shangda Wu
Shih-Lun Wu
Shuqi Dai
Shun Lei
Shiyin Kang
Simon Dixon
Wenhu Chen
Wenhao Huang
Xingjian Du
Xingwei Qu
Xu Tan
Yizhi Li
Zeyue Tian
Zhiyong Wu
Zhizheng Wu
Ziyang Ma
Ziyu Wang
Quick Access
- Year of Publication: 2024
- Language: en
- Database Source: arXiv
- Access: Open Access ✓