arXiv Open Access 2024

Audio Prompt Adapter: Unleashing Music Editing Abilities for Text-to-Music with Lightweight Finetuning

Fang-Duo Tsai, Shih-Lun Wu, Haven Kim, Bo-Yu Chen, Hao-Chung Cheng, Yi-Hsuan Yang

Abstract

Text-to-music models allow users to generate nearly realistic musical audio with textual commands. However, editing music audio remains challenging due to the conflicting desiderata of performing fine-grained alterations on the audio while maintaining a simple user interface. To address this challenge, we propose Audio Prompt Adapter (or AP-Adapter), a lightweight addition to pretrained text-to-music models. We utilize AudioMAE to extract features from the input audio, and construct attention-based adapters to feed these features into the internal layers of AudioLDM2, a diffusion-based text-to-music model. With 22M trainable parameters, AP-Adapter empowers users to harness both global (e.g., genre and timbre) and local (e.g., melody) aspects of music, using the original audio and a short text as inputs. Through objective and subjective studies, we evaluate AP-Adapter on three tasks: timbre transfer, genre transfer, and accompaniment generation. Additionally, we demonstrate its effectiveness on out-of-domain audio containing instruments unseen during training.
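The abstract says only that attention-based adapters feed AudioMAE features into the internal layers of AudioLDM2. As a rough illustration of what such an adapter could look like, the following sketch adds a second cross-attention branch over audio features next to an existing text branch and sums the two outputs. The two-branch layout, the additive merge, the `audio_scale` knob, and all function and parameter names are assumptions made for this sketch; they are not taken from the paper or from any library. Random lists stand in for real AudioMAE features and diffusion hidden states.

```python
import math
import random


def matmul(a, b):
    """Multiply an (n x k) matrix by a (k x m) matrix, both as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def softmax(row):
    """Numerically stable softmax over one row of scores."""
    peak = max(row)
    exps = [math.exp(v - peak) for v in row]
    total = sum(exps)
    return [e / total for e in exps]


def cross_attention(queries, context, w_k, w_v):
    """Scaled dot-product attention of queries over a context sequence."""
    keys = matmul(context, w_k)
    values = matmul(context, w_v)
    scale = 1.0 / math.sqrt(len(queries[0]))
    keys_t = [list(col) for col in zip(*keys)]  # transpose keys
    scores = matmul(queries, keys_t)
    weights = [softmax([s * scale for s in row]) for row in scores]
    return matmul(weights, values)


def adapter_attention(hidden, text_feats, audio_feats, params, audio_scale=1.0):
    """Hypothetical adapter layer: text branch plus a scaled audio branch.

    Assumption: the text projections belong to the pretrained model and the
    audio projections are the new, lightweight parameters.
    """
    queries = matmul(hidden, params["w_q"])
    text_out = cross_attention(queries, text_feats,
                               params["w_k_text"], params["w_v_text"])
    audio_out = cross_attention(queries, audio_feats,
                                params["w_k_audio"], params["w_v_audio"])
    return [[t + audio_scale * a for t, a in zip(t_row, a_row)]
            for t_row, a_row in zip(text_out, audio_out)]


def random_matrix(rows, cols, rng):
    return [[rng.uniform(-1.0, 1.0) for _ in range(cols)] for _ in range(rows)]


rng = random.Random(0)
dim = 4
names = ("w_q", "w_k_text", "w_v_text", "w_k_audio", "w_v_audio")
params = {name: random_matrix(dim, dim, rng) for name in names}

hidden = random_matrix(3, dim, rng)       # stand-in for 3 latent positions
text_feats = random_matrix(2, dim, rng)   # stand-in for 2 text tokens
audio_feats = random_matrix(5, dim, rng)  # stand-in for 5 audio feature frames

out = adapter_attention(hidden, text_feats, audio_feats, params, audio_scale=0.5)
```

In this sketch, setting `audio_scale` to 0 reduces the layer to plain text conditioning, which is one simple way to picture trading off adherence to the input audio against adherence to the text command.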

Topics & Keywords

cs.SD cs.AI eess.AS

Authors (6)

Fang-Duo Tsai

Shih-Lun Wu

Haven Kim

Bo-Yu Chen

Hao-Chung Cheng

Yi-Hsuan Yang

Citation Format

Tsai, F.-D., Wu, S.-L., Kim, H., Chen, B.-Y., Cheng, H.-C., & Yang, Y.-H. (2024). Audio Prompt Adapter: Unleashing Music Editing Abilities for Text-to-Music with Lightweight Finetuning. https://arxiv.org/abs/2407.16564

Journal Information

Publication Year: 2024
Language: en
Source Database: arXiv
Access: Open Access ✓