arXiv Open Access 2023

Generative Disco: Text-to-Video Generation for Music Visualization

Vivian Liu, Tao Long, Nathan Raw, Lydia Chilton

Abstract

Visuals can enhance our experience of music, owing to the way they can amplify the emotions and messages conveyed within it. However, creating music visualization is a complex, time-consuming, and resource-intensive process. We introduce Generative Disco, a generative AI system that helps generate music visualizations with large language models and text-to-video generation. The system helps users visualize music in intervals by finding prompts to describe the images that intervals start and end on and interpolating between them to the beat of the music. We introduce design patterns for improving these generated videos: transitions, which express shifts in color, time, subject, or style, and holds, which help focus the video on subjects. A study with professionals showed that transitions and holds were a highly expressive framework that enabled them to build coherent visual narratives. We conclude on the generalizability of these patterns and the potential of generated video for creative professionals.
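The abstract describes interpolating between an interval's start and end images "to the beat of the music". One way to picture that idea is the minimal sketch below. It is not the authors' implementation: the function names `beat_paced_weights` and `lerp`, the piecewise-linear pacing rule, and the use of plain linear interpolation on toy vectors are all assumptions made for illustration.

```python
from bisect import bisect_right


def beat_paced_weights(start, end, beat_times, fps):
    """Return one interpolation weight in [0, 1] per frame of an interval.

    Assumed pacing rule (hypothetical, not taken from the paper): knots sit at
    the interval start, at each beat inside the interval, and at the interval
    end, and the weight rises by an equal amount between consecutive knots.
    """
    if end <= start or fps <= 0:
        raise ValueError("need end > start and fps > 0")
    # Keep only beats strictly inside the interval.
    knots = [start] + sorted(t for t in beat_times if start < t < end) + [end]
    segments = len(knots) - 1
    n_frames = int(round((end - start) * fps)) + 1
    weights = []
    for i in range(n_frames):
        t = min(start + i / fps, end)
        # Index of the segment that contains time t.
        k = min(bisect_right(knots, t) - 1, segments - 1)
        frac = (t - knots[k]) / (knots[k + 1] - knots[k])
        weights.append((k + frac) / segments)
    return weights


def lerp(a, b, w):
    """Linearly blend two equal-length vectors (stand-ins for prompt embeddings)."""
    return [(1 - w) * x + w * y for x, y in zip(a, b)]


# Toy usage: a 2-second interval at 2 frames per second with one beat at 0.5 s.
weights = beat_paced_weights(0.0, 2.0, [0.5], fps=2)
frames = [lerp([0.0, 0.0], [2.0, 4.0], w) for w in weights]
```

With this rule the blend reaches its halfway point exactly on the single beat, so closely spaced beats produce faster visual change than widely spaced ones. A real text-to-video pipeline would blend model embeddings or latents rather than toy lists, and might use a different interpolation scheme.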

Topics & Keywords

cs.HC cs.AI

Citation

Liu, V., Long, T., Raw, N., & Chilton, L. (2023). Generative Disco: Text-to-Video Generation for Music Visualization. https://arxiv.org/abs/2304.08551

Journal Information

Publication Year: 2023
Language: en
Source Database: arXiv
Access: Open Access