arXiv Open Access 2025

CapTune: Adapting Non-Speech Captions With Anchored Generative Models

Jeremy Zhengqi Huang, Caluã de Lacerda Pataca, Liang-Yuan Wu, Dhruv Jain

Abstract

Non-speech captions are essential to the video experience of deaf and hard of hearing (DHH) viewers, yet conventional approaches often overlook the diversity of their preferences. We present CapTune, a system that enables customization of non-speech captions based on DHH viewers' needs while preserving creator intent. CapTune allows caption authors to define safe transformation spaces using concrete examples and empowers viewers to personalize captions across four dimensions: level of detail, expressiveness, sound representation method, and genre alignment. Evaluations with seven caption creators and twelve DHH participants showed that CapTune supported creators' creative control while enhancing viewers' emotional engagement with content. Our findings also reveal trade-offs between information richness and cognitive load, tensions between interpretive and descriptive representations of sound, and the context-dependent nature of caption preferences.

Topics & Keywords

cs.HC

Citation

Huang, J. Z., Pataca, C. d. L., Wu, L.-Y., & Jain, D. (2025). CapTune: Adapting Non-Speech Captions With Anchored Generative Models. arXiv. https://arxiv.org/abs/2508.19971

Journal Information

Publication Year: 2025
Language: en
Source Database: arXiv
Access: Open Access