arXiv Open Access 2025

Accelerating Autoregressive Speech Synthesis Inference With Speech Speculative Decoding

Zijian Lin, Yang Zhang, Yougen Yuan, Yuming Yan, Jinjiang Liu, +3 others

Abstract

Modern autoregressive speech synthesis models leveraging language models have demonstrated remarkable performance. However, the sequential nature of next token prediction in these models leads to significant latency, hindering their deployment in scenarios where inference speed is critical. In this work, we propose Speech Speculative Decoding (SSD), a novel framework for autoregressive speech synthesis acceleration. Specifically, our method employs a lightweight draft model to generate candidate token sequences, which are subsequently verified in parallel by the target model using the proposed SSD framework. Experimental results demonstrate that SSD achieves a significant speedup of 1.4x compared with conventional autoregressive decoding, while maintaining high fidelity and naturalness. Subjective evaluations further validate the effectiveness of SSD in preserving the perceptual quality of the target model while accelerating inference.
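The draft-then-verify loop described in the abstract can be illustrated with a minimal sketch of generic greedy speculative decoding. This is not the paper's SSD acceptance rule, which the abstract does not specify. The functions `target_next` and `draft_next` are hypothetical toy stand-ins for the target and draft models, and the 16-token vocabulary is an arbitrary assumption. In this greedy variant the output matches target-only decoding exactly, so any speedup comes only from needing fewer target passes.

```python
from typing import Callable, List, Tuple

Token = int
NextToken = Callable[[List[Token]], Token]  # greedy next-token function


def target_next(prefix: List[Token]) -> Token:
    # Hypothetical "large" model: a deterministic toy rule over 16 tokens.
    return (3 * sum(prefix) + len(prefix)) % 16


def draft_next(prefix: List[Token]) -> Token:
    # Hypothetical "small" model: agrees with the target except when the
    # prefix length is a multiple of 4, where it deliberately guesses wrong.
    guess = target_next(prefix)
    return (guess + 1) % 16 if len(prefix) % 4 == 0 else guess


def autoregressive_decode(next_token: NextToken, prompt: List[Token],
                          n_new: int) -> List[Token]:
    # Baseline: one model call per generated token.
    out = list(prompt)
    for _ in range(n_new):
        out.append(next_token(out))
    return out


def speculative_decode(target: NextToken, draft: NextToken,
                       prompt: List[Token], n_new: int,
                       k: int = 4) -> Tuple[List[Token], int]:
    out = list(prompt)
    goal = len(prompt) + n_new
    target_passes = 0
    while len(out) < goal:
        # 1) The draft model proposes up to k tokens sequentially.
        proposal: List[Token] = []
        for _ in range(min(k, goal - len(out))):
            proposal.append(draft(out + proposal))
        # 2) The target scores every proposed position. A real model would
        #    do this in one batched forward pass; here it is a plain loop.
        target_passes += 1
        verified = [target(out + proposal[:i])
                    for i in range(len(proposal) + 1)]
        # 3) Accept the longest prefix on which draft and target agree.
        n_ok = 0
        while n_ok < len(proposal) and proposal[n_ok] == verified[n_ok]:
            n_ok += 1
        out.extend(proposal[:n_ok])
        # 4) Append the target's own token at the first mismatch, or as a
        #    bonus token when every proposal was accepted.
        if len(out) < goal:
            out.append(verified[n_ok])
    return out, target_passes
```

Calling `speculative_decode(target_next, draft_next, [1, 2, 3], 20)` returns the generated sequence together with the number of target verification passes, which can be compared against the 20 calls that `autoregressive_decode` makes for the same request.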

Topics & Keywords

cs.SD cs.AI eess.AS

Authors (8)

Zijian Lin

Yang Zhang

Yougen Yuan

Yuming Yan

Jinjiang Liu

Zhiyong Wu

Pengfei Hu

Qun Yu

Citation (APA)

Lin, Z., Zhang, Y., Yuan, Y., Yan, Y., Liu, J., Wu, Z. et al. (2025). Accelerating Autoregressive Speech Synthesis Inference With Speech Speculative Decoding. https://arxiv.org/abs/2505.15380

Journal Information

Publication Year: 2025
Language: en
Database Source: arXiv
Access: Open Access ✓