arXiv Open Access 2015

Deep Denoising Auto-encoder for Statistical Speech Synthesis

Zhenzhou Wu Shinji Takaki Junichi Yamagishi

Abstract

This paper proposes a deep denoising auto-encoder technique to extract better acoustic features for speech synthesis. The technique automatically extracts low-dimensional features from high-dimensional spectral features in a non-linear, data-driven, unsupervised way. We compared the new stochastic feature extractor with conventional mel-cepstral analysis in analysis-by-synthesis and text-to-speech experiments. Our results confirm that the proposed method increases the quality of synthetic speech in both experiments.
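
To make the idea in the abstract concrete, the following is a minimal sketch of a single denoising auto-encoder layer, the building block that a deep variant would stack. It is not the authors' implementation. Everything specific here is an assumption made for illustration: the synthetic 8-dimensional vectors standing in for spectral features, the masking noise, the tanh encoder with a tied-weight linear decoder, the 2-dimensional bottleneck, and the learning rate. The layer is trained to reconstruct the clean input from a corrupted copy, and the hidden activations serve as the low-dimensional features.

```python
import math
import random


def corrupt(x, drop_prob, rng):
    """Masking noise (assumed noise type): zero each dimension with probability drop_prob."""
    return [0.0 if rng.random() < drop_prob else v for v in x]


class DenoisingAutoencoder:
    """One tanh encoder layer and a linear decoder sharing (tied) weights."""

    def __init__(self, n_in, n_hidden, rng):
        self.W = [[rng.uniform(-0.1, 0.1) for _ in range(n_in)]
                  for _ in range(n_hidden)]
        self.b = [0.0] * n_hidden  # encoder bias
        self.c = [0.0] * n_in      # decoder bias

    def encode(self, x):
        # Low-dimensional feature vector: h = tanh(W x + b)
        return [math.tanh(sum(w * v for w, v in zip(row, x)) + b)
                for row, b in zip(self.W, self.b)]

    def decode(self, h):
        # Linear reconstruction with the transposed encoder weights: y = W^T h + c
        return [sum(self.W[j][i] * h[j] for j in range(len(h))) + self.c[i]
                for i in range(len(self.c))]

    def train_step(self, x, lr, drop_prob, rng):
        # Encode the corrupted input, but measure the error against the clean input.
        x_noisy = corrupt(x, drop_prob, rng)
        h = self.encode(x_noisy)
        y = self.decode(h)
        err = [yi - xi for yi, xi in zip(y, x)]  # gradient of 0.5 * ||y - x||^2 w.r.t. y
        # Backpropagate through the decoder and the tanh non-linearity.
        dh = [sum(self.W[j][i] * err[i] for i in range(len(x))) * (1.0 - h[j] ** 2)
              for j in range(len(h))]
        for j in range(len(h)):
            for i in range(len(x)):
                # Tied weights: decoder term plus encoder term.
                self.W[j][i] -= lr * (err[i] * h[j] + dh[j] * x_noisy[i])
            self.b[j] -= lr * dh[j]
        for i in range(len(x)):
            self.c[i] -= lr * err[i]


def mean_clean_error(model, data):
    """Average squared reconstruction error on uncorrupted inputs."""
    total = 0.0
    for x in data:
        y = model.decode(model.encode(x))
        total += 0.5 * sum((yi - xi) ** 2 for yi, xi in zip(y, x))
    return total / len(data)


# Synthetic stand-in for spectral frames: 8-dim vectors driven by 2 latent factors.
rng = random.Random(0)
data = []
for _ in range(200):
    z1, z2 = rng.uniform(-1, 1), rng.uniform(-1, 1)
    data.append([math.sin(i + 1) * z1 + math.cos(i + 1) * z2 for i in range(8)])

model = DenoisingAutoencoder(n_in=8, n_hidden=2, rng=rng)
before = mean_clean_error(model, data)
for _ in range(50):  # epochs of plain per-sample gradient descent
    for x in data:
        model.train_step(x, lr=0.02, drop_prob=0.2, rng=rng)
after = mean_clean_error(model, data)

features = model.encode(data[0])  # 2-dim feature vector for one frame
print("reconstruction error before/after training:", before, after)
print("extracted features:", features)
```

In a deep configuration, the features produced by one trained layer would become the input to the next. The layer sizes, noise level, and training schedule would need to be chosen for real spectral data rather than taken from this sketch.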

Topics & Keywords

cs.SD cs.LG

Citation Format

Wu, Z., Takaki, S., Yamagishi, J. (2015). Deep Denoising Auto-encoder for Statistical Speech Synthesis. https://arxiv.org/abs/1506.05268

Journal Information

Publication Year: 2015
Language: en
Source Database: arXiv
Access: Open Access ✓