arXiv Open Access 2015

Deep Denoising Auto-encoder for Statistical Speech Synthesis

Zhenzhou Wu, Shinji Takaki, Junichi Yamagishi

Abstract

This paper proposes a deep denoising auto-encoder technique to extract better acoustic features for speech synthesis. The technique automatically extracts low-dimensional features from high-dimensional spectral features in a non-linear, data-driven, unsupervised way. We compared the new stochastic feature extractor with conventional mel-cepstral analysis in analysis-by-synthesis and text-to-speech experiments. Our results confirm that the proposed method increases the quality of synthetic speech in both experiments.
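
As an illustration of the general idea named in the abstract, the following is a minimal sketch of a single-layer denoising auto-encoder in NumPy. It is not the authors' implementation: the masking-noise corruption, the sigmoid units, the layer sizes, the learning rate, and the function names `train_denoising_autoencoder` and `encode` are all assumptions made for this example, and the input is synthetic data standing in for spectral frames. A deep variant would stack several such layers.

```python
import numpy as np


def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))


def train_denoising_autoencoder(X, n_hidden, noise_rate=0.2, lr=1.0,
                                epochs=300, seed=0):
    """Train a one-hidden-layer denoising auto-encoder (illustrative only).

    X is an (n_frames, n_dims) array with values in [0, 1].
    Returns the learned parameters and the per-epoch mean squared error.
    """
    rng = np.random.default_rng(seed)
    n_frames, n_dims = X.shape
    W1 = rng.normal(0.0, 0.1, (n_dims, n_hidden))   # encoder weights
    b1 = np.zeros(n_hidden)
    W2 = rng.normal(0.0, 0.1, (n_hidden, n_dims))   # decoder weights
    b2 = np.zeros(n_dims)
    losses = []
    for _ in range(epochs):
        # Corrupt the input: zero out a random fraction of the dimensions.
        mask = rng.random(X.shape) >= noise_rate
        X_noisy = X * mask
        # Forward pass: encode the corrupted input, then decode it.
        H = sigmoid(X_noisy @ W1 + b1)
        Y = sigmoid(H @ W2 + b2)
        # The target is the CLEAN input, which is what makes this "denoising".
        err = Y - X
        losses.append(float(np.mean(err ** 2)))
        # Backpropagate the squared error, averaged over frames.
        d_out = err * Y * (1.0 - Y) / n_frames
        d_hid = (d_out @ W2.T) * H * (1.0 - H)
        W2 -= lr * (H.T @ d_out)
        b2 -= lr * d_out.sum(axis=0)
        W1 -= lr * (X_noisy.T @ d_hid)
        b1 -= lr * d_hid.sum(axis=0)
    return {"W1": W1, "b1": b1, "W2": W2, "b2": b2}, losses


def encode(params, X):
    """Map clean input frames to the low-dimensional hidden features."""
    return sigmoid(X @ params["W1"] + params["b1"])


# Synthetic stand-in for high-dimensional spectral frames (an assumption):
# 20-dimensional data generated from a 3-dimensional latent variable.
rng = np.random.default_rng(1)
latent = rng.normal(size=(200, 3))
mixing = rng.normal(size=(3, 20))
frames = sigmoid(latent @ mixing + np.linspace(-2.0, 2.0, 20))

params, losses = train_denoising_autoencoder(frames, n_hidden=4)
features = encode(params, frames)   # shape: (200, 4)
```

Here `features` plays the role of the extracted low-dimensional representation. In a synthesis setting, such features would take the place of hand-designed ones like mel-cepstral coefficients, with the decoder half used to map them back to the spectral domain.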

Authors (3)

Zhenzhou Wu

Shinji Takaki

Junichi Yamagishi

Citation Format

Wu, Z., Takaki, S., Yamagishi, J. (2015). Deep Denoising Auto-encoder for Statistical Speech Synthesis. https://arxiv.org/abs/1506.05268

Journal Information
Publication Year
2015
Language
en
Source Database
arXiv
Access
Open Access