arXiv Open Access 2022

Read it to me: An emotionally aware Speech Narration Application

Rishibha Bansal

Abstract

In this work, we attempt emotional style transfer on audio. In particular, we explore the MelGAN-VC architecture for transfer between various emotion pairs. The generated audio is then classified using an LSTM-based audio emotion classifier. We find that "sad" audio is generated better than "happy" or "anger" audio, as people express sadness in similar ways.

Topics & Keywords

cs.SD cs.CL cs.LG eess.AS

Authors (1)

Rishibha Bansal

Citation (APA)

Bansal, R. (2022). Read it to me: An emotionally aware Speech Narration Application. https://arxiv.org/abs/2209.02785

Journal Information

Publication Year: 2022
Language: en
Source Database: arXiv
Access: Open Access