Text to Speech System for Lambani - A Zero Resource, Tribal Language of India
Abstract
A Text to Speech (TTS) system empowers illiterate people by speaking, in their native language, the information available in electronic media. Speech data and the corresponding transcripts are essential for the development of a TTS system. Most tribal languages in India have neither a script nor written literature. In this paper, we present our effort to build a TTS system for Lambani, a tribal language spoken by a group of nomadic people living in several regions of India. We generated a text corpus of about 3000 Lambani sentences, written in the script of the Kannada language. These sentences were read by a literate Lambani speaker in a recording studio. The Lambani TTS system was developed by adapting Nvidia’s Tacotron2 model, pretrained on English speech, following the transfer learning approach. We implemented two versions of the Lambani TTS system. The first version uses the HiFi-GAN vocoder and the second uses the WaveGlow vocoder. We evaluated the two versions of the TTS system using both objective and subjective measures. Both the Perceptual Evaluation of Speech Quality score and the Mean Opinion Score were higher for the Lambani TTS system that used the WaveGlow vocoder.
Authors (4)
Ashwini Dasare
K. Deepak
Mahadeva Prasanna
K. Samudra Vijaya
- Year Published
- 2022
- Language
- en
- Total Citations
- 1×
- Source Database
- Semantic Scholar
- DOI
- 10.1109/O-COCOSDA202257103.2022.9997838
- Access
- Open Access ✓