Semantic Scholar Open Access 2022 1 citation

Text to Speech System for Lambani - A Zero Resource, Tribal Language of India

Ashwini Dasare K. Deepak Mahadeva Prasanna K. Samudra Vijaya

Abstract

A Text to Speech (TTS) system empowers illiterate people by speaking, in their native language, the information available in electronic media. Speech data and the corresponding transcripts are essential for the development of a TTS system. Most tribal languages in India have neither a script nor written literature. In this paper, we present our effort to build a TTS system for Lambani, a tribal language spoken by a group of nomadic people living in several regions of India. We generated a text corpus of about 3000 Lambani sentences, written in the Kannada script. These sentences were read by a literate Lambani speaker in a recording studio. The Lambani TTS system was developed by adapting Nvidia's Tacotron2 model, pretrained on English speech, following the transfer learning approach. We implemented two versions of the Lambani TTS system. The first version uses the HiFi-GAN vocoder and the second uses the WaveGlow vocoder. We evaluated the two versions of the TTS system using both objective and subjective measures. Both the Perceptual Evaluation of Speech Quality score and the Mean Opinion Score were higher for the Lambani TTS system that used the WaveGlow vocoder.

Authors (4)

Ashwini Dasare

K. Deepak

Mahadeva Prasanna

K. Samudra Vijaya

Citation Format

Dasare, A., Deepak, K., Prasanna, M., Vijaya, K.S. (2022). Text to Speech System for Lambani - A Zero Resource, Tribal Language of India. https://doi.org/10.1109/O-COCOSDA202257103.2022.9997838

Journal Information
Publication Year
2022
Language
en
Total Citations
Source Database
Semantic Scholar
DOI
10.1109/O-COCOSDA202257103.2022.9997838
Access
Open Access ✓