arXiv Open Access 2021

Golos: Russian Dataset for Speech Research

Nikolay Karpov Alexander Denisenko Fedor Minkin

Abstract

This paper introduces Golos, a new large Russian speech corpus suitable for speech research. The dataset consists mainly of recorded audio files manually annotated on a crowd-sourcing platform. The total duration of the audio is about 1240 hours. We have made the corpus freely available for download, along with an acoustic model trained with CTC loss on this corpus. Additionally, transfer learning was applied to improve the performance of the acoustic model. To evaluate the quality of the dataset with the beam-search algorithm, we built a 3-gram language model on the open Common Crawl dataset. The resulting word error rates (WER) were about 3.3% and 11.5%.
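For readers unfamiliar with the metric quoted in the abstract, WER is conventionally the word-level edit distance between a reference transcript and a recognizer's output, divided by the number of reference words. The sketch below illustrates that standard definition only; the function name `word_error_rate` and the sample sentences are hypothetical, and it does not reproduce the authors' evaluation pipeline or any text normalization they may have applied.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level Levenshtein distance divided by the reference length."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference prefix processed
    # so far and the first j words of the hypothesis.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (ref_word != hyp_word)
            deletion = prev[j] + 1        # reference word missing from hypothesis
            insertion = curr[j - 1] + 1   # extra word in hypothesis
            curr[j] = min(substitution, deletion, insertion)
        prev = curr
    return prev[-1] / len(ref)


# Hypothetical example: one of three reference words is dropped.
wer = word_error_rate("привет как дела", "привет дела")
print(f"WER = {wer:.3f}")
```

Because insertions count as errors, WER computed this way can exceed 100% when the hypothesis is much longer than the reference.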

Topics & Keywords

eess.AS

Authors (3)

Nikolay Karpov

Alexander Denisenko

Fedor Minkin

Citation Format

Karpov, N., Denisenko, A., & Minkin, F. (2021). Golos: Russian Dataset for Speech Research. https://arxiv.org/abs/2106.10161

Journal Information

Year Published: 2021
Language: en
Database Source: arXiv
Access: Open Access ✓