arXiv
Open Access
2024
Augmenting Polish Automatic Speech Recognition System With Synthetic Data
Łukasz Bondaruk
Jakub Kubiak
Mateusz Czyżnikiewicz
Abstract
This paper presents a system developed for submission to PolEval 2024, Task 3: Polish Automatic Speech Recognition Challenge. We describe a Voicebox-based speech synthesis pipeline and use it to augment the training of Conformer and Whisper speech recognition models with synthetic data. We show that adding synthetic speech to the training data significantly improves the results. We also present the final results achieved by our models in the competition.
Journal Information
- Publication Year: 2024
- Language: en
- Source Database: arXiv
- Access: Open Access ✓