Elite Episode Replay Memory for Polyphonic Piano Fingering Estimation
Abstract
Piano fingering estimation remains a complex problem because hand movements are combinatorial in nature and no single fingering is best in every situation. A recent model-free reinforcement learning framework for piano fingering modeled each monophonic piece as an environment and demonstrated that value-based methods outperform probability-based approaches. Building on that finding, this paper addresses the more complex polyphonic fingering problem by formulating it as an online model-free reinforcement learning task. We introduce a novel training strategy, Elite Episode Replay (EER), which improves learning efficiency by prioritizing high-quality episodes during training. This strategy accelerates early reward acquisition and improves convergence without sacrificing fingering quality. The proposed architecture produces multiple-action outputs for polyphonic settings and is trained using both elite-guided and uniform sampling. Experimental results show that the EER strategy reduces training time per step by 21% and speeds up convergence by 18% while preserving both the difficulty level and the outcome of the generated fingerings. An empirical study of elite memory size further highlights its impact on training performance in piano fingering estimation.
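The abstract does not specify how the elite memory is stored or sampled, so the following is only a minimal sketch of the general idea: keep the highest-return episodes in a small elite memory and draw each training batch partly from that memory and partly from a uniform replay buffer. The class name `EliteEpisodeReplay`, the parameters `elite_size` and `elite_fraction`, the ranking of episodes by summed reward, and the transition layout `(state, action, reward, next_state, done)` are all assumptions made for illustration and are not taken from the paper.

```python
import heapq
import random
from collections import deque


class EliteEpisodeReplay:
    """Illustrative replay memory: a top-k elite store plus a uniform buffer.

    Hypothetical sketch only. Episodes are ranked by total reward, which is
    an assumption; the paper may use a different quality criterion.
    """

    def __init__(self, elite_size=4, uniform_capacity=10000, seed=0):
        self.elite_size = elite_size
        # Min-heap of (episode_return, insertion_id, transitions); the
        # weakest elite episode sits at index 0 and is evicted first.
        self._elite = []
        self._uniform = deque(maxlen=uniform_capacity)
        self._counter = 0  # unique tie-breaker so transitions are never compared
        self._rng = random.Random(seed)

    def add_episode(self, transitions):
        """Store a finished episode given as (s, a, r, s_next, done) tuples."""
        transitions = list(transitions)
        self._uniform.extend(transitions)
        episode_return = sum(t[2] for t in transitions)
        entry = (episode_return, self._counter, transitions)
        self._counter += 1
        if len(self._elite) < self.elite_size:
            heapq.heappush(self._elite, entry)
        elif episode_return > self._elite[0][0]:
            # Replace the current weakest elite episode.
            heapq.heapreplace(self._elite, entry)

    def elite_returns(self):
        """Returns of the stored elite episodes, best first."""
        return sorted((e[0] for e in self._elite), reverse=True)

    def sample(self, batch_size, elite_fraction=0.5):
        """Mix elite-guided and uniform sampling in one batch."""
        elite_pool = [t for _, _, episode in self._elite for t in episode]
        n_elite = min(int(batch_size * elite_fraction), len(elite_pool))
        batch = self._rng.sample(elite_pool, n_elite)
        n_uniform = min(batch_size - n_elite, len(self._uniform))
        batch += self._rng.sample(list(self._uniform), n_uniform)
        return batch
```

In a polyphonic setting, the `action` entry of each transition could be a tuple of finger indices, one per simultaneously sounding note; that encoding is likewise an assumption. Varying `elite_size` in such a sketch corresponds loosely to the elite memory size that the abstract reports studying empirically.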
Authors (2)
Ananda Phan Iman
Chang Wook Ahn
Quick Access
- Year Published: 2025
- Source Database: DOAJ
- DOI: 10.3390/math13152485
- Access: Open Access