DOAJ Open Access 2026

Optimizing CNN-GRU Hybrid Ratios for Resource-Constrained Audio Classification: A Systematic Study From Parameter Efficiency to MCU Deployment

Rakshaa Munirathinam, Stanislav Vitek

Abstract

Urban sound classification has become a critical enabling technology for Internet of Things (IoT) applications, smart cities, and environmental monitoring systems. Despite advances in deep learning, deploying these models on resource-constrained microcontroller units (MCUs) poses significant challenges in balancing classification accuracy, computational efficiency, and energy consumption. This study introduces the first comprehensive and systematic analysis of hybrid CNN-GRU architectures optimized for embedded audio processing. Thirteen distinct layer ratio configurations were evaluated under varying temporal chunk sizes across three representative MCU platforms. Experiments conducted on the UrbanSound8K dataset reveal that CNN-heavy configurations achieve optimal performance for ultra-short-duration processing, reaching 93.92% accuracy at 0.0625 seconds, while balanced architectures deliver the best trade-offs for medium-duration segments. Experiments on the ESC-50 dataset with two different chunk sizes show largely similar performance trends. On both datasets, model performance peaks at chunks of about 3 seconds. The study identifies eight critical application scenarios across safety, healthcare, industrial, and smart environment domains, establishing optimized architecture-platform combinations for each use case. Deployment results demonstrate that CNN-heavy models achieve more than 93% accuracy with sub-20 ms inference times for ultra-short segments, enabling real-time tasks such as emergency vehicle and glass break detection. The proposed framework provides a quantitative foundation for selecting the most suitable architecture–platform pairings under specific application constraints, effectively bridging the gap between academic research and practical embedded audio classification systems.

Authors (2)

Rakshaa Munirathinam

Stanislav Vitek

Citation Format

Munirathinam, R., Vitek, S. (2026). Optimizing CNN-GRU Hybrid Ratios for Resource-Constrained Audio Classification: A Systematic Study From Parameter Efficiency to MCU Deployment. https://doi.org/10.1109/ACCESS.2026.3665776

Journal Information
Publication Year: 2026
Source Database: DOAJ
DOI: 10.1109/ACCESS.2026.3665776
Access: Open Access ✓