DOAJ Open Access 2025

POQ: Is There a Pareto-Optimal Quantization Strategy for Deep Neural Networks?

Floran De Putter, Sherif Eissa, Henk Corporaal

Abstract

Efficient deployment of deep learning models on resource-constrained devices requires balancing accuracy with energy consumption and/or latency. Quantization is a proven method to achieve this balance by reducing the precision of neural network weights and activations. However, simply changing the precision does not enable direct iso-accuracy and iso-energy comparisons. To address this, we combine a realistic processor energy model with a network filter multiplier that scales the number of channels, thereby enabling such comparisons. This work presents a Pareto-Optimal Quantization (POQ) methodology aimed at mapping a neural network architecture to a specific hardware platform while systematically exploring the design space in between to identify the most effective quantization strategy. Our approach evaluates how different design choices impact the accuracy-energy trade-off. Using detailed energy modeling instead of proxy metrics, our results reveal that 8-bit integer (int8) quantization is Pareto-Optimal for MobileNetV2, providing up to $2.8\times$ energy savings or 10% higher accuracy compared to 16-bit floating-point (fp16). Furthermore, employing high-precision residuals shifts the Pareto frontier, making 4-bit integer (int4) quantization optimal, achieving up to $1.9\times$ additional energy reduction or 2% additional accuracy gains. Moreover, our findings emphasize the role of DRAM energy in certain model configurations and highlight the importance of precise energy modeling. These results reflect the application of our POQ methodology to the practical deployment of energy-efficient deep learning models on constrained hardware.
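The notion of Pareto optimality used in the abstract can be illustrated with a minimal sketch. It assumes each candidate (a precision combined with a filter multiplier) has already been reduced to a modelled energy and a measured accuracy. The `Config` type, the `pareto_front` helper, and every number below are hypothetical placeholders chosen for illustration; they are not the authors' code or results.

```python
from typing import NamedTuple


class Config(NamedTuple):
    name: str        # hypothetical label: precision plus filter multiplier
    energy: float    # modelled energy per inference (arbitrary units)
    accuracy: float  # top-1 accuracy in percent


def pareto_front(configs: list[Config]) -> list[Config]:
    """Return the configs that are not dominated (lower energy, higher accuracy)."""
    front: list[Config] = []
    best_accuracy = float("-inf")
    # Walk from cheapest to most expensive; at equal energy, try the more
    # accurate config first. A config is kept only if it beats the accuracy
    # of every cheaper (or equally cheap) config seen so far.
    for cfg in sorted(configs, key=lambda c: (c.energy, -c.accuracy)):
        if cfg.accuracy > best_accuracy:
            front.append(cfg)
            best_accuracy = cfg.accuracy
    return front


# Placeholder numbers for illustration only; NOT results from the paper.
candidates = [
    Config("fp16 x1.0", 10.0, 70.0),
    Config("int8 x1.0", 4.0, 69.0),
    Config("int8 x1.4", 7.0, 72.0),
    Config("int4 x1.0", 3.0, 60.0),
    Config("int4 x2.0", 8.0, 68.0),
]

front = pareto_front(candidates)
for cfg in front:
    print(cfg.name, cfg.energy, cfg.accuracy)
```

In this toy data, widening the int8 network lets it exceed the fp16 accuracy at lower energy, so the fp16 point drops off the frontier. This is the kind of iso-accuracy and iso-energy comparison that scaling the channel count is meant to enable.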

Topics & Keywords

Electrical engineering. Electronics. Nuclear engineering

Citation

De Putter, F., Eissa, S., & Corporaal, H. (2025). POQ: Is There a Pareto-Optimal Quantization Strategy for Deep Neural Networks? https://doi.org/10.1109/ACCESS.2025.3567046

Journal Information

Publication Year: 2025
Source Database: DOAJ
DOI: 10.1109/ACCESS.2025.3567046
Access: Open Access