arXiv Open Access 2025

SLM-Bench: A Comprehensive Benchmark of Small Language Models on Environmental Impacts--Extended Version

Nghiem Thanh Pham, Tung Kieu, Duc-Manh Nguyen, Son Ha Xuan, Nghia Duong-Trung, Danh Le-Phuoc

Abstract

Small Language Models (SLMs) offer computational efficiency and accessibility, yet a systematic evaluation of their performance and environmental impact remains lacking. We introduce SLM-Bench, the first benchmark specifically designed to assess SLMs across multiple dimensions, including accuracy, computational efficiency, and sustainability metrics. SLM-Bench evaluates 15 SLMs on 9 NLP tasks using 23 datasets spanning 14 domains. The evaluation is conducted on 4 hardware configurations, providing a rigorous comparison of their effectiveness. Unlike prior benchmarks, SLM-Bench quantifies 11 metrics across correctness, computation, and consumption, enabling a holistic assessment of efficiency trade-offs. Our evaluation considers controlled hardware conditions, ensuring fair comparisons across models. We develop an open-source benchmarking pipeline with standardized evaluation protocols to facilitate reproducibility and further research. Our findings highlight the diverse trade-offs among SLMs, where some models excel in accuracy while others achieve superior energy efficiency. SLM-Bench sets a new standard for SLM evaluation, bridging the gap between resource efficiency and real-world applicability.

Topics & Keywords

cs.CL cs.CY cs.PF

Authors (6)

Nghiem Thanh Pham

Tung Kieu

Duc-Manh Nguyen

Son Ha Xuan

Nghia Duong-Trung

Danh Le-Phuoc

Citation Format

Pham, N. T., Kieu, T., Nguyen, D.-M., Xuan, S. H., Duong-Trung, N., & Le-Phuoc, D. (2025). SLM-Bench: A Comprehensive Benchmark of Small Language Models on Environmental Impacts--Extended Version. arXiv. https://arxiv.org/abs/2508.15478

Journal Information

Publication Year: 2025
Language: en
Source Database: arXiv
Access: Open Access ✓