arXiv Open Access 2026

The Garbage Dataset (GD): A Multi-Class Image Benchmark for Automated Waste Segregation

Suman Kunwar
Lihat Sumber

Abstrak

This study introduces the Garbage Dataset (GD), a publicly available image dataset designed to advance automated waste segregation through machine learning and computer vision. It is a diverse dataset that covers 10 categories of common household waste: metal, glass, biological, paper, battery, trash, cardboard, shoes, clothes, and plastic. The dataset comprises 12,259 labeled images collected through multiple methods, including the DWaste mobile app and curated web sources. The methods included rigorous validation through checksums and outlier detection, analysis of class imbalance and visual separability through PCA/t-SNE, and assessment of background complexity using entropy and saliency measures. The dataset was benchmarked using state-of-the-art deep learning models (EfficientNetV2M, EfficientNetV2S, MobileNet, ResNet50, ResNet101) evaluated on performance metrics and operational carbon emissions. The results of the experiment indicate that EfficientNetV2S achieved the highest performance with a accuracy of 95.13% and an F1-score of 0.95 with moderate carbon cost. Analysis revealed inherent dataset characteristics including class imbalance, a skew toward high-outlier classes (plastic, cardboard, paper), and brightness variations that require consideration. The main conclusion is that GD provides a valuable real-world benchmark for waste classification research while highlighting important challenges such as class imbalance, background complexity, and environmental trade-offs in model selection that must be addressed for practical deployment. The dataset is publicly released to support further research in environmental sustainability applications.

Topik & Kata Kunci

Penulis (1)

S

Suman Kunwar

Format Sitasi

Kunwar, S. (2026). The Garbage Dataset (GD): A Multi-Class Image Benchmark for Automated Waste Segregation. https://arxiv.org/abs/2602.10500

Akses Cepat

Lihat di Sumber
Informasi Jurnal
Tahun Terbit
2026
Bahasa
en
Sumber Database
arXiv
Akses
Open Access ✓