Enhancing obesity risk prediction using ensemble learning and explainable AI: a study on Saudi Health data
Abstract
Background: Early prediction of obesity risk is critical for timely intervention to prevent complications. While numerous studies have explored obesity classification, few have combined high predictive accuracy with transparent interpretability using explainable artificial intelligence (XAI).
Objective: This study develops an interpretable ensemble learning framework for obesity risk prediction.
Methods: Two ensemble-based machine learning frameworks were implemented: (i) a stacking ensemble integrating four heterogeneous base learners—Extreme Gradient Boosting (XGBoost), linear Support Vector Machine (SVM), Bagging, and Gradient Boosting Classifier—with a LogisticRegressionCV meta-learner, and (ii) a soft voting ensemble combining XGBoost, SVM, and Bagging classifiers. Both frameworks incorporated a robust preprocessing pipeline comprising missing-value imputation, categorical encoding, feature scaling, and class rebalancing via the Borderline Synthetic Minority Over-sampling Technique (Borderline-SMOTE). Model performance was evaluated on a Saudi health dataset (n = 3,000) derived from the Arab Teens Lifestyle Study (ATLS), consisting of 19 behavioral, dietary, and anthropometric features. Baseline classifiers (Random Forest, AdaBoost, Bagging, and Light Gradient-Boosting Machine (LightGBM)) were optimized via Optuna for fair comparison. Model interpretability was achieved using Local Interpretable Model-Agnostic Explanations (LIME), providing both global and local insights into feature contributions.
Results: The soft voting ensemble attained a test accuracy of 97.81% with weighted precision, recall, and F1-score of 0.9795, 0.9781, and 0.9780, respectively. The stacking ensemble achieved an independent test accuracy of 97.64% with weighted precision, recall, and F1-score of 0.9779, 0.9764, and 0.9763, respectively. Both ensembles demonstrated excellent generalization with minimal validation–test performance gaps, confirming their robustness and reliability.
Conclusion: The proposed explainable ensemble frameworks achieved high predictive accuracy and interpretability, providing a clinically relevant foundation for applying ensemble learning with XAI in behavioral health modeling.
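The two ensemble designs described in the Methods section can be sketched with scikit-learn. This is a minimal illustration, not the authors' implementation: XGBoost is stood in for by scikit-learn's GradientBoostingClassifier, the Borderline-SMOTE rebalancing step (from the separate imbalanced-learn package) is omitted to keep the sketch dependency-free, and the synthetic dataset and hyperparameters are illustrative only.

```python
# Sketch of (i) a stacking ensemble with a LogisticRegressionCV meta-learner
# and (ii) a soft voting ensemble, over shared heterogeneous base learners.
from sklearn.datasets import make_classification
from sklearn.ensemble import (BaggingClassifier, GradientBoostingClassifier,
                              StackingClassifier, VotingClassifier)
from sklearn.linear_model import LogisticRegressionCV
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

# Synthetic stand-in for the 19-feature ATLS-derived dataset (3 risk classes).
X, y = make_classification(n_samples=1000, n_features=19, n_informative=8,
                           n_classes=3, random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, stratify=y, random_state=0)

base_learners = [
    ("gb", GradientBoostingClassifier(random_state=0)),  # XGBoost stand-in
    ("svm", make_pipeline(StandardScaler(),
                          SVC(kernel="linear", probability=True,
                              random_state=0))),         # linear SVM
    ("bag", BaggingClassifier(random_state=0)),
]

# (i) Stacking: base-learner predictions feed a LogisticRegressionCV meta-learner.
stack = StackingClassifier(estimators=base_learners,
                           final_estimator=LogisticRegressionCV(max_iter=1000))
stack.fit(X_tr, y_tr)

# (ii) Soft voting: averages the base learners' predicted class probabilities.
vote = VotingClassifier(estimators=base_learners, voting="soft")
vote.fit(X_tr, y_tr)

print(f"stacking test accuracy: {stack.score(X_te, y_te):.3f}")
print(f"voting test accuracy:   {vote.score(X_te, y_te):.3f}")
```

In the paper's pipeline, Borderline-SMOTE would be applied to the training split only (before fitting either ensemble), so that the test set reflects the original class distribution.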
Authors (4)
Norah S. Alsulami
Muhammad Sher Ramzan
Bander Alzahrani
Salhah Alsulami
Quick Access
- Publication Year
- 2026
- Language
- en
- Source Database
- CrossRef
- DOI
- 10.7717/peerj-cs.3716
- Access
- Open Access ✓