Enhancing obesity risk prediction using ensemble learning and explainable AI: a study on Saudi Health data
Abstract
Background: Early prediction of obesity risk is critical for timely intervention to prevent complications. While numerous studies have explored obesity classification, few have combined high predictive accuracy with transparent interpretability using explainable artificial intelligence (XAI).
Objective: This study develops an interpretable ensemble learning framework for obesity risk prediction.
Methods: Two ensemble-based machine learning frameworks were implemented: (i) a stacking ensemble integrating four heterogeneous base learners—Extreme Gradient Boosting (XGBoost), linear Support Vector Machine (SVM), Bagging, and Gradient Boosting Classifier—with a LogisticRegressionCV meta-learner, and (ii) a soft voting ensemble combining XGBoost, SVM, and Bagging classifiers. Both frameworks incorporated a robust preprocessing pipeline comprising missing-value imputation, categorical encoding, feature scaling, and class rebalancing via the Borderline Synthetic Minority Over-sampling Technique (Borderline-SMOTE). Model performance was evaluated on a Saudi health dataset (n = 3,000) derived from the Arab Teens Lifestyle Study (ATLS), consisting of 19 behavioral, dietary, and anthropometric features. Baseline classifiers (Random Forest, AdaBoost, Bagging, and Light Gradient-Boosting Machine (LightGBM)) were optimized via Optuna for fair comparison. Model interpretability was achieved using Local Interpretable Model-Agnostic Explanations (LIME), providing both global and local insights into feature contributions.
Results: The soft voting ensemble attained a test accuracy of 97.81% with weighted precision, recall, and F1-score of 0.9795, 0.9781, and 0.9780, respectively. The stacking ensemble achieved an independent test accuracy of 97.64% with weighted precision, recall, and F1-score of 0.9779, 0.9764, and 0.9763, respectively. Both ensembles demonstrated excellent generalization with minimal validation–test performance gaps, confirming their robustness and reliability.
Conclusion: The proposed explainable ensemble frameworks achieved high predictive accuracy and interpretability, providing a clinically relevant foundation for applying ensemble learning with XAI in behavioral health modeling.
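The two ensemble designs described in the Methods section can be sketched with scikit-learn. This is a minimal illustration, not the authors' implementation: XGBoost is stood in for by scikit-learn's GradientBoostingClassifier, the Borderline-SMOTE rebalancing step (from the separate imbalanced-learn package) is omitted to keep the sketch dependency-free, and the synthetic dataset and hyperparameters are illustrative only.

```python
# Sketch of (i) a stacking ensemble with a LogisticRegressionCV meta-learner
# and (ii) a soft voting ensemble, over shared heterogeneous base learners.
from sklearn.datasets import make_classification
from sklearn.ensemble import (BaggingClassifier, GradientBoostingClassifier,
                              StackingClassifier, VotingClassifier)
from sklearn.linear_model import LogisticRegressionCV
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

# Synthetic stand-in for the 19-feature ATLS-derived dataset (3 risk classes).
X, y = make_classification(n_samples=1000, n_features=19, n_informative=8,
                           n_classes=3, random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, stratify=y, random_state=0)

base_learners = [
    ("gb", GradientBoostingClassifier(random_state=0)),  # XGBoost stand-in
    ("svm", make_pipeline(StandardScaler(),
                          SVC(kernel="linear", probability=True,
                              random_state=0))),         # linear SVM
    ("bag", BaggingClassifier(random_state=0)),
]

# (i) Stacking: base-learner predictions feed a LogisticRegressionCV meta-learner.
stack = StackingClassifier(estimators=base_learners,
                           final_estimator=LogisticRegressionCV(max_iter=1000))
stack.fit(X_tr, y_tr)

# (ii) Soft voting: averages the base learners' predicted class probabilities.
vote = VotingClassifier(estimators=base_learners, voting="soft")
vote.fit(X_tr, y_tr)

print(f"stacking test accuracy: {stack.score(X_te, y_te):.3f}")
print(f"voting test accuracy:   {vote.score(X_te, y_te):.3f}")
```

In the paper's pipeline, Borderline-SMOTE would be applied to the training split only (before fitting either ensemble), so that the test set reflects the original class distribution.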
Authors (4)
Norah S. Alsulami
Muhammad Sher Ramzan
Bander Alzahrani
Salhah Alsulami
Quick Access
- Publication Year
- 2026
- Language
- en
- Source Database
- CrossRef
- DOI
- 10.7717/peerj-cs.3716
- Access
- Open Access ✓