Explainable machine learning integrating biochemical and metabolomic biomarkers with conventional clinical factors improves chronic kidney disease prediction and risk stratification
Abstract
Background: Chronic kidney disease (CKD) is a leading cause of morbidity and mortality worldwide, yet existing risk models have limited ability to identify individuals at high long-term risk. Whether integrating circulating biochemical and metabolomic biomarkers can improve CKD prediction and risk stratification remains unclear.

Methods: We included 233,589 UK Biobank participants without CKD at baseline. Biomarkers were screened using multiple feature selection strategies. Predictive performance and effect sizes were evaluated using Cox proportional hazards models. CatBoost and SHAP were applied to identify key predictors, derive interpretable binary thresholds, and construct a simplified biomarker risk score (BRS). Relative and absolute CKD risks were assessed across tertiles of the BRS. Model discrimination and calibration were evaluated in an England development cohort and a geographically independent validation cohort from Scotland and Wales.

Results: A combined biochemical–metabolomic signature (BioMet) showed good discrimination for incident CKD and CKD-related mortality and consistently outperformed conventional risk models in both cohorts. Key risk-elevating biomarkers included cystatin C, HbA1c, CRP, and urea, whereas higher eGFR, M-VLDL-CE, histidine, and IGF-1 were inversely associated with CKD risk. A SHAP-derived Top10 BRS (Top10BRS) effectively stratified individuals into distinct risk groups. Compared with the lowest tertile, participants in the highest tertile had a substantially higher risk of incident CKD (HR: 3.73) and CKD-related mortality (HR: 10.40). Discrimination improved after adding Top10BRS to conventional models, while calibration and prediction error remained stable. Similar patterns were observed in the validation cohort.

Conclusion: Integrating biochemical and metabolomic biomarkers with conventional clinical predictors improves long-term prediction and risk stratification for CKD. An interpretable SHAP-derived BRS enables robust identification of individuals at elevated risk and may support earlier risk assessment and personalized prevention strategies for CKD.
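
The scoring and stratification steps named in the Methods (binary thresholds, a summed risk score, tertiles) can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract reports neither the cut-offs nor how the points are combined, so the three biomarkers shown, the values in `THRESHOLDS`, the unweighted point count, and the function names `risk_score` and `tertile_labels` are all assumptions made for illustration. The CatBoost and SHAP steps that would produce the thresholds are not shown.

```python
from statistics import quantiles

# Hypothetical cut-offs for illustration only; the abstract does not report them.
# "high": values at or above the cut-off add one risk point.
# "low":  values at or below the cut-off add one risk point.
THRESHOLDS = {
    "cystatin_c": (1.0, "high"),
    "hba1c": (40.0, "high"),
    "egfr": (80.0, "low"),
}


def risk_score(person):
    """Count how many biomarkers fall on the risk side of their cut-off."""
    score = 0
    for name, (cutoff, direction) in THRESHOLDS.items():
        value = person[name]
        if direction == "high" and value >= cutoff:
            score += 1
        elif direction == "low" and value <= cutoff:
            score += 1
    return score


def tertile_labels(scores):
    """Label each score 1 (lowest) to 3 (highest) using the cohort's tertile cut points."""
    low_cut, high_cut = quantiles(scores, n=3)
    return [1 if s <= low_cut else 2 if s <= high_cut else 3 for s in scores]


# Toy cohort with made-up values, not study data.
cohort = [
    {"cystatin_c": 0.8, "hba1c": 35.0, "egfr": 95.0},
    {"cystatin_c": 1.1, "hba1c": 36.0, "egfr": 90.0},
    {"cystatin_c": 1.2, "hba1c": 45.0, "egfr": 70.0},
]
scores = [risk_score(p) for p in cohort]
```

Because a score built from a few binary indicators takes only a handful of distinct values, ties at the cut points can leave the three groups unequal in size; a real analysis would need to state how such ties are handled.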
Authors (9)
Jing Ma
Ruiyan Liu
Xin Feng
Xing Li
Jielin Huang
Lu Zhang
Jian Gao
Guifang Hu
Xiru Zhang
Quick Access
- Year of publication: 2026
- Source database: DOAJ
- DOI: 10.1186/s12882-026-04781-9
- Access: Open Access ✓