DOAJ Open Access 2026

A machine learning model for mortality risk prediction of sepsis patients based on the medical information mart for intensive care III database

Yidi Shao, Kangjun Wang, Yu Ma

Abstract

Sepsis poses a serious threat to patient survival, making timely risk assessment crucial. Predicting in-hospital mortality from clinical indicators can support better clinical decisions. Previous studies have focused on classifier selection but lacked a comprehensive analysis of feature selection and data preprocessing. This study optimized machine learning models for sepsis mortality prediction by (1) comprehensively comparing feature selection and classification methods to identify the best combination, (2) building a high-performing model with fewer features, and (3) identifying key clinically relevant indicators.

Methods: Using the MIMIC-III sepsis cohort, we conducted a comprehensive analysis to determine the optimal model, covering data preprocessing, data balancing, classifier selection, and feature selection. Feature importance was further analyzed to identify the key predictors of in-hospital mortality.

Results: The proposed Synthetic Minority Oversampling Technique-Random Forest Recursive Feature Elimination-Extreme Gradient Boosting (SMOTE-(RF-RFE)-XGB) model achieved a mean Area Under the Curve (AUC) of 0.8507 while reducing the number of features from 78 to 39. Compared with the other feature selection methods evaluated in this study and those reported in related literature, Random Forest Recursive Feature Elimination (RF-RFE) offered the best trade-off between accuracy, feature compactness, and stability. Additionally, feature importance rankings consistently identified Acute Physiology Score III (APS III), Ventilation on First Day, and Depression as the three most influential predictors, in addition to Length of Stay in ICU and Hospital.

Conclusions: This study addresses key gaps by conducting a comprehensive evaluation of classifiers and feature selection methods for predicting in-hospital mortality in patients with sepsis. The proposed SMOTE-(RF-RFE)-XGB model achieved high predictive performance and stability with a compact feature set. APS III, Ventilation on First Day, and Depression were consistently identified as key predictors, in addition to Length of Stay in ICU and Hospital.
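The abstract names two preprocessing ideas, SMOTE oversampling and recursive feature elimination, without showing how they work. The following is a minimal standard-library sketch of both ideas, not the authors' implementation. The function names, the feature names, and the toy scores are hypothetical placeholders chosen for illustration. In the study the ranking comes from random forest importances and the final classifier is XGBoost, neither of which is reproduced here. The sketch also assumes at least two minority rows with purely numeric features.

```python
import math
import random


def smote_oversample(minority, n_new, k=5, seed=0):
    """Create n_new synthetic minority rows by interpolating toward near neighbours."""
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        # k nearest minority neighbours of the chosen row (the row itself excluded)
        others = [row for row in minority if row is not base]
        neighbours = sorted(others, key=lambda row: math.dist(base, row))[:k]
        target = rng.choice(neighbours)
        gap = rng.random()  # position along the segment from base to target
        synthetic.append([b + gap * (t - b) for b, t in zip(base, target)])
    return synthetic


def recursive_feature_elimination(features, score_fn, n_keep):
    """Drop the lowest-scoring feature, re-score, and repeat until n_keep remain."""
    kept = list(features)
    while len(kept) > n_keep:
        scores = score_fn(kept)  # re-rank using only the surviving features
        weakest = min(kept, key=lambda name: scores[name])
        kept.remove(weakest)
    return kept


# Toy usage with made-up numbers (not data or results from the study).
minority = [[1.0, 2.0], [2.0, 3.0], [3.0, 1.0]]
new_rows = smote_oversample(minority, n_new=4, k=2)

# Hypothetical fixed scores standing in for random forest importances.
toy_scores = {"aps_iii": 0.9, "vent_first_day": 0.6, "heart_rate": 0.2, "noise": 0.05}
selected = recursive_feature_elimination(
    list(toy_scores),
    lambda kept: {name: toy_scores[name] for name in kept},
    n_keep=2,
)
```

Each synthetic row lies on a line segment between two real minority rows, so oversampling adds no values outside the observed range. In a real pipeline the scoring callback would refit a model on the surviving features at every step, which is what makes the elimination recursive rather than a one-shot ranking.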

Citation Format

Shao, Y., Wang, K., Ma, Y. (2026). A machine learning model for mortality risk prediction of sepsis patients based on the medical information mart for intensive care III database. https://doi.org/10.1016/j.engmed.2025.100118

Journal Information
Publication year: 2026
Source database: DOAJ
DOI: 10.1016/j.engmed.2025.100118
Access: Open Access