A machine learning and feature selection pipeline for sepsis prediction on intensive care unit admission
| Published: | September 6, 2024 |
|---|---|
Text
Introduction: Sepsis is a significant complication and leading cause of death in Intensive Care Units (ICUs). Advanced predictive approaches are sought to improve patient outcomes.
This study investigates the ability of machine learning (ML) algorithms to predict sepsis from single-time-point data collected on admission to a surgical ICU. It also assesses whether the number of features needed for accurate predictions can be reduced.
Methods: We analyzed the electronic medical records of 928 patients admitted to University Medical Center Mannheim’s surgical ICU between 2016 and 2022. We included 52 features from comprehensive routine clinical monitoring and documentation, comprising vital signs, lab results, clinical scores, and systemic inflammatory response syndrome (SIRS) descriptors [1], all without missing values. As the outcome, we used sepsis labels from the Ground Truth for Sepsis Questionnaire [2], which expert intensivists filled in daily.
We explored the potential of five established supervised ML algorithms to predict sepsis: Random Forest, Support Vector Machine, Extreme Gradient Boosting, Ridge Regression, and Logistic Regression. These algorithms have previously shown good classification performance, explainability, and interpretability.
We set aside 10% of the data for final validation and then performed 10-fold cross-validation for feature selection. These steps yielded two feature sets: the 10 features selected most often across all algorithms, and the optimal feature set chosen by the single selection strategy that achieved the highest area under the precision-recall curve (AUPRC) among the folds.
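The splitting and tallying steps described above can be sketched with the Python standard library. This is a minimal illustration, not the authors' code: the function names (`stratified_holdout`, `kfold_indices`, `top_k_most_selected`), the use of stratification, the random seed, the toy cohort, and the feature names are all assumptions. The abstract does not state how the split was drawn or which selection method ran in each fold.

```python
import random
from collections import Counter


def stratified_holdout(labels, holdout_frac=0.10, seed=0):
    """Split sample indices into (train, holdout), preserving class balance.

    Stratification is an assumption; the abstract only says 10% was held out.
    """
    rng = random.Random(seed)
    train, holdout = [], []
    for cls in sorted(set(labels)):
        idx = [i for i, y in enumerate(labels) if y == cls]
        rng.shuffle(idx)
        n_hold = round(len(idx) * holdout_frac)
        holdout.extend(idx[:n_hold])
        train.extend(idx[n_hold:])
    return sorted(train), sorted(holdout)


def kfold_indices(indices, k=10, seed=0):
    """Yield (fit, validation) index lists for k-fold cross-validation."""
    rng = random.Random(seed)
    shuffled = list(indices)
    rng.shuffle(shuffled)
    folds = [shuffled[i::k] for i in range(k)]
    for i in range(k):
        fit = [j for f, fold in enumerate(folds) if f != i for j in fold]
        yield fit, folds[i]


def top_k_most_selected(selections, k=10):
    """Return the k features chosen most often across all (algorithm, fold) runs."""
    counts = Counter(feature for chosen in selections for feature in chosen)
    # Sort by descending count, then by name, so ties resolve deterministically.
    ranked = sorted(counts.items(), key=lambda item: (-item[1], item[0]))
    return [feature for feature, _ in ranked[:k]]


# Toy cohort (hypothetical): 100 patients, 20 of them labeled septic.
labels = [1] * 20 + [0] * 80
train_idx, holdout_idx = stratified_holdout(labels)
folds = list(kfold_indices(train_idx, k=10))

# Hypothetical per-run selections; real ones would come from each algorithm and fold.
selections = [["lactate", "sofa_resp"], ["lactate", "crp"], ["lactate", "sofa_resp"]]
top_features = top_k_most_selected(selections, k=2)
```

Drawing the hold-out before any cross-validation keeps the final test data out of the feature selection step, which is the point of the ordering described in the text.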
To evaluate the two resulting feature sets, we retrained the ML algorithm with the highest average AUPRC on the full training dataset and tested it on the previously separated 10% hold-out. We measured the performance of both models using the area under the receiver operating characteristic curve (AUROC), AUPRC, the confusion matrix, and the true positive rate (TPR) with respect to time of sepsis onset.
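The evaluation metrics named above can be computed from predicted scores as in the following sketch. It is a plain-Python illustration under stated assumptions: AUROC is computed with the rank-sum (Mann-Whitney) formulation, AUPRC is approximated by average precision, and the 0.5 decision threshold and toy scores are hypothetical. The abstract does not say which estimators or threshold the authors used.

```python
def auroc(y_true, scores):
    """Area under the ROC curve via the rank-sum (Mann-Whitney U) formulation."""
    pos = [s for y, s in zip(y_true, scores) if y == 1]
    neg = [s for y, s in zip(y_true, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUROC needs at least one positive and one negative sample")
    # Count positive/negative pairs ranked correctly; a tie counts one half.
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def auprc(y_true, scores):
    """Average precision as an estimate of the area under the PR curve.

    Simplification: tied scores are processed in sorted order, not grouped.
    """
    n_pos = sum(y_true)
    if n_pos == 0:
        raise ValueError("AUPRC needs at least one positive sample")
    ranked = sorted(zip(scores, y_true), key=lambda pair: -pair[0])
    tp = 0
    total = 0.0
    for rank, (_, y) in enumerate(ranked, start=1):
        if y == 1:
            tp += 1
            total += tp / rank  # precision at this recall step
    return total / n_pos


def confusion_matrix(y_true, y_pred):
    """Count true/false negatives and positives for binary labels."""
    cm = {"tn": 0, "fp": 0, "fn": 0, "tp": 0}
    for y, p in zip(y_true, y_pred):
        key = ("t" if y == p else "f") + ("p" if p == 1 else "n")
        cm[key] += 1
    return cm


def true_positive_rate(cm):
    """TPR (sensitivity): share of septic patients flagged by the model."""
    return cm["tp"] / (cm["tp"] + cm["fn"])


# Toy hold-out (hypothetical labels and scores).
y_true = [0, 0, 1, 1]
scores = [0.1, 0.4, 0.35, 0.8]
y_pred = [1 if s >= 0.5 else 0 for s in scores]  # 0.5 threshold is an assumption

cm = confusion_matrix(y_true, y_pred)
tpr = true_positive_rate(cm)
```

AUPRC depends on the share of positive cases, which makes it a natural selection criterion when sepsis is the minority class, while AUROC is unaffected by class balance.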
Results: The 10 most frequently selected features included laboratory tests, dimensions of the Sequential Organ Failure Assessment (SOFA) score, and one SIRS descriptor. The second selected feature set contained laboratory tests, one SOFA score dimension, and two SIRS descriptors. On the hold-out, the full feature set predicted sepsis with an AUROC of 0.7039 ± 0.020 (standard deviation). The 10 most frequently selected features yielded a higher AUROC of 0.7308 ± 0.018, and the second selected set yielded 0.7493 ± 0.015. The mean AUPRC was similar for all three feature sets. The highest overall TPR was 74%; for patients who developed sepsis within the first 4 days, it was higher, at 78%.
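A TPR restricted to patients with early sepsis onset, as reported above, could be computed along the following lines. This is a hedged sketch: the function name `tpr_by_onset_window`, the encoding of onset as days since ICU admission, the inclusive 4-day cutoff, and the toy values are assumptions, since the abstract does not describe how onset time was handled.

```python
def tpr_by_onset_window(y_pred, onset_day, max_day=None):
    """TPR among septic patients, optionally limited to onset within max_day days.

    onset_day[i] is the day of sepsis onset counted from ICU admission,
    or None for patients who never developed sepsis (hypothetical encoding).
    """
    hits = 0
    total = 0
    for pred, day in zip(y_pred, onset_day):
        if day is None:
            continue  # non-septic patients do not enter the TPR
        if max_day is not None and day > max_day:
            continue  # onset later than the window of interest
        total += 1
        hits += pred == 1
    return hits / total if total else float("nan")


# Toy predictions and onset days (hypothetical values).
y_pred = [1, 1, 0, 1, 0, 0, 1]
onset_day = [1, 2, 3, 4, 6, 9, None]

overall_tpr = tpr_by_onset_window(y_pred, onset_day)
early_tpr = tpr_by_onset_window(y_pred, onset_day, max_day=4)
```

In this toy example the early-onset TPR exceeds the overall TPR because the missed cases cluster among late onsets, the same pattern the results describe.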
Discussion: The classification performance of the five ML algorithms was similar on average; therefore, we cannot suggest a single best model for sepsis prediction. We expect the performance of our selected feature sets to differ in external validation, so they serve as a starting point for further model development. Nevertheless, we show that our feature selection strategies improve correct sepsis prediction and reduce the number of measurements required for classification. Our models perform better for short-term predictions, which suggests that repeated parameter measurements and reclassification during ICU care are needed to better recognize a dynamically changing sepsis risk.
Conclusion: Predicting sepsis from on-admission ICU data remains an important challenge, yet a data-driven subset of features offers the prospect of earlier intervention in, or closer monitoring of, patients at high risk of sepsis.
The authors declare that they have no competing interests.
The authors declare that an ethics committee vote is not required.
References
1. Lindner HA, Balaban U, Sturm T, Weiss C, Thiel M, Schneider-Lindner V. An algorithm for systemic inflammatory response syndrome criteria-based prediction of sepsis in a polytrauma cohort. Crit Care Med. 2016;44(12):2199-207.
2. Lindner HA, Schamoni S, Kirschning T, Worm C, Hahn B, Centner FS, et al. Ground truth labels challenge the validity of sepsis consensus definitions in critical illness. J Transl Med. 2022;20(1):27.
