Data-driven prediction of postoperative clinical outcome using the Neurologic Assessment in Neuro-Oncology (NANO) score in glioblastoma patients – the clinical usefulness of a black-box machine learning model
Published: June 4, 2021
Objective: For patients with glioblastoma (GBM), postoperative neurological deterioration can markedly compromise quality of life and reduce overall survival. The Neurologic Assessment in Neuro-Oncology (NANO) score was proposed for the standardized assessment of neurologic function, but predicting it accurately remains challenging. Artificial intelligence-based methods offer patient-tailored predictive analytics for outcomes in neurosurgery, but they often remain black-box models because they prioritize performance over interpretability. We compare a logistic regression (LR) with a neural network (NN) for clinically relevant outcome predictions and discuss their usefulness in personalized medicine.
Methods: The data comprised 229 patients (mean [SD] age 62 [11] years; 88 female) with a mean [SD] NANO score of 2.3 [2.1] preoperatively and 2.4 [2.4] postoperatively. Clinically relevant postoperative deterioration was defined as NANO ≥3. Data were randomly split into a development set (80%) and a validation set (20%). Generalizability was evaluated in 1000 bootstrap iterations on the validation set.
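The split and bootstrap evaluation described in the Methods can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' pipeline: the abstract does not say how the confidence intervals were derived, so a percentile bootstrap over validation-set predictions is assumed, and the function names, the seed, and the toy labels and scores are hypothetical.

```python
import random

def auc(labels, scores):
    """AUC in its Mann-Whitney form: the probability that a randomly chosen
    positive case is scored higher than a randomly chosen negative case."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC requires both classes")
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

def split_indices(n, dev_fraction=0.8, seed=0):
    """Random split of n patient indices into development and validation."""
    rng = random.Random(seed)
    idx = list(range(n))
    rng.shuffle(idx)
    cut = round(n * dev_fraction)
    return idx[:cut], idx[cut:]

def bootstrap_auc_ci(labels, scores, n_iter=1000, alpha=0.05, seed=0):
    """Percentile bootstrap CI (assumed method) for the validation AUC."""
    rng = random.Random(seed)
    n = len(labels)
    stats = []
    while len(stats) < n_iter:
        sample = [rng.randrange(n) for _ in range(n)]  # draw with replacement
        ys = [labels[i] for i in sample]
        if len(set(ys)) < 2:
            continue  # resample contains one class only; AUC is undefined
        stats.append(auc(ys, [scores[i] for i in sample]))
    stats.sort()
    lower = stats[int((alpha / 2) * n_iter)]
    upper = stats[int((1 - alpha / 2) * n_iter) - 1]
    return lower, upper

# Toy data only: hypothetical labels and model scores, not patient data.
dev_idx, val_idx = split_indices(229)
toy_rng = random.Random(1)
val_labels = [i % 2 for i in range(len(val_idx))]
val_scores = [0.5 * y + toy_rng.random() * 0.8 for y in val_labels]
ci_low, ci_high = bootstrap_auc_ci(val_labels, val_scores)
```

With 229 patients, an 80/20 split leaves 46 cases for validation, which is one reason the reported intervals are wide.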
Results: Predictive performance was assessed by comparing predicted with actual neurologic deterioration. The LR achieved an area under the curve (AUC) of 0.84 [95% CI 0.73-0.93] (Figure 1A), with a precision of 0.85 [0.76-0.92] and a recall of 0.83 [0.74-0.91]. The NN reached a marginally higher AUC of 0.85 [0.76-0.93] but lower precision (0.78 [0.69-0.88]) and recall (0.76 [0.67-0.87]) (Figure 1B). Judged by AUC alone, the NN is superior; judged by precision and recall, it is outperformed by the LR. Furthermore, only the LR is inherently interpretable and offers insight into the influence of the included features (Figure 1A, lower part). This makes it more useful in the clinical setting and highlights the influence of preoperative NANO, ventricular and midline infiltration, and eloquence on postoperative predictions.
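To illustrate what inherent interpretability means for an LR, the sketch below converts coefficients into odds ratios and breaks one prediction down into per-feature contributions on the log-odds scale. The abstract reports no fitted coefficients, so every number here is a hypothetical placeholder; the feature names follow the predictors named above, and their coding (NANO points, binary indicators) is an assumption.

```python
import math

# Hypothetical log-odds coefficients for illustration only.
# They are NOT the fitted values of the model in this abstract.
INTERCEPT = -2.0
COEFFICIENTS = {
    "preop_nano": 0.6,                # per point of preoperative NANO
    "ventricular_infiltration": 0.9,  # assumed binary indicator (0 or 1)
    "midline_infiltration": 0.7,      # assumed binary indicator (0 or 1)
    "eloquent_location": 0.8,         # assumed binary indicator (0 or 1)
}

def odds_ratios(coefs):
    """exp(coefficient) is the multiplicative change in the odds of
    deterioration per unit increase in the feature, others held fixed."""
    return {name: math.exp(beta) for name, beta in coefs.items()}

def explain(patient, coefs=COEFFICIENTS, intercept=INTERCEPT):
    """Return the predicted probability and each feature's additive
    contribution to the log-odds for a single patient."""
    contributions = {name: coefs[name] * patient[name] for name in coefs}
    log_odds = intercept + sum(contributions.values())
    probability = 1.0 / (1.0 + math.exp(-log_odds))
    return probability, contributions

# Hypothetical patient, not taken from the study cohort.
patient = {
    "preop_nano": 3,
    "ventricular_infiltration": 1,
    "midline_infiltration": 0,
    "eloquent_location": 1,
}
probability, contributions = explain(patient)
ratios = odds_ratios(COEFFICIENTS)
```

Because the log-odds are a plain sum, each contribution can be read off directly, which is the kind of insight an NN does not provide without additional explanation methods.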
Conclusion: An AI-based neural network was successfully applied to predict postoperative NANO after GBM resection. Although it maximizes performance, the NN lacks measures of interpretability and can generally be regarded as a black-box model. In contrast, LR offers insight into how the model arrives at its predictions and performs almost equally well. Because high-stakes clinical decisions require both accuracy and an understanding of how a prediction is made, the usefulness of black-box models appears limited, and they need further development before successful clinical application.