gms | German Medical Science

66. Jahrestagung der Deutschen Gesellschaft für Medizinische Informatik, Biometrie und Epidemiologie e. V. (GMDS), 12. Jahreskongress der Technologie- und Methodenplattform für die vernetzte medizinische Forschung e. V. (TMF)

26. - 30.09.2021, online

Developing a machine learning workflow to identify noisy data in early Alzheimer’s disease detection based on Shapley values

Meeting Abstract

  • Louise Bloch - Fachhochschule Dortmund, Fachbereich Informatik, Dortmund, Germany; Uniklinikum Essen, Institut für Medizinische Informatik, Biometrie und Epidemiologie (IMIBE), Essen, Germany
  • Christoph M. Friedrich - Fachhochschule Dortmund, Fachbereich Informatik, Dortmund, Germany; Uniklinikum Essen, Institut für Medizinische Informatik, Biometrie und Epidemiologie (IMIBE), Essen, Germany

Deutsche Gesellschaft für Medizinische Informatik, Biometrie und Epidemiologie. 66. Jahrestagung der Deutschen Gesellschaft für Medizinische Informatik, Biometrie und Epidemiologie e. V. (GMDS), 12. Jahreskongress der Technologie- und Methodenplattform für die vernetzte medizinische Forschung e.V. (TMF). sine loco [digital], 26.-30.09.2021. Düsseldorf: German Medical Science GMS Publishing House; 2021. DocAbstr. 193

doi: 10.3205/21gmds018, urn:nbn:de:0183-21gmds0183

Published: 24 September 2021

© 2021 Bloch et al.
This is an Open Access article distributed under the terms of the Creative Commons Attribution 4.0 License.



Introduction: Identifying whether subjects with Mild Cognitive Impairment (MCI) will go on to develop Alzheimer’s Disease (AD) is important for recruiting subjects for therapy studies [1]. Machine Learning (ML) can help to improve early AD detection [2], [3]. However, AD is a heterogeneous disease [4], and the variability of AD datasets is further increased by multicentric study designs, varying Magnetic Resonance Imaging (MRI) acquisition protocols, and errors in MRI preprocessing. This variability increases the risk of overfitting for ML models, which may fail to differentiate between disease heterogeneity and noise [5]. This research investigates whether an automatic data analysis based on Shapley values [6] can identify subjects with noisy data, exclude them from the training set, and thereby improve ML models. A similar approach [7] was previously applied to pneumonia detection, where it improved the results.

Methods: An ML workflow for AD detection was implemented in Python [8]. All models distinguished between stable MCI (sMCI) and progressive MCI (pMCI) subjects using age, gender, the number of Apolipoprotein E ε4 (ApoEε4) alleles, three cognitive tests, and MRI volumes. The training set included 467 subjects (260 sMCI, 207 pMCI) of the Alzheimer’s Disease Neuroimaging Initiative (ADNI) [9]. A Random Forest (RF) [10] feature selection reduced the MRI feature set. Data Shapley [11] was used to identify subjects with noisy data. Model selection was based on an independent validation dataset containing 108 ADNI subjects (60 sMCI, 48 pMCI). RF and eXtreme Gradient Boosting (XGBoost) [12] classifiers performed the final classification. All models were evaluated on an independent ADNI test set containing 144 subjects (80 sMCI, 64 pMCI) and on an external subset of the Australian Imaging, Biomarker and Lifestyle Flagship Study of Ageing (AIBL) [13] containing 28 subjects (16 sMCI, 12 pMCI). Kernel SHapley Additive exPlanations (SHAP) [14] were used to interpret these black-box models.
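The idea behind the data Shapley step can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes a toy one-dimensional dataset, and it swaps in a simple nearest-centroid classifier where the abstract uses RF and XGBoost, so that the example runs with the Python standard library alone. It uses plain Monte Carlo permutation sampling without the truncation step described for data Shapley [11]. The function names nearest_centroid_accuracy and monte_carlo_data_shapley and all data values are hypothetical. Each training subject is valued by its average marginal contribution to validation accuracy over random orderings of the training set.

```python
import random


def nearest_centroid_accuracy(train, val):
    """Validation accuracy of a 1-D nearest-centroid classifier (stand-in model).

    Returns 0.5 (chance level for two balanced classes) when a class is
    missing from `train`, so empty or one-class subsets still get a score.
    """
    by_label = {0: [], 1: []}
    for x, y in train:
        by_label[y].append(x)
    if not by_label[0] or not by_label[1]:
        return 0.5
    c0 = sum(by_label[0]) / len(by_label[0])
    c1 = sum(by_label[1]) / len(by_label[1])
    correct = 0
    for x, y in val:
        predicted = 1 if abs(x - c1) < abs(x - c0) else 0
        correct += predicted == y
    return correct / len(val)


def monte_carlo_data_shapley(train, val, n_permutations=500, seed=0):
    """Estimate each training point's Shapley value by permutation sampling."""
    rng = random.Random(seed)
    n = len(train)
    values = [0.0] * n
    for _ in range(n_permutations):
        order = list(range(n))
        rng.shuffle(order)
        subset = []
        prev_score = nearest_centroid_accuracy(subset, val)
        for idx in order:
            # Marginal contribution of subject idx given its predecessors.
            subset.append(train[idx])
            score = nearest_centroid_accuracy(subset, val)
            values[idx] += (score - prev_score) / n_permutations
            prev_score = score
    return values


# Toy data (hypothetical): (feature, label) with 0 = sMCI-like, 1 = pMCI-like.
# The last training subject is deliberately noisy: extreme feature, label 0.
train = [(0.0, 0), (0.5, 0), (1.0, 0), (3.0, 1), (3.5, 1), (4.0, 1), (6.0, 0)]
val = [(0.2, 0), (0.8, 0), (1.3, 0), (2.9, 1), (3.3, 1), (3.9, 1)]
NOISY = 6

values = monte_carlo_data_shapley(train, val)
ranking = sorted(range(len(train)), key=lambda i: values[i])  # smallest first
for i in ranking:
    print(f"subject {i}: feature={train[i][0]}, label={train[i][1]}, value={values[i]:+.3f}")

# Noise-reduced training set: drop the subject with the smallest value.
noise_reduced = [train[i] for i in ranking[1:]]
```

In this sketch the values sum to the difference between the accuracy of the full training set and the chance-level score of the empty set, which is a quick sanity check on any such estimator. How many low-valued subjects to exclude is a separate model selection decision, which the abstract bases on the validation set.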

Results: The RF feature selection chose MRI volumes that were previously associated with AD [15] (e.g., hippocampus, entorhinal cortex, amygdala). Data Shapley was compared to random and Leave-One-Out [16] exclusion and outperformed both methods as well as the base models trained on the entire training set. On the ADNI test set, the RF models trained without the 134 training subjects with the smallest data Shapley values outperformed the base models, which reached a mean accuracy of 62.64 %, by 5.76 % (3.61 percentage points). Data Shapley values were associated with features that are important in AD detection [15], [17], [18]. sMCI subjects with poor cognitive test scores, presence of ApoEε4 alleles, and small brain volumes received small data Shapley values. The opposite pattern was observed for the pMCI group. SHAP summary plots mostly showed less complex ML models for the noise-reduced training sets.
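The relation between the relative improvement and the improvement in percentage points can be checked with a short calculation. The accuracy of the noise-reduced RF models and the size of the reduced training set are not stated in the abstract. The values computed below are only implied by the reported figures, and the variable names are illustrative.

```python
base_accuracy = 62.64   # mean accuracy of the base RF models on the ADNI test set, in %
absolute_gain = 3.61    # reported improvement, in percentage points
n_train = 467           # subjects in the full training set
n_excluded = 134        # subjects with the smallest data Shapley values

relative_gain = absolute_gain / base_accuracy * 100   # relative improvement, in %
implied_accuracy = base_accuracy + absolute_gain      # implied accuracy, in %
n_remaining = n_train - n_excluded                    # implied reduced training set size
excluded_share = n_excluded / n_train * 100           # share of training subjects excluded, in %

print(f"relative gain:    {relative_gain:.2f} %")
print(f"implied accuracy: {implied_accuracy:.2f} %")
print(f"remaining:        {n_remaining} subjects ({excluded_share:.2f} % excluded)")
```

The relative gain reproduces the reported 5.76 %, which confirms that the two figures in the text describe the same improvement.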

Discussion: Noise reduction using data Shapley values improved the trained ML models. However, this method requires a careful trade-off between training performance and generalizability, and between overfitting and selection bias. It is therefore important to replicate these results on larger AD datasets.

Conclusion: Overall, data Shapley was successfully applied to early AD detection and improved classification accuracy.

The authors declare that they have no competing interests.

The authors declare that an ethics committee vote is not required.


[1] Sperling RA, Aisen PS, Beckett LA, Bennett DA, Craft S, Fagan AM, et al. Toward defining the preclinical stages of Alzheimer’s disease: Recommendations from the National Institute on Aging-Alzheimer’s Association workgroups on diagnostic guidelines for Alzheimer’s disease. Alzheimer’s & Dementia. 2011;7(3):280–92. DOI: 10.1016/j.jalz.2011.03.003
[2] Pellegrini E, Ballerini L, Valdes Hernandez MDC, Chappell FM, González‐Castro V, Anblagan D, et al. Machine learning of neuroimaging for assisted diagnosis of cognitive impairment and dementia: A systematic review. Alzheimer’s & Dementia: Diagnosis, Assessment & Disease Monitoring. 2018;10(1):519–35. DOI: 10.1016/j.dadm.2018.07.004
[3] Jo T, Nho K, Saykin AJ. Deep learning in Alzheimer’s disease: Diagnostic classification and prognostic prediction using neuroimaging data. Frontiers in Aging Neuroscience. 2019;11:220. DOI: 10.3389/fnagi.2019.00220
[4] Ferreira D, Verhagen C, Hernández-Cabrera JA, Cavallin L, Guo CJ, Ekman U, et al. Distinct subtypes of Alzheimer’s disease based on patterns of brain atrophy: Longitudinal trajectories and clinical applications. Scientific Reports. 2017;7(1). DOI: 10.1038/srep46263
[5] Hawkins DM. The problem of overfitting. Journal of Chemical Information and Computer Sciences. 2004;44(1):1–12. DOI: 10.1021/ci0342472
[6] Shapley L. 17. A value for n-person games. In: Kuhn H, Tucker AW, editors. Contributions to the Theory of Games. AM-28. Princeton: Princeton University Press; 1952. p. 307–17. DOI: 10.1515/9781400881970-018
[7] Tang S, Ghorbani A, Yamashita R, Rehman S, Dunnmon JA, Zou J, et al. Data valuation for medical imaging using Shapley value and application to a large-scale chest X-ray dataset. Scientific Reports. 2021;11:8366. DOI: 10.1038/s41598-021-87762-2
[8] Van Rossum G, Drake FL. Python 3 reference manual [Internet]. Python Software Foundation; 2009 [cited 2021-05-07].
[9] Petersen RC, Aisen PS, Beckett LA, Donohue MC, Gamst AC, Harvey DJ, et al. Alzheimer’s Disease Neuroimaging Initiative (ADNI): Clinical characterization. Neurology. 2009;74(3):201–9. DOI: 10.1212/wnl.0b013e3181cb3e25
[10] Breiman L. Random Forests. Machine Learning. 2001;45:5–32. DOI: 10.1023/A:1010933404324
[11] Ghorbani A, Zou J. Data Shapley: Equitable valuation of data for machine learning. In: Chaudhuri K, Salakhutdinov R, editors. Proceedings of the 36th International Conference on Machine Learning (ICML 2019); 2019 June 9-15; Long Beach, California, US. 2019. (PMLR; 97). p. 2242–51.
[12] Chen T, Guestrin C. XGBoost. In: Proceedings of the 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (KDD 2016); 2016 Aug 13-17; San Francisco, California, US. New York: ACM; 2016. p. 785–94. DOI: 10.1145/2939672.2939785
[13] Ellis KA, Bush AI, Darby D, De Fazio D, Foster J, Hudson P, et al. The Australian Imaging, Biomarkers and Lifestyle (AIBL) study of aging: Methodology and baseline characteristics of 1112 individuals recruited for a longitudinal study of Alzheimer’s disease. International Psychogeriatrics. 2009;21(4):672–87. DOI: 10.1017/S1041610209009405
[14] Lundberg SM, Lee SI. A unified approach to interpreting model predictions. In: Guyon I, Luxburg UV, Bengio S, Wallach H, Fergus R, Vishwanathan S, Garnett R, editors. Advances in Neural Information Processing Systems (NIPS 2017); 2017 Dec 4-9; Long Beach, California, US. Curran Associates; 2017. p. 4765–74.
[15] Frisoni GB, Fox NC, Jack CR, Scheltens P, Thompson PM. The clinical use of structural MRI in Alzheimer disease. Nature Reviews Neurology. 2010;6(2):67–77. DOI: 10.1038/nrneurol.2009.215
[16] Cook RD. Detection of influential observation in linear regression. Technometrics. 1977;1:15. DOI: 10.2307/1268249
[17] Elias-Sonnenschein LS, Viechtbauer W, Ramakers IHGB, Verhey FRJ, Visser PJ. Predictive value of APOE-ε4 allele for progression from MCI to AD-type dementia: A meta-analysis. Journal of Neurology, Neurosurgery & Psychiatry. 2011;82(10):1149–56. DOI: 10.1136/jnnp.2010.231555
[18] Arevalo-Rodriguez I, Smailagic N, Roqué i Figuls M, Ciapponi A, Sanchez-Perez E, Giannakou A, et al. Mini-Mental State Examination (MMSE) for the detection of Alzheimer’s disease and other dementias in people with mild cognitive impairment (MCI). Cochrane Database of Systematic Reviews. 2015. DOI: 10.1002/14651858.cd010783.pub2