Gesundheit – gemeinsam. Kooperationstagung der Deutschen Gesellschaft für Medizinische Informatik, Biometrie und Epidemiologie (GMDS), Deutschen Gesellschaft für Sozialmedizin und Prävention (DGSMP), Deutschen Gesellschaft für Epidemiologie (DGEpi), Deutschen Gesellschaft für Medizinische Soziologie (DGMS) und der Deutschen Gesellschaft für Public Health (DGPH)

08.09. - 13.09.2024, Dresden

LeukoExpert: Early detection of leukodystrophy patients

Meeting Abstract

  • Lars Hempel - Dept. Medical Data Science, Leipzig University Medical Center, Leipzig, Germany; IMISE, Universität Leipzig, Dresden/Leipzig, Germany; Hochschule Mittweida, Mittweida, Germany
  • Julia Lier - Leipzig University Medical Center, Department of Neurology, Leipzig, Germany
  • Christa-Caroline Bergner - Leipzig University Medical Center, Department of Neurology, Leipzig, Germany
  • Wolfgang Köhler - Leipzig University Medical Center, Department of Neurology, Leipzig, Germany
  • David Hiebers - Medical Data Integration Center (meDIC), University Hospital Tübingen, Tübingen, Germany
  • Marius de arruda botelho Herr - Medical Data Integration Center (meDIC), University Hospital Tübingen, Tübingen, Germany
  • Nadine Weissert - Tübingen University Medical Center, Department of Neurology, Tübingen, Germany
  • Ludger Schöls - Universitätsklinikum Tübingen, Tübingen, Germany
  • Samuel Gröschel - Tübingen University Medical Center, Department for Pediatric and Adolescent Medicine, Tübingen, Germany
  • Sascha Welten - RWTH Aachen University, Chair of Computer Science, Aachen, Germany
  • Christopher Schippers - University Medical Center, Center for Rare Diseases and Department of Digitalization and General Practice, Aachen, Germany
  • Jan Wienströer - RWTH Aachen University, Aachen, Germany
  • Victoria Witzig - University Medical Center, Department of Neurology, Aachen, Germany
  • Andrea Maier - University Medical Center, Department of Neurology, Aachen, Germany
  • Toralf Kirsten - Institut für Medizinische Informatik, Statistik und Epidemiologie, Universität Leipzig, Leipzig, Germany; Dept. Medical Data Science, Leipzig University Medical Center, Leipzig, Germany; University Leipzig, Medical Center, Leipzig, Germany

Gesundheit – gemeinsam. Kooperationstagung der Deutschen Gesellschaft für Medizinische Informatik, Biometrie und Epidemiologie (GMDS), Deutschen Gesellschaft für Sozialmedizin und Prävention (DGSMP), Deutschen Gesellschaft für Epidemiologie (DGEpi), Deutschen Gesellschaft für Medizinische Soziologie (DGMS) und der Deutschen Gesellschaft für Public Health (DGPH). Dresden, 08.-13.09.2024. Düsseldorf: German Medical Science GMS Publishing House; 2024. DocAbstr. 910

doi: 10.3205/24gmds139, urn:nbn:de:0183-24gmds1397

Published: September 6, 2024

© 2024 Hempel et al.
This is an Open Access article distributed under the terms of the Creative Commons Attribution 4.0 License. See license information at http://creativecommons.org/licenses/by/4.0/.


Text

Introduction: Leukodystrophies (LD) are a group of rare neurological diseases with a genetic predisposition. The prevalence of LD ranges from 1:40,000 to 1:100,000, depending on the subtype, of which there are approximately 60 [1]. Because the disease is rare and, in its early stages, resembles other neurological diseases such as multiple sclerosis, which serve as differential diagnoses (DD), misdiagnosis is frequent [2]. LD patients therefore often face a long journey before the correct diagnosis is made. Several early-stage gene therapies are available that require a clear and precise diagnosis as early as possible. In Germany, there are two medical expert centers (Leipzig and Tübingen) where LD patients are examined, diagnosed, and treated. The goal of this work is to create a classification model that distinguishes between LD and DD, so that all potential LD patients can be guided to the expert centers.

Methods: In the LeukoExpert project, we designed and established a distributed LD registry that allows us to collect data on LD patients separately in both expert centers. We added a third instance at University Hospital Aachen (UK Aachen) for all DD patients. For interoperability, all registry instances use nearly identical schemas and are implemented with Research Electronic Data Capture (REDCap) [3]. The registries contain structured data such as basic demographic data, medical history, and examination data from different time points for each patient. Together, they contain more than 850 patients: 500 with LD, covering 28 LD subtypes, and 350 with DD, covering 4 differential diagnoses. The Personal Health Train (PHT) is used to analyze the captured data in a distributed mode, i.e., analysis algorithms are shipped to the data integration centers at the three sites that manage the registry instances. The incremental distributed analysis focuses on differentiating between LD and DD in the first 5 years after symptom onset. We applied Naive Bayes (NB), Linear Regression (LR), Gradient Boosting Classifier (GBC), Random Forest (RF), and a multilayer perceptron (MLP) for the binary classification. We split the data ten times into 80% training data and 20% test data.
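
The idea of an incremental distributed analysis can be illustrated with a small sketch. It is not the PHT implementation and not the authors' pipeline: the station names, feature names, records, and the functions update_model, predict, and run_train are all hypothetical, and the data are synthetic. The sketch assumes a count-based Naive Bayes model over binary symptom flags, chosen here because its sufficient statistics are simple counts that can be accumulated station by station while the raw records stay local.

```python
import math

# Synthetic toy records, one list per station; NOT real registry data.
# Each record: (binary symptom flags, label), where label is "LD" or "DD".
STATIONS = {
    "leipzig": [
        ({"spasticity": 1, "gait_disturbance": 1}, "LD"),
        ({"spasticity": 1, "gait_disturbance": 0}, "LD"),
    ],
    "tuebingen": [
        ({"spasticity": 0, "gait_disturbance": 1}, "LD"),
        ({"spasticity": 1, "gait_disturbance": 1}, "LD"),
    ],
    "aachen": [
        ({"spasticity": 0, "gait_disturbance": 0}, "DD"),
        ({"spasticity": 0, "gait_disturbance": 1}, "DD"),
        ({"spasticity": 0, "gait_disturbance": 0}, "DD"),
    ],
}


def update_model(model, local_records):
    """Executed at one station: add local counts to the travelling model."""
    for features, label in local_records:
        model["class_counts"][label] = model["class_counts"].get(label, 0) + 1
        per_class = model["feature_counts"].setdefault(label, {})
        for name, present in features.items():
            per_class[name] = per_class.get(name, 0) + int(bool(present))
    return model


def predict(model, features):
    """Return the label with the highest Naive Bayes log-posterior."""
    total = sum(model["class_counts"].values())
    scores = {}
    for label, n in model["class_counts"].items():
        score = math.log(n / total)
        for name, present in features.items():
            # Laplace smoothing keeps probabilities away from 0 and 1.
            p = (model["feature_counts"][label].get(name, 0) + 1) / (n + 2)
            score += math.log(p if present else 1.0 - p)
        scores[label] = score
    return max(scores, key=scores.get)


def run_train(route):
    """Visit the stations in order; only aggregate counts leave a station."""
    model = {"class_counts": {}, "feature_counts": {}}
    for station in route:
        model = update_model(model, STATIONS[station])
    return model


model = run_train(["leipzig", "tuebingen", "aachen"])
```

Because the counts are additive, this toy model does not depend on the order in which the stations are visited. That property does not generally hold for iteratively trained models such as an MLP, which is one reason the visiting order can matter in incremental setups.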

Results: For the binary classification (LD vs. DD), we achieved a mean accuracy of 81% for RF, 80% for GBC, 79% for LR, 61% for NB, and 57% for MLP. Furthermore, the results include a ranked list of features: symptoms such as spasticity and gait disturbance, as well as information about family medical history, are important for the decision.
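
A mean accuracy over ten random 80%/20% splits can be computed as in the following minimal sketch. It runs on synthetic data with made-up symptom rates and uses a deliberately simple one-feature classifier (a decision stump) as a stand-in; it is not any of the five models evaluated here. The names make_synthetic, fit_stump, accuracy, and repeated_holdout are hypothetical.

```python
import random

FEATURES = ("spasticity", "gait_disturbance", "family_history")


def make_synthetic(n, rng):
    """Generate n synthetic (features, label) records; the rates are invented."""
    rates = {"LD": (0.8, 0.7, 0.5), "DD": (0.2, 0.4, 0.1)}
    data = []
    for _ in range(n):
        label = "LD" if rng.random() < 0.6 else "DD"
        x = tuple(int(rng.random() < p) for p in rates[label])
        data.append((x, label))
    return data


def accuracy(feature_index, records):
    """Accuracy of the rule: predict LD if the chosen feature is present."""
    hits = sum(("LD" if x[feature_index] else "DD") == y for x, y in records)
    return hits / len(records)


def fit_stump(train):
    """Pick the single feature whose rule scores best on the training set."""
    return max(range(len(FEATURES)), key=lambda i: accuracy(i, train))


def repeated_holdout(data, n_repeats=10, test_fraction=0.2, seed=0):
    """Shuffle, split into training and test parts, fit, and score; repeat."""
    rng = random.Random(seed)
    scores = []
    for _ in range(n_repeats):
        shuffled = data[:]
        rng.shuffle(shuffled)
        n_test = int(len(shuffled) * test_fraction)
        test, train = shuffled[:n_test], shuffled[n_test:]
        scores.append(accuracy(fit_stump(train), test))
    return scores


data = make_synthetic(200, random.Random(42))
scores = repeated_holdout(data)
mean_accuracy = sum(scores) / len(scores)
```

Reporting the mean over several random splits, rather than a single split, reduces the influence of one lucky or unlucky partition; the spread of the individual scores gives a rough sense of how stable the estimate is.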

Discussion: The distributed registry approach is innovative and provides new opportunities, but it also comes with challenges. Compared with other studies that use MRI [4], our accuracy is lower, while the number of patients is much higher. The first classification results show good performance, and the extracted features are in line with medical expertise. Further improvements, such as generalization of symptoms, should be applied to increase the accuracy to a level at which the model could be used in daily routine.

Conclusion: We established a distributed registry and analyzed the data captured in it to support clinicians in distinguishing between LD and DD during their diagnostic procedures.

The authors declare that they have no competing interests.

The authors declare that a positive ethics committee vote has been obtained.


References

1. Wasserstein MP, Andriola M, Arnold G, Aron A, Duffner P, Erbe RW, et al. Clinical outcomes of children with abnormal newborn screening results for Krabbe disease in New York State. Genetics in Medicine. 2016;18(12):1235–43.
2. Costello DJ, Eichler AF, Eichler FS. Leukodystrophies: Classification, Diagnosis, and Treatment. The Neurologist. 2009;15(6):319–28. DOI: 10.1097/NRL.0b013e3181b287c8
3. Harris PA, Taylor R, Minor BL, Elliott V, Fernandez M, O’Neal L, et al. The REDCap consortium: Building an international community of software platform partners. Journal of Biomedical Informatics. 2019;95:103208. DOI: 10.1016/j.jbi.2019.103208
4. Mangeat G, Ouellette R, Wabartha M, De Leener B, Plattén M, Danylaité Karrenbauer V, et al. Machine Learning and Multiparametric Brain MRI to Differentiate Hereditary Diffuse Leukodystrophy with Spheroids from Multiple Sclerosis. Journal of Neuroimaging. 2020;30(5):674–82. DOI: 10.1111/jon.12725