gms | German Medical Science

68. Jahrestagung der Deutschen Gesellschaft für Medizinische Informatik, Biometrie und Epidemiologie e. V. (GMDS)

17.09. - 21.09.23, Heilbronn

Framework for Federated Artificial Intelligence for the Optimization of Pancreatic Cancer Treatment

Meeting Abstract

  • Youngjun Park - Department of Medical Informatics, University Medical Center Göttingen, Göttingen, Germany
  • Jonas Hügel - Department of Medical Informatics, University Medical Center Göttingen, Göttingen, Germany
  • Nils Beyer - Department of Medical Informatics, University Medical Center Göttingen, Göttingen, Germany
  • Sophia Rheinländer - Department of Medical Informatics, University Medical Center Göttingen, Göttingen, Germany
  • Hryhorii Chereda - Department of Medical Bioinformatics, University Medical Center Göttingen, Göttingen, Germany
  • Lisa Fricke - Rechts der Isar Hospital, Technical University Munich, Munich, Germany; Collaborative Research Centre 1321 - Modelling and Targeting Pancreatic Cancer, Munich, Germany
  • Martin Middeke - Comprehensive Cancer Center Marburg am Universitätsklinikum Gießen und Marburg GmbH, Marburg, Germany; Clinical Research Unit 325 "The Clinical Relevance of the Tumor Microenvironment - Interactions in Pancreatic Ductal Adenocarcinoma PDAC", Marburg, Germany
  • Max Reichert - Rechts der Isar Hospital, Technical University Munich, Munich, Germany; Collaborative Research Centre 1321 - Modelling and Targeting Pancreatic Cancer, Munich, Germany
  • Malte Buchholz - Klinik für Innere Medizin/SP Gastroenterologie, Zentrum für Tumor- und Immunbiologie, Philipps-Universität Marburg, Marburg, Germany
  • Matthias Lauth - Klinik für Innere Medizin/SP Gastroenterologie, Zentrum für Tumor- und Immunbiologie, Philipps-Universität Marburg, Marburg, Germany; Clinical Research Unit 325 "The Clinical Relevance of the Tumor Microenvironment - Interactions in Pancreatic Ductal Adenocarcinoma PDAC", Marburg, Germany
  • Günter Schneider - Department of Gastroenterology, University Medical Center Göttingen, Göttingen, Germany
  • Elisabeth Hessmann - Department of Gastroenterology, University Medical Center Göttingen, Göttingen, Germany
  • Tim Beißbarth - Department of Medical Bioinformatics, University Medical Center Göttingen, Göttingen, Germany; Clinical Research Unit 5002, KFO5002, University Medical Center Göttingen, Göttingen, Germany; Campus Institute Data Science, Georg-August-University Göttingen, Göttingen, Germany
  • Ulrich Sax - Department of Medical Informatics, University Medical Center Göttingen, Göttingen, Germany; Campus Institute Data Science, Georg-August-University Göttingen, Göttingen, Germany
  • Anne-Christin Hauschild - Department of Medical Informatics, University Medical Center Göttingen, Göttingen, Germany; Campus Institute Data Science, Georg-August-University Göttingen, Göttingen, Germany

Deutsche Gesellschaft für Medizinische Informatik, Biometrie und Epidemiologie. 68. Jahrestagung der Deutschen Gesellschaft für Medizinische Informatik, Biometrie und Epidemiologie e. V. (GMDS). Heilbronn, 17.-21.09.2023. Düsseldorf: German Medical Science GMS Publishing House; 2023. DocAbstr. 334

doi: 10.3205/23gmds006, urn:nbn:de:0183-23gmds0069

Published: September 15, 2023

© 2023 Park et al.
This is an Open Access article distributed under the terms of the Creative Commons Attribution 4.0 License. See license information at http://creativecommons.org/licenses/by/4.0/.


Text

Introduction: AI models have been shown to be highly effective in predicting medical phenotypes such as disease prognosis or treatment response, often outperforming standard inference [1], [2]. However, studies integrating medical registry and omics data are often limited by the sample size and systematic biases of single cohorts, or by data in multi-cohort studies that are not independent and identically distributed (non-IID) and are partially non-overlapping (PNO). The resulting models may lack robustness, which hinders the transition towards clinical practice. This becomes particularly apparent when investigating complex oncological diseases such as pancreatic ductal adenocarcinoma (PDAC), which presents an extraordinarily aggressive, locally invasive tumour biology, a tendency to distant metastases, stroma-dependent tumour growth [3], and an exceptionally high and heterogeneous resistance to conventional chemotherapy. This aggressive malignancy has a rising incidence and is predicted to become the second leading cause of cancer-related death in the industrialised world by 2030.

State of the art: While data-centralised integration of multiple cohorts and subsequent model training can help to overcome the issues of small, biased data, such methods are often prohibited by patient privacy regulations. Federated Artificial Intelligence (FAI) approaches developed for such circumstances can aggregate locally trained machine learning models without sharing the distributed data [4]. Although FAI is widely used in commercial applications, research has only recently started adapting it to biomedical applications [5].
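The aggregation idea described above can be illustrated with a minimal sketch of sample-size-weighted parameter averaging (commonly called federated averaging). This is an illustration only, not the FAIrPaCT implementation: the function name federated_average, the toy parameter vectors and the cohort sizes are all assumptions made for this example.

```python
from typing import List, Sequence


def federated_average(local_weights: Sequence[Sequence[float]],
                      sample_counts: Sequence[int]) -> List[float]:
    """Return the sample-size-weighted mean of locally trained parameter vectors.

    Only the parameter vectors and cohort sizes leave each site;
    the patient-level data never do.
    """
    if not local_weights or len(local_weights) != len(sample_counts):
        raise ValueError("need one sample count per local model")
    total = sum(sample_counts)
    if total <= 0:
        raise ValueError("total sample count must be positive")
    dim = len(local_weights[0])
    if any(len(w) != dim for w in local_weights):
        raise ValueError("all parameter vectors must have the same length")
    return [
        sum(w[i] * n for w, n in zip(local_weights, sample_counts)) / total
        for i in range(dim)
    ]


# Hypothetical parameter vectors reported by three sites (illustrative values).
site_weights = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]]
site_sizes = [100, 100, 200]  # hypothetical local cohort sizes

global_weights = federated_average(site_weights, site_sizes)
print(global_weights)
```

In practice this step would be repeated over several communication rounds, with each site continuing training from the updated global parameters.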

Concept & implementation: Here we will present the FAIrPaCT consortium, consisting of the University Medical Center Göttingen, the University Hospital Giessen and Marburg and the Rechts der Isar Hospital, Technical University Munich. Our goal is to develop a software system supported by federated artificial intelligence called FAIrPaCT that will enable the analysis of clinical patient data and molecular cancer cell data from patients with pancreatic cancer across institutes. Our project combines three of the largest patient cohorts (KFO5002, KFO325, SFB1321) on pancreatic cancer in Germany, which are unique in size and heterogeneity.

While all datasets adhere to good scientific practice concerning reproducibility and are well suited for local analysis, major efforts are required to map the challenging data, which suffer from heterogeneity, non-uniform nomenclature, non-IIDness and site-specific information, into a common information model. In particular, we will build a data management (DM) framework encompassing the Medical Informatics Initiative’s common data model in combination with PDAC-specific extension modules that harmonise ontologies.
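As a rough illustration of what mapping site-specific nomenclature into a common information model can involve, the sketch below renames local fields and recodes local values through per-site lookup tables. Every site identifier, field name and code in it is hypothetical and chosen only for this example; none of them are taken from the Medical Informatics Initiative's data model or from the project cohorts.

```python
from typing import Any, Dict

# Hypothetical per-site mappings: local field name -> common field name,
# and, per common field, local value -> harmonised value.
SITE_MAPPINGS: Dict[str, Dict[str, Dict[str, Any]]] = {
    "site_a": {
        "fields": {"sex": "gender", "age_dx": "age_at_diagnosis"},
        "values": {"gender": {"m": "male", "f": "female"}},
    },
    "site_b": {
        "fields": {"Geschlecht": "gender", "Alter": "age_at_diagnosis"},
        "values": {"gender": {"maennlich": "male", "weiblich": "female"}},
    },
}


def harmonise(record: Dict[str, Any], site: str) -> Dict[str, Any]:
    """Map one local record onto the (hypothetical) common schema.

    Fields without a mapping are treated as site-specific and dropped.
    """
    mapping = SITE_MAPPINGS[site]
    result: Dict[str, Any] = {}
    for local_name, value in record.items():
        common_name = mapping["fields"].get(local_name)
        if common_name is None:
            continue  # site-specific field, not part of the common model
        value_map = mapping["values"].get(common_name, {})
        result[common_name] = value_map.get(value, value)
    return result


record_a = harmonise({"sex": "m", "age_dx": 63, "ward": "3B"}, "site_a")
record_b = harmonise({"Geschlecht": "weiblich", "Alter": 58}, "site_b")
print(record_a)
print(record_b)
```

After harmonisation both records share the same field names and value codes, which is a precondition for training one model across sites.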

Moreover, we will develop FAI algorithms based on Federated Deep Neural Networks and Federated Random Forest, enable them to tackle challenges such as non-IIDness and PNO, and tailor them to potentially privacy-sensitive cancer-registry and biomedical patient data. Federated AI techniques aim to build a generalised global model by aggregating strictly locally trained models and therefore require a fundamentally different privacy-by-design architecture. Furthermore, we will evaluate the hardware requirements of different FAI algorithms and the resulting feasibility of their application within the clinical infrastructure. We will benchmark the developed FAI algorithms against current state-of-the-art approaches. The most promising strategies that are non-IID- and PNO-ready and adhere to the defined hardware requirements will be integrated into the FAI framework.
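One simple way to aggregate strictly locally trained tree models is to pool them into a single ensemble that predicts by majority vote. The sketch below shows that idea under strong simplifications and is not the project's algorithm: single-threshold decision stumps stand in for full decision trees, and the names Stump, train_local_stumps, merge_forests and predict, as well as the toy data, are assumptions made for this example. It does not address non-IID or PNO data.

```python
from dataclasses import dataclass
from statistics import mean
from typing import List, Sequence


@dataclass(frozen=True)
class Stump:
    """One-feature threshold classifier, a stand-in for a full decision tree."""
    feature: int
    threshold: float
    polarity: int  # +1: values above the threshold vote 1; -1: values below do

    def predict(self, x: Sequence[float]) -> int:
        return 1 if self.polarity * (x[self.feature] - self.threshold) > 0 else 0


def train_local_stumps(X: Sequence[Sequence[float]], y: Sequence[int]) -> List[Stump]:
    """Train one stump per feature using local data only."""
    stumps = []
    for j in range(len(X[0])):
        pos = [x[j] for x, label in zip(X, y) if label == 1]
        neg = [x[j] for x, label in zip(X, y) if label == 0]
        if pos and neg:
            # Threshold halfway between the two class means.
            threshold = (mean(pos) + mean(neg)) / 2
            polarity = 1 if mean(pos) > mean(neg) else -1
            stumps.append(Stump(j, threshold, polarity))
    return stumps


def merge_forests(local_forests: Sequence[Sequence[Stump]]) -> List[Stump]:
    """Pool the locally trained models; no patient-level data are exchanged."""
    return [stump for forest in local_forests for stump in forest]


def predict(forest: Sequence[Stump], x: Sequence[float]) -> int:
    """Majority vote of the pooled ensemble (ties resolve to class 0)."""
    votes = sum(stump.predict(x) for stump in forest)
    return 1 if 2 * votes > len(forest) else 0


# Hypothetical toy data held at two sites (features and labels are illustrative).
forest_a = train_local_stumps([[1.0, 5.0], [2.0, 6.0], [8.0, 1.0], [9.0, 2.0]], [0, 0, 1, 1])
forest_b = train_local_stumps([[0.0, 7.0], [3.0, 6.0], [7.0, 0.0], [10.0, 3.0]], [0, 0, 1, 1])

global_forest = merge_forests([forest_a, forest_b])
print(len(global_forest), predict(global_forest, [9.0, 1.0]), predict(global_forest, [1.0, 6.0]))
```

Because only fitted model parameters are pooled, each site's contribution to the global ensemble can be weighted or filtered without access to its raw data.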

Finally, we will develop and integrate explainable AI (xAI) and bioinformatics strategies that foster the identification of PDAC-specific omics and clinical markers as well as molecular pathomechanisms that are essential to PDAC progression and treatment response and that remain hidden when local datasets are analysed separately.

In conclusion, FAIrPaCT aims to develop a tailored federated artificial intelligence framework that can help the research and clinical community move towards personalised treatment. The FAIrPaCT framework will be available as an open-access project.

Acknowledgements: We gratefully acknowledge the BMBF funding of this project (Förderkennzeichen BMBF 01KD2208A). Our ethics amendment based on the previous projects is in preparation (May 2023).

The authors declare that they have no competing interests.

The authors declare that an ethics committee vote is not required.


References

1.
Fatima M, Pasha M. Survey of machine learning algorithms for disease diagnostic. Journal of Intelligent Learning Systems and Applications. 2017;9(01):1.
2.
Beinecke JM, Anders P, Schurrat T, Heider D, Luster M, Librizzi D, Hauschild AC. Evaluation of machine learning strategies for imaging confirmed prostate cancer recurrence prediction on electronic health records. Computers in Biology and Medicine. 2022 Apr 1;143:105263.
3.
Hupfer A, Brichkina A, Koeniger A, Keber C, Denkert C, Pfefferle P, Helmprobst F, Pagenstecher A, Visekruna A, Lauth M. Matrix stiffness drives stromal autophagy and promotes formation of a protumorigenic niche. Proceedings of the National Academy of Sciences. 2021 Oct 5;118(40):e2105367118.
4.
Yang Q, Liu Y, Chen T, Tong Y. Federated machine learning: Concept and applications. ACM Transactions on Intelligent Systems and Technology (TIST). 2019 Jan 28;10(2):1-9.
5.
Hauschild AC, Lemanczyk M, Matschinske J, Frisch T, Zolotareva O, Holzinger A, Baumbach J, Heider D. Federated Random Forests can improve local performance of predictive models for various healthcare applications. Bioinformatics. 2022 Apr 15;38(8):2278-86.